Improving Pre-trained CNN-LSTM Models for Image Captioning with Hyper-Parameter Optimization

Nuha M. Khassaf; Nada Hussein M. Ali

doi:10.48084/etasr.8455

Details

Publication Date

Wed Oct 09 2024

Journal Name

Engineering, Technology & Applied Science Research

Volume

14

Issue Number

5

DOI

10.48084/etasr.8455

CNN pre-trained models

LSTM

activation function

hyper-parameters

overfitting

Image captioning, the automatic generation of text that describes an image's visual content, has become feasible with advances in object recognition and image classification. Deep learning has attracted considerable interest from the scientific community and is useful in real-world applications. The proposed image captioning approach combines pre-trained Convolutional Neural Network (CNN) models with Long Short-Term Memory (LSTM) to generate image captions. The process has two stages. The first stage trains the CNN-LSTM models using baseline hyper-parameters; the second stage retrains them after optimizing and adjusting the hyper-parameters of the first stage. The improvements include the use of a new activation function, regular parameter tuning, and an improved learning rate in the later stages of training. Experimental results on the Flickr8k dataset showed a noticeable improvement in the second stage, with a clear increase in the evaluation metrics BLEU-1 to BLEU-4, METEOR, and ROUGE-L. This increase confirmed the effectiveness of the changes and highlighted the importance of hyper-parameter tuning in improving the performance of CNN-LSTM models in image captioning tasks.
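To make the first of the reported metrics concrete, the following is a minimal sketch of BLEU-1 (clipped unigram precision multiplied by the brevity penalty) for a single caption. It is not the authors' evaluation code; the function name `bleu1`, the whitespace tokenization, and the example captions are assumptions made for illustration, and published results are normally computed with an established toolkit over the whole test set.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1: clipped unigram precision times brevity penalty.

    Tokenization is a plain lowercase whitespace split (an assumption made
    for this sketch; real evaluations use a proper tokenizer).
    """
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    # For each word, the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)

    # Clip candidate counts so repeated words are not over-rewarded.
    cand_counts = Counter(cand)
    clipped = sum(min(count, max_ref_counts[word])
                  for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty uses the reference length closest to the candidate's.
    ref_len = min((len(r) for r in refs),
                  key=lambda n: (abs(n - len(cand)), n))
    if len(cand) > ref_len:
        penalty = 1.0
    else:
        penalty = math.exp(1 - ref_len / len(cand))
    return penalty * precision


if __name__ == "__main__":
    # Hypothetical captions, not taken from Flickr8k.
    generated = "a dog runs across the grass"
    human = ["a dog is running on the grass",
             "a brown dog runs across a field"]
    print(round(bleu1(generated, human), 4))
```

BLEU-2 to BLEU-4 extend the same idea to longer n-grams and combine the clipped precisions with a geometric mean, which is why the scores usually fall as n grows.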

Publication Date

Sun Nov 19 2017

Journal Name

Journal Of Al-qadisiyah For Computer Science And Mathematics

Image Compression based on Fixed Predictor Multiresolution Thresholding of Linear Polynomial Nearlossless Techniques

Ghadah

Shaymaa

Image compression is a serious issue in computer storage and transmission. It makes efficient use of the redundancy embedded within an image and may also exploit the limitations of human vision or perception to discard imperceptible information. Polynomial coding is a modern image compression technique based on a modelling concept that effectively removes the spatial redundancy embedded within the image; it is composed of two parts, the mathematical model and the residual. In this paper, a two-stage technique is proposed. The first stage utilizes the lossy predictor model along with a multiresolution base and thresholding techniques. The latter stage incorporates the near lossless com

Publication Date

Sat Mar 13 2021

Journal Name

Al-nahrain Journal Of Science

Hiding Multi Short Audio Signals in Color Image by using Fast Fourier Transform

Steganography

Secret Audio Signal

Security

Image Frequency Transform

Hiding Multi Audio

Image Quality

Enas M.

Many purposes require users to exchange audio files through different social media applications. The security level of these applications is limited, yet many audio files are confidential and must be accessed by authorized persons only; moreover, most existing works attempt to hide a single audio file in a given cover medium. In this paper, a new approach is proposed for hiding three audio signals of unequal sizes in a single color digital image using the frequency transform of this image. In the proposed approach, the Fast Fourier Transform was adopted, where each audio signal is embedded in a specific high-frequency region of the frequency spectrum of the cover image to sa

Publication Date

Sun Jun 01 2014

Journal Name

International Journal Of Advanced Research In Computer Science And Software Engineering

Medical Image Compression using Wavelet Quadrants of Polynomial Prediction Coding & Bit Plane Slicing

Ghadah

Publication Date

Mon Aug 30 2021

Journal Name

Al-kindy College Medical Journal

Psychological and Physical Correlates of Body Image Dissatisfaction among High School Egyptian Students

Anxiety

body shape concern

depression

Egypt

secondary school students

Walaa

Hesham

Randa

Hanan

Mahmoud

Mostafa

Background: Body image is one of the most important psychological factors that affects adolescents’ personality and behavior. Body image can be defined as the person’s perceptions, thoughts, and feelings about his or her body.

Objectives: To identify the prevalence of body image concerns among secondary school students and its relation to different factors.

Subjects and methods: A cross-sectional study was conducted in which 796 secondary school students participated; body shape concerns were investigated using the Body Shape Questionnaire (BSQ-34).

Results: The prevalence of moderate/marked concern was 21.6%. Moderate/marked body shape concern was significantly associated

Publication Date

Mon Apr 03 2023

Journal Name

Journal Of Al-qadisiyah For Computer Science And Mathematics

A General Overview on the Categories of Image Features Extraction Techniques: A Survey

Pixel-level feature

local feature

global feature

features detection

features description

edge

corner

blob or region

Rafal

In image processing and computer vision, it is important to represent an image by its information. Image information comes from the image's features, which are extracted using feature detection/extraction and feature description techniques. Features in computer vision are informative data. The human eye readily extracts information from a raw image, but a computer cannot recognize image information directly. This is why various feature extraction techniques have been presented and have progressed rapidly. This paper presents a general overview of the categories of image feature extraction.

Publication Date

Sun Mar 15 2020

Journal Name

Al-academy

Aesthetics of Hybrid Digital Image Technologies in TV Drama: محمد ثائر عدنان البياتي

Functionality - Aesthetic - Technologies - Hybrids

Mohammed Thair

The TV medium derives its formal shape from the technological development taking place in all scientific fields, which are creatively fused in the television image, itself composed mainly of various visual levels and formations. By the new decade of the second millennium, however, the television medium, and drama in particular, began looking for a paradigm shift in the innovative aesthetic and formal fields and in the advanced expressive and performative fields that would enable it to treat what was previously impossible to visualize, while presenting what is new and innovative in both unprecedented and familiar objective and intellectual treatments. Thus the TV medium has sought for work

Publication Date

Wed Sep 26 2018

Journal Name

Communications In Computer And Information Science

A New RGB Image Encryption Based on DNA Encoding and Multi-chaotic Maps

Sarab

Ibtisam

Publication Date

Tue Feb 01 2022

Journal Name

Baghdad Science Journal

An Enhanced Approach of Image Steganographic Using Discrete Shearlet Transform and Secret Sharing

Discrete Shearlet Transform

Image Steganography

Stego Image

Secret Sharing

Yasir Ahmed

Nada Elya

Mohammed Qasim

Recently, the internet has enabled users to transmit digital media with great ease. This convenience, however, may lead to several threats to the confidentiality of the transferred media contents, raising concerns such as media authentication and integrity verification. For these reasons, data hiding methods and cryptography are used to protect the contents of digital media. In this paper, an enhanced method of image steganography combined with visual cryptography has been proposed. A secret logo (binary image) of size (128x128) is encrypted by applying (2-out-of-2 share) visual cryptography to it to generate two secret shares. During the embedding process, a cover red, green, and blue (RGB) image of size (512

Publication Date

Wed Dec 25 2019

Journal Name

Journal Of Engineering

Corrosion Rate Optimization of Mild-Steel under Different Cooling Tower Working Parameters Using Taguchi Design

Taguchi

corrosion

mild-steel

cooling tower

Shaimaa Abdul-Rahman

Hasan F.

This study investigates the use of Taguchi design to estimate the minimum corrosion rate of mild steel in a cooling tower that uses saline solutions of different concentrations. The experiments were set up on the basis of Taguchi's L16 orthogonal array. The runs were carried out under different conditions of inlet saline concentration, temperature, and flow rate. The signal-to-noise ratio and ANOVA were used to determine the impact of the cooling tower working conditions on the corrosion rate. A regression model was built and optimized to identify the optimum levels of the working parameters, which were found to be 13% NaCl, 35 °C, and 1 l/min. Also a confirmation run to establish the p

Publication Date

Thu Oct 26 2017

Journal Name

International Journal Of Pure And Applied Mathematics

ON CONVEX FUNCTIONS, $E$-CONVEX FUNCTIONS AND THEIR GENERALIZATIONS: APPLICATIONS TO NON-LINEAR OPTIMIZATION PROBLEMS

Saba Naser

M.I.

Contents IJPAM: Volume 116, No. 3 (2017)
