Improving Pre-trained CNN-LSTM Models for Image Captioning with Hyper-Parameter Optimization

Nuha M. Khassaf; Nada Hussein M. Ali

doi:10.48084/etasr.8455

Details

Publication Date

Wed Oct 09 2024

Journal Name

Engineering, Technology & Applied Science Research

Volume

14

Issue Number

5

DOI

10.48084/etasr.8455

CNN pre-trained models

LSTM

activation function

hyper-parameters

overfitting

Image captioning, the automatic generation of text that describes an image’s visual content, has become feasible with advances in object recognition and image classification. Deep learning has attracted considerable interest from the scientific community and is useful in real-world applications. The proposed approach combines pre-trained Convolutional Neural Network (CNN) models with Long Short-Term Memory (LSTM) networks to generate image captions. The process has two stages. In the first stage, the CNN-LSTM models are trained with baseline hyper-parameters. In the second stage, the models are trained again after optimizing and adjusting the hyper-parameters of the first stage. The improvements include a new activation function, regular parameter tuning, and an improved learning rate in the later stages of training. Experiments on the Flickr8k dataset showed a clear improvement in the second stage across the evaluation metrics BLEU-1 to BLEU-4, METEOR, and ROUGE-L. This gain confirms the effectiveness of the changes and highlights the importance of hyper-parameter tuning for CNN-LSTM models in image captioning tasks.
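The abstract reports BLEU-1 to BLEU-4 scores without saying how they were computed. As a rough illustration only, the sketch below implements the standard sentence-level BLEU definition (clipped n-gram precision, uniform weights, brevity penalty) in plain Python. The function names and the sample captions are hypothetical and are not taken from the paper, and no smoothing is applied, so the authors' evaluation tooling may yield different numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # Clip each candidate n-gram count by its maximum count in any reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU-max_n with uniform weights and a brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # assumption: no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty uses the reference length closest to the candidate length
    # (ties resolved in favour of the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical captions, used only to show the call pattern.
    refs = ["a dog runs across the grass".split(),
            "a brown dog is running on grass".split()]
    hyp = "a dog runs on the grass".split()
    for n in range(1, 5):
        print(f"BLEU-{n}: {bleu(hyp, refs, max_n=n):.3f}")
```

Reporting BLEU-1 through BLEU-4 side by side, as the abstract does, shows whether a tuning change improves only word choice (unigrams) or also longer phrase structure (4-grams).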

Publication Date

Tue Jul 01 2025

Journal Name

Mastering The Minds Of Machines

The Impact of Transfer Learning and Pre-trained Models on Model Performance

Nada Khalil

Amir H.

Samila

Han-Liwa

Haming

Shengxiang

Aseel

Laith

Publication Date

Mon Jan 01 2024

Journal Name

Lecture Notes On Data Engineering And Communications Technologies

Utilizing Deep Learning Technique for Arabic Image Captioning

Haneen serag

Publication Date

Thu Sep 01 2016

Journal Name

2016 8th Computer Science And Electronic Engineering (CEEC)

Class-specific pre-trained sparse autoencoders for learning effective features for document classification

Maysa

Publication Date

Thu May 01 2025

Journal Name

2025 3rd International Conference On Business Analytics For Technology And Security (ICBATS)

Comparison of Deep Neural Network Models (LSTM, Bi-LSTM, GRU and Bi-GRU) for Gold Price Prediction

Deep Learning

RNN

LSTM

GRU

Bi-LSTM

Bi-GRU

Gold Prices Prediction

Noor Saleem Mohammed

Balsam Mustafa

This study compares deep neural network models for time-series prediction of gold prices, which fluctuate strongly and follow non-linear patterns that traditional models struggle to capture, making prediction a significant challenge. The focus is therefore on four deep learning models: LSTM, Bi-LSTM, GRU, and Bi-GRU. The Bi-GRU model outperformed the others on the comparison criteria MSE, RMSE, MAE, and R², because processing the data in both directions let it capture temporal patterns better, which indicates its effectiveness, eff...

Publication Date

Thu Jul 03 2025

Journal Name

2025 3rd International Conference On Cyber Resilience (ICCR)

Fine-Grained Emotion Recognition from Short Video Clips Using CNN-LSTM with Facial Action Heatmaps

Zahraa Haimeed

Haneen Siraj

Enas Ahmed

Mina Taha

Nourhan Ahmed

Azhaar A.

Wael Yahya

Inbithaq Ahmed

Publication Date

Fri Mar 29 2024

Journal Name

Iraqi Journal Of Science

Evaluating the Performance and Behavior of CNN, LSTM, and GRU for Classification and Prediction Tasks

Hasanen S.

Nada Hussain

Nada A.Z.

Deep learning (DL) plays a significant role in several tasks, especially classification and prediction. Convolutional neural networks (CNNs) can perform classification efficiently when given a large dataset, while recurrent neural networks (RNNs) can perform prediction because of their ability to remember time series data. In this paper, three models are proposed and evaluated on classification and prediction tasks using four datasets (two for each task). The models are a CNN and two RNN variants: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Each model is applied in turn to both tasks to draw a road map of deep learning mod...

Publication Date

Mon Jun 01 2026

Journal Name

Statistics, Optimization & Information Computing

Predicting Public Budget Surplus and Deficit Using a Hybrid 1D-CNN–LSTM Model

Sulaiman Hussien

Munaf Yousif

Zahraa Yousif

The fiscal position of governments in rentier economies depends heavily on oil revenues. The relationship between oil prices and the budget surplus or deficit is often nonlinear and characterized by complex temporal dependencies, which may limit the predictive capability of conventional econometric models. Accordingly, this study aims to forecast the Iraqi budget surplus and deficit and compare the predictive performance of the ARDL, NARDL, LSTM, 1D-CNN, and hybrid 1D-CNN-LSTM models using oil prices as the primary predictive variable. The hybrid model integrates the feature-extraction capability of One-Dimensional Convolutional Neural Networks (1D-CNN) with the ability of Long Short-Term Memory (LSTM) networks to capture long-term

Publication Date

Tue Dec 16 2025

Journal Name

Radioelectronics. Nanosystems. Information Technologies.

Intelligent Control and Stability Analysis of Smart Grids Using CNN-LSTM Network and Model Predictive Controller

Model Predictive Control (MPC)

Intelligent Control Systems

Residual CNN–LSTM

Real-time Grid Monitoring

SHAP Explainability

Aws

Ensuring real-time stability in smart grids becomes more important as renewable integration and system complexity grow. This paper presents an architecture that combines a Residual CNN–LSTM deep neural network predictor, FPGA-accelerated Model Predictive Control (MPC), and SHAP-based explainability. On the Electrical Grid Stability Simulated Dataset (UCI), the proposed method achieved 99.8% prediction accuracy and reduced instability rates by more than 85 percent under all operating conditions. To meet real-time operating needs, FPGA deployment on a Xilinx Zynq UltraScale+ provided 3.1 ms latency and five times lower energy consumption than CPU processing. By emphasizing bus voltage and frequency as major in...

Publication Date

Sun Jan 01 2023

Journal Name

International Journal Of Nonlinear Analysis And Applications

The use of ARIMA, LSTM and GRU models in time series hybridization with practical application

ARIMA

Long Short Term Memory

Time Series Forecasting

Gated Recurrent Unit

Hybrid Model

Noor Saleem

Firas Ahmmed

Forecasting has become important in economics as a means of supporting economic growth. It is a central topic in time series analysis, and producing accurate forecasts on which to base the best decision remains one of its main challenges. This research proposes hybrid models for forecasting daily crude oil prices. The hybrid model integrates a linear component, represented by Box-Jenkins models, with a non-linear component, represented by the deep learning models long short-term memory (LSTM) and gated recurrent unit (GRU). It was found that the proposed h...

Publication Date

Tue Mar 30 2021

Journal Name

Journal Of Economics And Administrative Sciences

The Bayesian Estimation for The Shape Parameter of The Power Function Distribution (PFD-I) to Use Hyper Prior Functions

The power function distribution (PFD-I)

MLE

Bayes Estimation

SELF

WSELF

MLINEX

Jinan Abbas

This study examines the properties of Bayes estimators of the shape parameter of the Power Function Distribution (PFD-I), using two different prior distributions for the parameter θ and different loss functions, and compares them with the maximum likelihood estimators. In many practical applications, two different pieces of prior information may be available about the shape parameter of the Power Function Distribution, which influences the parameter estimation. We therefore used two different kinds of conjugate priors for the shape parameter θ of the ...
