Image captioning, the automatic generation of text that describes an image's visual content, has become feasible with advances in object recognition and image classification. Deep learning has attracted much interest from the scientific community and is useful in real-world applications. The proposed image captioning approach combines pre-trained Convolutional Neural Network (CNN) models with Long Short-Term Memory (LSTM) to generate image captions. The process has two stages. The first stage trains the CNN-LSTM models using baseline hyper-parameters; the second stage trains the CNN-LSTM models after optimizing and adjusting the hyper-parameters of the first stage. Improvements include the use of a new activation function, regular parameter tuning, and an improved learning rate in the later stages of training. Experimental results on the Flickr8k dataset showed a clear improvement in the second stage across the evaluation metrics BLEU-1 to BLEU-4, METEOR, and ROUGE-L. This increase confirmed the effectiveness of the alterations and highlighted the importance of hyper-parameter tuning in improving the performance of CNN-LSTM models in image captioning tasks.
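To make the evaluation metric concrete, the following is a minimal sketch of BLEU-1 (clipped unigram precision multiplied by a brevity penalty) for a single caption. The function name `bleu1`, the whitespace tokenisation, and the lower-casing are assumptions made for illustration; the abstract does not say which BLEU implementation or tokeniser was used.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1: clipped unigram precision times brevity penalty.

    Assumption: captions are tokenised by lower-casing and splitting on
    whitespace. Published scores usually come from a standard toolkit.
    """
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    # A candidate word is credited at most as often as it occurs in the
    # single reference where it is most frequent ("clipping").
    max_ref = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref[word] = max(max_ref[word], count)

    clipped = sum(min(count, max_ref[word])
                  for word, count in Counter(cand).items())
    precision = clipped / len(cand)

    # Brevity penalty uses the reference length closest to the candidate
    # length (ties go to the shorter reference).
    ref_len = min((len(r) for r in refs),
                  key=lambda n: (abs(n - len(cand)), n))
    if len(cand) > ref_len:
        penalty = 1.0
    else:
        penalty = math.exp(1 - ref_len / len(cand))
    return penalty * precision
```

For example, `bleu1("a dog runs on grass", ["a dog runs on the grass"])` has a unigram precision of 1 but is penalised for being one word shorter than the reference. BLEU-2 to BLEU-4 extend the same idea to longer n-grams.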
This study presents the effect of three interpolation methods (nearest neighbour, linear and non-linear) on restoring the data missing from a 3D sinogram when the angular difference is greater than 1° (a 1° step being considered the optimum 3D sinogram). Two reconstruction methods are adopted: the back-projection method and the Fourier slice theorem method. The results show that the second method, combined with linear interpolation, is a promising reconstruction approach when the angular difference is less than 20°.
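The idea of restoring missing projection angles can be sketched as follows, assuming a sinogram stored as a list of rows (one row of detector readings per measured angle) and angles given in ascending order in degrees. The function name `fill_missing_angles` and the data layout are hypothetical, and only the nearest-neighbour and linear methods are shown; this is not the authors' implementation.

```python
from bisect import bisect_left


def fill_missing_angles(rows, measured_deg, target_deg, method="linear"):
    """Estimate sinogram rows at target angles from sparsely measured rows.

    rows[i] holds the detector readings taken at measured_deg[i].
    Assumptions: measured_deg is sorted and has at least two entries;
    targets outside the measured range are clamped to the end rows.
    """
    filled = []
    last = len(measured_deg) - 1
    for angle in target_deg:
        # Bracket the target angle between two measured neighbours.
        hi = min(max(bisect_left(measured_deg, angle), 1), last)
        lo = hi - 1
        span = measured_deg[hi] - measured_deg[lo]
        t = (angle - measured_deg[lo]) / span
        t = min(max(t, 0.0), 1.0)
        if method == "nearest":
            # Snap the weight to 0 or 1, i.e. copy the closer neighbour.
            t = float(round(t))
        filled.append([(1.0 - t) * a + t * b
                       for a, b in zip(rows[lo], rows[hi])])
    return filled
```

With projections measured every 10°, calling the function with `target_deg` at 1° spacing produces a dense sinogram that can then be passed to either reconstruction method. In practice the rows would typically be NumPy arrays, and a 3D sinogram would apply the same weights to each detector slice.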
Searching a binary codebook that combines Block Truncation Coding (BTC) and Vector Quantization (VQ), i.e. a full codebook search for each input image vector to find the best-matched code word, requires a long time. Therefore, in this paper, after designing a small binary codebook, we adopted a new method that rotates each binary code word in this codebook from 90° to 270° in steps of 90°. Then, we systematized each code word depending on its angle to involve four types of binary codebooks (i.e. Pour when , Flat when , Vertical when, or Zigzag). The proposed scheme was used for decreasing the time of the coding procedure, with very small distortion per block, by designing s ...
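The rotation step can be illustrated with a short sketch, assuming each binary code word is a square block stored as a list of rows of 0/1 values and that rotation is clockwise. The helper names `rotate90`, `expand_codebook` and `best_match`, and the exhaustive Hamming-distance search, are hypothetical; the angle-based grouping into four codebook types is not reproduced here because its conditions are missing from the text.

```python
def rotate90(block):
    """Rotate a square binary block 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


def expand_codebook(codebook):
    """For each code word, list its 0, 90, 180 and 270 degree rotations."""
    expanded = []
    for word in codebook:
        variants = [word]
        rotated = word
        for _ in range(3):
            rotated = rotate90(rotated)
            variants.append(rotated)
        expanded.append(variants)
    return expanded


def best_match(block, expanded):
    """Return (code word index, rotation in degrees, Hamming distance)."""
    best = None
    for index, variants in enumerate(expanded):
        for k, candidate in enumerate(variants):
            # Count the positions where the two bit patterns differ.
            distance = sum(a != b
                           for row_a, row_b in zip(block, candidate)
                           for a, b in zip(row_a, row_b))
            if best is None or distance < best[2]:
                best = (index, 90 * k, distance)
    return best
```

Storing only the unrotated code words and encoding a block as an index plus a rotation is one way a small codebook can cover four orientations of each pattern.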
Semantic segmentation is an exciting research topic in medical image analysis because it aims to detect objects in medical images. In recent years, approaches based on deep learning have shown more reliable performance than traditional approaches in medical image segmentation. The U-Net network is one of the most successful end-to-end convolutional neural networks (CNNs) presented for medical image segmentation. This paper proposes a multiscale residual dilated convolutional neural network (MSRD-UNet) based on U-Net. MSRD-UNet replaces the traditional convolution block with a novel deeper block that fuses multi-layer features using dilated and residual convolution. In addition, the squeeze-and-excitation (SE) attention mechanism and the s ...
The recent emergence of sophisticated Large Language Models (LLMs) such as GPT-4, Bard, and Bing has revolutionized the domain of scientific inquiry, particularly in the realm of large pre-trained vision-language models. This pivotal transformation is driving new frontiers in various fields, including image processing and digital media verification. At the heart of this evolution, our research focuses on the rapidly growing area of image authenticity verification, a field gaining immense relevance in the digital era. The study is specifically geared towards addressing the emerging challenge of distinguishing between authentic images and deep fakes, a task that has become critically important in a world increasingly reliant on digital med ...
With the rapid development of smart devices, people's lives have become easier, especially for people with visual impairments or special needs. New achievements in the fields of machine learning and deep learning let people identify and recognise the surrounding environment. In this study, the efficiency and high performance of deep learning architectures are used to build an image classification system for both indoor and outdoor environments. The proposed methodology starts with collecting two datasets (indoor and outdoor) from several separate datasets. In the second step, the collected dataset is split into training, validation, and test sets. The pre-trained GoogleNet and MobileNet-V2 models are trained using the indoor and outdoor se ...
Linear motors offer several features in many applications that require linear motion. Nevertheless, the presence of cogging force can deteriorate the thrust of a permanent magnet linear motor. Using several methodologies, a design of a synchronous single-sided iron-core linear motor was proposed according to exact formulas, with surface-mounted magnets and a concentrated winding specification, that rely on geometrical parameters. Two-dimensional performance analysis of the designed model and its multi-objective optimization were carried out using ANSYS Maxwell as a method to reduce the motor cogging force. The optimum model design results showed that the maximum force ripple was reduced by approximately 81.24% compared to the origina ...
Ferric oxide nanoparticles (Fe3O4 NPs) were prepared by the coprecipitation method and used to functionalize the surface of electrospun polyacrylonitrile nanofibers to increase their effectiveness in adsorbing Congo red (CR) dye from aqueous solutions. The factors affecting adsorption, such as adsorbent mass, initial concentration, contact time, temperature, ionic strength and pH, were systematically investigated. The maximum adsorbed amount of the dye was obtained at 0.003 g of adsorbent. The adsorption of dye increased with increasing initial dye concentration, and the system reached equilibrium at 150 min. The adsorbed dye capacity decreased with increasing temperature, which indicates the exothermic nature of ad ...
Biogas is one of the most important sources of renewable energy and is considered an environmentally friendly energy source. The major goal of this research is to determine whether rice husk (Rh) waste and pomegranate peel (PP) waste are suitable for anaerobic digestion and what effect NaOH pre-treatment has on biogas generation. Rice husk and pomegranate peels were tested in anaerobic digestion under batch anaerobic conditions as separate wastes as well as blended together in equal proportions. The cumulative biogas output for the blank test (no pre-treatment) was 1923 ml and 2526 ml using rice husk (Rh) and pomegranate peel (PP) as single substrates, respectively. The digestion of 50% rice husk and 50% pomegranate peels for the blank test gave the result 224 ...
The aim of a human lower limb rehabilitation robot is to regain the ability of motion and to strengthen weak muscles. This paper proposes the design of a force-position control for a four-Degree-Of-Freedom (4-DOF) lower limb wearable rehabilitation robot. This robot consists of hip, knee and ankle joints that enable the patient to move and turn in both directions. The joints are actuated by Pneumatic Muscle Actuators (PMAs). PMAs have very great potential in medical applications because of their similarity to biological muscles. Force-position control incorporating Takagi-Sugeno-Kang three-Proportional-Derivative-like Fuzzy Logic (TSK-3-PD) controllers for position control and three-Proportional (3-P) controllers for force contr ...