The image caption is the process of adding an explicit, coherent description to the contents of the image. This is done by using the latest deep learning techniques, which include computer vision and natural language processing, to understand the contents of the image and give it an appropriate caption. Multiple datasets suitable for many applications have been proposed. The biggest challenge for researchers with natural language processing is that the datasets are incompatible with all languages. The researchers worked on translating the most famous English data sets with Google Translate to understand the content of the images in their mother tongue. In this paper, the proposed review aims to enhance the understanding of image captioning strategies and to survey previous research related to image captioning while examining the most popular databases in different languages, mostly English, translating into other languages using the latest models for describing images, summarizing evaluation measures, and comparing them.
Image retrieval is used in searching for images from images database. In this paper, content – based image retrieval (CBIR) using four feature extraction techniques has been achieved. The four techniques are colored histogram features technique, properties features technique, gray level co- occurrence matrix (GLCM) statistical features technique and hybrid technique. The features are extracted from the data base images and query (test) images in order to find the similarity measure. The similarity-based matching is very important in CBIR, so, three types of similarity measure are used, normalized Mahalanobis distance, Euclidean distance and Manhattan distance. A comparison between them has been implemented. From the results, it is conclud
... Show MoreIn this paper two main stages for image classification has been presented. Training stage consists of collecting images of interest, and apply BOVW on these images (features extraction and description using SIFT, and vocabulary generation), while testing stage classifies a new unlabeled image using nearest neighbor classification method for features descriptor. Supervised bag of visual words gives good result that are present clearly in the experimental part where unlabeled images are classified although small number of images are used in the training process.
Pavement crack and pothole identification are important tasks in transportation maintenance and road safety. This study offers a novel technique for automatic asphalt pavement crack and pothole detection which is based on image processing. Different types of cracks (transverse, longitudinal, alligator-type, and potholes) can be identified with such techniques. The goal of this research is to evaluate road surface damage by extracting cracks and potholes, categorizing them from images and videos, and comparing the manual and the automated methods. The proposed method was tested on 50 images. The results obtained from image processing showed that the proposed method can detect cracks and potholes and identify their severity levels wit
... Show MoreAn image retrieval system is a computer system for browsing, looking and recovering pictures from a huge database of advanced pictures. The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large (image) databases, a specified number of images similar in visual and semantic content to a so-called query image. The researchers were developing a new mechanism to retrieval systems which is mainly based on two procedures. The first procedure relies on extract the statistical feature of both original, traditional image by using the histogram and statistical characteristics (mean, standard deviation). The second procedure relies on the T-
... Show MoreThe field of autonomous robotic systems has advanced tremendously in the last few years, allowing them to perform complicated tasks in various contexts. One of the most important and useful applications of guide robots is the support of the blind. The successful implementation of this study requires a more accurate and powerful self-localization system for guide robots in indoor environments. This paper proposes a self-localization system for guide robots. To successfully implement this study, images were collected from the perspective of a robot inside a room, and a deep learning system such as a convolutional neural network (CNN) was used. An image-based self-localization guide robot image-classification system delivers a more accura
... Show MoreWe explore the transform coefficients of fractal and exploit new method to improve the compression capabilities of these schemes. In most of the standard encoder/ decoder systems the quantization/ de-quantization managed as a separate step, here we introduce new way (method) to work (managed) simultaneously. Additional compression is achieved by this method with high image quality as you will see later.
In this paper three techniques for image compression are implemented. The proposed techniques consist of three dimension (3-D) two level discrete wavelet transform (DWT), 3-D two level discrete multi-wavelet transform (DMWT) and 3-D two level hybrid (wavelet-multiwavelet transform) technique. Daubechies and Haar are used in discrete wavelet transform and Critically Sampled preprocessing is used in discrete multi-wavelet transform. The aim is to maintain to increase the compression ratio (CR) with respect to increase the level of the transformation in case of 3-D transformation, so, the compression ratio is measured for each level. To get a good compression, the image data properties, were measured, such as, image entropy (He), percent root-
... Show MoreBuilding a system to identify individuals through their speech recording can find its application in diverse areas, such as telephone shopping, voice mail and security control. However, building such systems is a tricky task because of the vast range of differences in the human voice. Thus, selecting strong features becomes very crucial for the recognition system. Therefore, a speaker recognition system based on new spin-image descriptors (SISR) is proposed in this paper. In the proposed system, circular windows (spins) are extracted from the frequency domain of the spectrogram image of the sound, and then a run length matrix is built for each spin, to work as a base for feature extraction tasks. Five different descriptors are generated fro
... Show MoreHM Al-Dabbas, RA Azeez, AE Ali, Iraqi Journal of Science, 2023
In the reverse engineering approach, a massive amount of point data is gathered together during data acquisition and this leads to larger file sizes and longer information data handling time. In addition, fitting of surfaces of these data point is time-consuming and demands particular skills. In the present work a method for getting the control points of any profile has been presented. Where, many process for an image modification was explained using Solid Work program, and a parametric equation of the profile that proposed has been derived using Bezier technique with the control points that adopted. Finally, the proposed profile was machined using 3-aixs CNC milling machine and a compression in dimensions process has been occurred betwe
... Show More