This study explores the challenges that Artificial Intelligence (AI) systems face in generating image captions, a task that requires effective integration of computer vision and natural language processing techniques. It presents a comparative analysis of traditional approaches (such as retrieval-based methods and linguistic templates) and modern deep learning approaches (such as encoder-decoder models, attention mechanisms, and transformers). Theoretical results show that modern models perform better in accuracy and in the ability to generate more complex descriptions, while traditional methods are faster and simpler. The paper proposes a hybrid framework that combines the advantages of both approaches: conventional methods produce an initial description, which is then contextually refined using modern models. Preliminary estimates indicate that this approach could reduce the initial computational cost by up to 20% compared with relying entirely on deep models, while maintaining high accuracy. The study recommends further research to develop effective coordination mechanisms between traditional and modern methods, and to move to experimental validation of the hybrid model in preparation for its application in environments that require a balance between speed and accuracy, such as real-time computer vision applications.
In the reverse engineering approach, a massive amount of point data is gathered during data acquisition, which leads to larger file sizes and longer data handling times. In addition, fitting surfaces to these data points is time-consuming and demands particular skills. The present work presents a method for obtaining the control points of any profile. Several image modification steps are explained using the SolidWorks program, and a parametric equation of the proposed profile is derived using the Bezier technique with the adopted control points. Finally, the proposed profile was machined using a 3-axis CNC milling machine, and a dimensional comparison was carried out betwe
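As an illustration of how a parametric profile equation follows from a set of control points, the sketch below evaluates a Bezier curve in Bernstein form. It is a minimal example and not the authors' derivation: the function names and the four control points are hypothetical values chosen for demonstration, and the profile is assumed to be planar.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a planar Bezier curve at parameter t in [0, 1].

    Uses the Bernstein form: P(t) = sum_i C(n, i) * t^i * (1 - t)^(n - i) * P_i,
    where n is the curve degree (number of control points minus one).
    """
    n = len(control_points) - 1
    x = 0.0
    y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return (x, y)


def sample_profile(control_points, samples=11):
    """Return evenly spaced (in t) points along the curve, endpoints included."""
    return [bezier_point(control_points, k / (samples - 1)) for k in range(samples)]


# Hypothetical control points (units arbitrary), not taken from the paper.
control_points = [(0.0, 0.0), (10.0, 20.0), (30.0, 20.0), (40.0, 0.0)]
profile = sample_profile(control_points)
```

The curve passes through the first and last control points and is only pulled toward the interior ones, which is why a small set of control points can stand in for a large cloud of measured points.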
Groupwise non-rigid image alignment is a difficult non-linear optimization problem involving many parameters and often large datasets. Previous methods have explored various metrics and optimization strategies. Good results have been previously achieved with simple metrics, requiring complex optimization, often with many unintuitive parameters that require careful tuning for each dataset. In this chapter, the problem is restructured to use a simpler, iterative optimization algorithm, with very few free parameters. The warps are refined using an iterative Levenberg-Marquardt minimization to the mean, based on updating the locations of a small number of points and incorporating a stiffness constraint. This optimization approach is eff
With the rapid development of digital media and communication, methods for protection and security have become very important, as exchanging and transmitting data over communication channels has prompted efforts to protect these data from unauthorized access.
This paper presents a new method to protect color images from unauthorized access using watermarking. The watermarking algorithm hides the encoded mark image in the frequency domain using the Discrete Cosine Transform (DCT). The main principle of the algorithm is to encode a repeated mark in the cover color image. The watermark image bits are spread by repeating the mark and arranging it in an encoded manner that gives the algorithm more robustness and security. The propos
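The general idea of hiding repeated watermark bits in DCT coefficients can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's algorithm: it works on 8x8 blocks of a single channel, embeds one bit per block by forcing the parity of one quantized mid-frequency coefficient (a simple quantization-based scheme swapped in for the unspecified encoding), repeats the whole mark `repeat` times, and recovers it by majority vote. The block size, coefficient position, step size, and all function names are hypothetical.

```python
import math

N = 8  # block size (assumed)

# Orthonormal DCT-II basis matrix, C[k][n].
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of an N x N block: C * B * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * X * C (C is orthonormal)."""
    return matmul(matmul(transpose(C), coeffs), C)


def embed_bit(block, bit, pos=(3, 2), step=16.0):
    """Force the parity of one quantized coefficient to equal the bit."""
    coeffs = dct2(block)
    u, v = pos
    q = round(coeffs[u][v] / step)
    if q % 2 != bit:
        q += 1
    coeffs[u][v] = q * step
    return idct2(coeffs)


def extract_bit(block, pos=(3, 2), step=16.0):
    u, v = pos
    return round(dct2(block)[u][v] / step) % 2


def embed_mark(blocks, bits, repeat=3):
    """Embed the whole mark `repeat` times, one bit per block."""
    spread = list(bits) * repeat
    if len(blocks) < len(spread):
        raise ValueError("not enough blocks for the repeated mark")
    marked = [embed_bit(blk, b) for blk, b in zip(blocks, spread)]
    return marked + blocks[len(spread):]  # remaining blocks are left unchanged


def extract_mark(blocks, n_bits, repeat=3):
    """Recover each bit by majority vote over its repeated copies."""
    mark = []
    for i in range(n_bits):
        votes = sum(extract_bit(blocks[i + r * n_bits]) for r in range(repeat))
        mark.append(int(votes * 2 > repeat))
    return mark
```

Repeating the mark means that a few corrupted blocks do not flip the recovered bit, which is one plausible reading of the robustness claim. The sketch keeps pixel values as floats and does not clip or round them, so a practical implementation would need a step size large enough to survive that rounding.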
Medical image segmentation has been one of the most actively studied fields in the past few decades. With the development of modern imaging modalities such as magnetic resonance imaging (MRI) and computed tomography (CT), physicians and technicians now have to process an increasing number and size of medical images. Therefore, efficient and accurate computational segmentation algorithms have become necessary to extract the desired information from these large data sets. Moreover, sophisticated segmentation algorithms can help physicians better delineate the anatomical structures present in the input images, enhance the accuracy of medical diagnosis, and facilitate the best treatment planning. Many of the proposed algorithms could perform w
This paper presents the matrix completion problem for image denoising. Three problems based on matrix norms are considered: the spectral norm minimization problem (SNP), the nuclear norm minimization problem (NNP), and the weighted nuclear norm minimization problem (WNNP). In general, an image is represented by a matrix that contains the information of the image. Some of this information is irrelevant or unfavorable, so information completion is used to compress the matrix and remove this unwanted information. The unwanted information is handled by defining a {0,1}-operator under some threshold. Applying this operator on a given ma
We explore the transform coefficients of fractal coding and exploit a new method to improve the compression capabilities of these schemes. In most standard encoder/decoder systems, quantization and de-quantization are managed as a separate step; here we introduce a new method in which they are managed simultaneously. This method achieves additional compression with high image quality, as shown later.