With the rapid development of smart devices, people's lives have become easier, especially for people with visual impairments or other special needs. Recent achievements in machine learning and deep learning let people identify and recognise their surrounding environment. In this study, the efficiency and high performance of deep learning architectures are used to build an image classification system for both indoor and outdoor environments. The proposed methodology starts with collecting two datasets (indoor and outdoor) from several separate datasets. In the second step, each collected dataset is split into training, validation, and test sets. The pre-trained GoogleNet and MobileNet-V2 models are trained using the indoor and outdoor sets, resulting in four trained models. The test sets are used to evaluate the trained models using several evaluation metrics (accuracy, TPR, FNR, PPR, FDR). The GoogleNet models achieve accuracies of 99.34% and 99.76% on the indoor and outdoor datasets, respectively. The MobileNet-V2 models achieve accuracies of 99.27% and 99.68% on the indoor and outdoor sets, respectively. The proposed methodology is compared with similar ones in the field of object recognition and image classification, and the comparative study shows the superiority of the proposed system.
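The evaluation metrics listed above can all be derived from confusion-matrix counts. The following is a minimal sketch, not the authors' evaluation code: the function name `binary_metrics` and the one-vs-rest framing are assumptions, and "PPR" in the abstract is assumed here to mean the positive predictive value (precision), since FDR is its complement.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, TPR, FNR, PPV and FDR from confusion-matrix counts.

    tp, fp, fn, tn are the true-positive, false-positive, false-negative and
    true-negative counts for one class (one-vs-rest in a multi-class setting).
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # True positive rate (recall); defined as 0 when the class never occurs.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # Positive predictive value (precision); assumed to be the abstract's "PPR".
    ppv = tp / (tp + fp) if (tp + fp) else 0.0

    return {
        "accuracy": (tp + tn) / total,
        "TPR": tpr,
        "FNR": 1.0 - tpr,   # false negative rate is the complement of TPR
        "PPV": ppv,
        "FDR": 1.0 - ppv,   # false discovery rate is the complement of PPV
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the call.
    for name, value in binary_metrics(tp=90, fp=5, fn=10, tn=95).items():
        print(f"{name}: {value:.4f}")
```

For a multi-class classifier such as the indoor or outdoor models, the same function would be applied once per class and the results averaged; whether the paper uses macro or weighted averaging is not stated in the abstract.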
In this paper, an approach to object tracking inspired by the human oculomotor system is proposed and verified experimentally. The developed approach is divided into two phases: a fast-tracking (saccadic) phase and a smooth pursuit phase. In the first phase, the field of view is segmented into four regions that are analogous to the retinal periphery in the oculomotor system. When the object of interest enters these regions, the developed vision system responds by changing the pan and tilt angles so that the object lies in the fovea area, after which the second phase is activated. A fuzzy logic method is implemented in the saccadic phase as an intelligent decision maker to select the values of the pan and tilt angles based ...
During COVID-19, wearing a mask was mandated globally in various workplaces, departments, and offices. New classifiers based on deep learning convolutional neural networks (CNNs) were proposed to increase the validation accuracy of face mask detection. This work introduces a face mask model that recognizes whether or not a person is wearing a mask. The proposed model detects and recognizes the face mask in two stages: in the first stage, a Haar cascade detector is used to detect the face, while in the second stage, a CNN built from scratch is used as the classification model. The experiment was conducted on the masked faces (MAFA) dataset with RGB images of 160x160 pixels. The model achieve ...
Face recognition and emotion recognition are important foundations for human-machine interaction. Different algorithms have been developed and tested to recognize a person's face and emotion. In this paper, an enhanced face and emotion recognition algorithm based on deep learning neural networks is implemented. A universal database and personal images were used to test the proposed algorithm. The Python programming language was used to implement the proposed algorithm.
Nowadays, people's expression on the Internet is no longer limited to text. The rise of short video in particular has led to the emergence of large amounts of multi-modal data such as text, pictures, audio, and video. Compared to single-mode data, multi-modal data contains far more information. Mining multi-modal information can help computers better understand human emotional characteristics. However, because multi-modal data show obvious dynamic time series features, the fusion process must solve the dynamic correlation problem both within a single mode and between different modes in the same application scene. To solve this problem, in this paper, a feature extraction framework of ...
An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large image databases, a specified number of images similar in visual and semantic content to a so-called query image. The researchers developed a new retrieval mechanism based mainly on two procedures. The first procedure relies on extracting the statistical features of both the original and traditional images using the histogram and statistical characteristics (mean, standard deviation). The second procedure relies on the T- ...
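The first procedure (histogram plus mean and standard deviation) can be illustrated with a small sketch. This is an assumption-laden illustration rather than the authors' system: the function names, the 8-bin normalised histogram, the flat lists of 8-bit grayscale values, and the Euclidean distance used for ranking are all hypothetical choices. The second procedure is truncated in the abstract and is not covered.

```python
import math


def histogram(pixels, bins=8, max_value=256):
    """Normalised intensity histogram of a flat list of 8-bit pixel values."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        counts[min(int(p / width), bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def features(pixels, bins=8):
    """Feature vector: histogram bins followed by scaled mean and std."""
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    # Scale mean and std to [0, 1] so they are comparable with histogram bins.
    return histogram(pixels, bins) + [mean / 255.0, std / 255.0]


def rank_by_similarity(query, database, top_k=3):
    """Return the names of the top_k database images closest to the query."""
    q = features(query)
    scored = sorted(
        (math.dist(q, features(img)), name) for name, img in database.items()
    )
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Tiny hypothetical "images" given as flat lists of grayscale values.
    database = {
        "dark": [10, 20, 30, 40],
        "mid": [120, 130, 125, 128],
        "bright": [200, 210, 220, 230],
    }
    print(rank_by_similarity([12, 22, 28, 35], database, top_k=2))
```

Ranking by feature distance is what makes "a specified number of images" retrievable: `top_k` plays the role of that number.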
The general health of palm trees, encompassing the roots, stems, and leaves, significantly impacts palm oil production; therefore, meticulous attention is needed to achieve optimal yield. One of the challenges in sustaining productive crops is the prevalence of pests and diseases afflicting oil palm plants. These diseases can harm growth and development, leading to decreased productivity. Oil palm productivity is closely related to the condition of its leaves, which play a vital role in photosynthesis. This research employed a dataset of 1,230 images, consisting of 410 showing leaves, another 410 depicting bagworm infestations, and an additional 410 displaying caterpillar infestations. Furthe ...
Diabetic retinopathy is an eye disease in diabetic patients caused by damage to the small blood vessels in the retina resulting from high and low blood sugar levels. Accurate detection and classification of diabetic retinopathy is an important task in computer-aided diagnosis, especially when planning diabetic retinopathy surgery. Therefore, this study aims to design an automated model based on deep learning that helps ophthalmologists detect and classify diabetic retinopathy severity from fundus images. In this work, a deep convolutional neural network (CNN) with transfer learning and fine-tuning is proposed, using the pre-trained Residual Network-50 (ResNet-50). The overall framework of the proposed ...
Image recognition is one of the most important applications of information processing. In this paper, a comparison of 3-level techniques for image recognition is carried out using discrete wavelet transforms (DWT) and stationary wavelet transforms (SWT) in the following combinations: stationary-stationary-stationary (sss), stationary-stationary-wavelet (ssw), stationary-wavelet-stationary (sws), stationary-wavelet-wavelet (sww), wavelet-stationary-stationary (wss), wavelet-stationary-wavelet (wsw), wavelet-wavelet-stationary (wws), and wavelet-wavelet-wavelet (www). These techniques are compared according to the peak signal-to-noise ratio (PSNR), root mean square error (RMSE), compression ratio (CR), and the coding noise e(n) of each third ...
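Three of the comparison criteria named above have standard closed forms, sketched below under stated assumptions: images are flat lists of 8-bit values (peak 255), and CR is taken as original size divided by compressed size, which is one common convention and may differ from the paper's. The function names are hypothetical, and the coding noise e(n) is not covered because its definition is truncated in the abstract.

```python
import math


def rmse(original, reconstructed):
    """Root mean square error between two equally sized pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.sqrt(mse)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = rmse(original, reconstructed)
    if err == 0:
        return math.inf
    return 20.0 * math.log10(peak / err)


def compression_ratio(original_size, compressed_size):
    """Compression ratio, assumed here to be original size / compressed size."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


if __name__ == "__main__":
    # Hypothetical pixel values, chosen only to illustrate the calls.
    original = [52, 55, 61, 66, 70, 61, 64, 73]
    reconstructed = [54, 55, 60, 67, 69, 62, 64, 72]
    print(f"RMSE: {rmse(original, reconstructed):.3f}")
    print(f"PSNR: {psnr(original, reconstructed):.2f} dB")
    print(f"CR:   {compression_ratio(4096, 1024):.1f}")
```

Higher PSNR and lower RMSE indicate a reconstruction closer to the original, so the two metrics move in opposite directions when the decomposition variants are ranked.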
The huge number of documents on the internet has created a pressing need for text classification (TC), which is used to organize these text documents. In this research paper, a new model based on the Extreme Learning Machine (ELM) is proposed. The proposed model consists of several phases: preprocessing, feature extraction, Multiple Linear Regression (MLR), and ELM. The basic idea of the proposed model is to calculate feature weights using MLR. These feature weights, together with the extracted features, are fed as input to the ELM, producing a weighted Extreme Learning Machine (WELM). The results showed that the proposed WELM outperforms the ELM.
Optical Character Recognition (OCR) is the process of converting an image of text into a machine-readable text format. The classification of Arabic manuscripts in general is part of this field. In recent years, the processing of Arabic image databases by deep learning architectures has developed remarkably. However, this remains insufficient given the enormous wealth of Arabic manuscripts. In this research, a deep learning architecture is used to address the problem of classifying handwritten Arabic letters. The method is based on a convolutional neural network (CNN) architecture that acts as both feature extractor and classifier. Considering the nature of the dataset images (binary images), the contours of the alphabet ...