With the rapid development of smart devices, people's lives have become easier, especially for visually disabled or special-needs people. The new achievements in the fields of machine learning and deep learning let people identify and recognise the surrounding environment. In this study, the efficiency and high performance of deep learning architecture are used to build an image classification system in both indoor and outdoor environments. The proposed methodology starts with collecting two datasets (indoor and outdoor) from different separate datasets. In the second step, the collected dataset is split into training, validation, and test sets. The pre-trained GoogleNet and MobileNet-V2 models are trained using the indoor and outdoor sets, resulting in four trained models. The test sets are used to evaluate the trained models using many evaluation metrics (accuracy, TPR, FNR, PPR, FDR). Results of Google Net model indicate the high performance of the designed models with 99.34% and 99.76% accuracies for indoor and outdoor datasets, respectively. For Mobile Net models, the result accuracies are 99.27% and 99.68% for indoor and outdoor sets, respectively. The proposed methodology is compared with similar ones in the field of object recognition and image classification, and the comparative study proves the transcendence of the propsed system.
In this paper, an approach for object tracking that is inspired from human oculomotor system is proposed and verified experimentally. The developed approach divided into two phases, fast tracking or saccadic phase and smooth pursuit phase. In the first phase, the field of the view is segmented into four regions that are analogue to retinal periphery in the oculomotor system. When the object of interest is entering these regions, the developed vision system responds by changing the values of the pan and tilt angles to allow the object lies in the fovea area and then the second phase will activate. A fuzzy logic method is implemented in the saccadic phase as an intelligent decision maker to select the values of the pan and tilt angle based
... Show MoreImage classification is the process of finding common features in images from various classes and applying them to categorize and label them. The main problem of the image classification process is the abundance of images, the high complexity of the data, and the shortage of labeled data, presenting the key obstacles in image classification. The cornerstone of image classification is evaluating the convolutional features retrieved from deep learning models and training them with machine learning classifiers. This study proposes a new approach of “hybrid learning” by combining deep learning with machine learning for image classification based on convolutional feature extraction using the VGG-16 deep learning model and seven class
... Show MoreThis paper presents a hierarchical two-stage outdoor scene classification method using multi-classes of Support Vector Machine (SVM). In this proposed method, the gist feature of all the images in the database is extracted first to obtain the feature vectors. The image of database is classified into eight outdoor scenes classes, four manmade scenes and four natural scenes. Second, a hierarchical classification is applied, where the first stage classifies all manmade scene classes against all natural scene classes, while the second stage of a hierarchical classification classifies the outputs of first stage into either one of the four manmade scene classes or natural scene classes. Binary SVM and multi-classes SVMs are employed in the fir
... Show MoreText categorization refers to the process of grouping text or documents into classes or categories according to their content. Text categorization process consists of three phases which are: preprocessing, feature extraction and classification. In comparison to the English language, just few studies have been done to categorize and classify the Arabic language. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and a comparison for researchers in the last five years based on the dataset, year, algorithms and the accu
... Show MoreThe transition of customers from one telecom operator to another has a direct impact on the company's growth and revenue. Traditional classification algorithms fail to predict churn effectively. This research introduces a deep learning model for predicting customers planning to leave to another operator. The model works on a high-dimensional large-scale data set. The performance of the model was measured against other classification algorithms, such as Gaussian NB, Random Forrest, and Decision Tree in predicting churn. The evaluation was performed based on accuracy, precision, recall, F-measure, Area Under Curve (AUC), and Receiver Operating Characteristic (ROC) Curve. The proposed deep learning model performs better than othe
... Show MoreDuring COVID-19, wearing a mask was globally mandated in various workplaces, departments, and offices. New deep learning convolutional neural network (CNN) based classifications were proposed to increase the validation accuracy of face mask detection. This work introduces a face mask model that is able to recognize whether a person is wearing mask or not. The proposed model has two stages to detect and recognize the face mask; at the first stage, the Haar cascade detector is used to detect the face, while at the second stage, the proposed CNN model is used as a classification model that is built from scratch. The experiment was applied on masked faces (MAFA) dataset with images of 160x160 pixels size and RGB color. The model achieve
... Show MoreNowadays, people's expression on the Internet is no longer limited to text, especially with the rise of the short video boom, leading to the emergence of a large number of modal data such as text, pictures, audio, and video. Compared to single mode data ,the multi-modal data always contains massive information. The mining process of multi-modal information can help computers to better understand human emotional characteristics. However, because the multi-modal data show obvious dynamic time series features, it is necessary to solve the dynamic correlation problem within a single mode and between different modes in the same application scene during the fusion process. To solve this problem, in this paper, a feature extraction framework of
... Show MoreDetection and classification of animals is a major challenge that is facing the researchers. There are five classes of vertebrate animals, namely the Mammals, Amphibians, Reptiles, Birds, and Fish, and each type includes many thousands of different animals. In this paper, we propose a new model based on the training of deep convolutional neural networks (CNN) to detect and classify two classes of vertebrate animals (Mammals and Reptiles). Deep CNNs are the state of the art in image recognition and are known for their high learning capacity, accuracy, and robustness to typical object recognition challenges. The dataset of this system contains 6000 images, including 4800 images for training. The proposed algorithm was tested by using 1200
... Show MoreThe general health of palm trees, encompassing the roots, stems, and leaves, significantly impacts palm oil production, therefore, meticulous attention is needed to achieve optimal yield. One of the challenges encountered in sustaining productive crops is the prevalence of pests and diseases afflicting oil palm plants. These diseases can detrimentally influence growth and development, leading to decreased productivity. Oil palm productivity is closely related to the conditions of its leaves, which play a vital role in photosynthesis. This research employed a comprehensive dataset of 1,230 images, consisting of 410 showing leaves, another 410 depicting bagworm infestations, and an additional 410 displaying caterpillar infestations. Furthe
... Show MoreDiabetic retinopathy is an eye disease in diabetic patients due to damage to the small blood vessels in the retina due to high and low blood sugar levels. Accurate detection and classification of Diabetic Retinopathy is an important task in computer-aided diagnosis, especially when planning for diabetic retinopathy surgery. Therefore, this study aims to design an automated model based on deep learning, which helps ophthalmologists detect and classify diabetic retinopathy severity through fundus images. In this work, a deep convolutional neural network (CNN) with transfer learning and fine tunes has been proposed by using pre-trained networks known as Residual Network-50 (ResNet-50). The overall framework of the proposed
... Show More