This paper addresses semantic word similarity estimation, an important research field. A new feature-based algorithm is proposed for measuring word semantic similarity in the Arabic language. Arabic is a highly systematic language whose words exhibit elegant and rigorous logic. The semantic similarity score between two Arabic words is calculated as a function of their common and total taxonomical features. An Arabic knowledge source is employed to extract the taxonomical features as the set of all concepts that subsume the concepts containing the compared words. Previously developed Arabic word benchmark datasets are used to optimize and evaluate the proposed algorithm. In this paper, ...
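The abstract does not state the exact scoring function, so the following is only a minimal sketch of one common feature-based form: the ratio of shared to total taxonomical features. The toy taxonomy, the English placeholder words, the choice to count a concept as one of its own features, and the names `ancestors` and `feature_similarity` are all assumptions made for illustration, not the authors' algorithm or knowledge source.

```python
def ancestors(concept, parents):
    """Collect the concept and every concept that subsumes it (assumed feature set)."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def feature_similarity(word_a, word_b, word_concepts, parents):
    """Ratio of common to total taxonomical features (an assumed, Jaccard-style form)."""
    feats_a = set().union(*(ancestors(c, parents) for c in word_concepts[word_a]))
    feats_b = set().union(*(ancestors(c, parents) for c in word_concepts[word_b]))
    total = feats_a | feats_b
    return len(feats_a & feats_b) / len(total) if total else 0.0


# Hypothetical toy taxonomy: child concept -> list of parent concepts.
parents = {
    "cat": ["feline"], "feline": ["mammal"],
    "dog": ["canine"], "canine": ["mammal"],
    "mammal": ["animal"], "car": ["vehicle"],
}
# Hypothetical mapping from a word to the concepts that contain it.
word_concepts = {"cat": ["cat"], "dog": ["dog"], "car": ["car"]}

print(feature_similarity("cat", "dog", word_concepts, parents))
print(feature_similarity("cat", "car", word_concepts, parents))
```

In this toy example "cat" and "dog" share two of six features (mammal, animal), while "cat" and "car" share none.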
Estimating the semantic similarity between short texts plays an increasingly prominent role in many fields related to text mining and natural language processing, especially given the large increase in the volume of textual data produced daily. Traditional approaches that calculate the degree of similarity between two texts from the words they share do not perform well on short texts, because two similar texts may be written in different terms by employing synonyms. As a result, short texts should be compared semantically. In this paper, a method for measuring semantic similarity between texts is presented which combines knowledge-based and corpus-based semantic information to build a semantic network that repre...
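The weakness of shared-word similarity on short texts can be seen in a small sketch. The synonym table below is a hand-made, hypothetical stand-in for knowledge-based information; it is not the semantic network the abstract describes, and the names `word_overlap`, `normalize`, and `SYNONYMS` are illustrative only.

```python
def word_overlap(text_a, text_b):
    """Shared-word similarity: shared words divided by all distinct words."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


# Hypothetical synonym table mapping each word to a canonical form.
SYNONYMS = {"purchased": "bought", "automobile": "car", "an": "a"}


def normalize(text):
    """Replace each word by its canonical synonym, if one is listed."""
    return " ".join(SYNONYMS.get(word, word) for word in text.lower().split())


a = "He bought a car"
b = "He purchased an automobile"

print(word_overlap(a, b))                        # only "he" is shared
print(word_overlap(normalize(a), normalize(b)))  # identical after normalization
```

The two sentences mean the same thing, yet they share one word out of seven, so the raw overlap score is 1/7; once synonyms are mapped to a common form the score becomes 1.0.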
Cassava, a significant crop in Africa, Asia, and South America, is a staple food for millions. However, classifying cassava species using conventional color, texture, and shape features is inefficient, as cassava leaves look similar across different types, including toxic and non-toxic varieties. This research aims to overcome the limitations of traditional classification methods by employing deep learning techniques, with pre-trained AlexNet as the feature extractor, to accurately classify four types of cassava: Gajah, Manggu, Kapok, and Beracun. The dataset was collected from local farms in Lamongan, Indonesia, with the help of agricultural research experts. The dataset consists of 1,400 images, and each type of cassava has...
In this paper, a method to determine whether an image is forged (spliced) or not is presented. The proposed method is based on a classification model that determines the authenticity of a tested image. Image splicing causes many sharp edges (high frequencies) and discontinuities to appear in the spliced image. Capturing these high frequencies in the wavelet domain rather than in the spatial domain is investigated in this paper. The correlation between the coefficients of the high-frequency sub-bands of the Discrete Wavelet Transform (DWT) is described using a co-occurrence matrix. This matrix is used as the input feature vector to a classifier. The best accuracies of 92.79% and 94.56% were achieved on the Casia v1.0 and Casia v2.0 datasets, respectively. This pe...
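To make the feature pipeline concrete, here is a rough sketch of the two steps named above: a one-level DWT that yields high-frequency sub-bands, followed by a co-occurrence matrix of neighbouring coefficients. The abstract does not name the wavelet, the quantization, or the neighbour direction, so the Haar wavelet, the clipping `threshold`, the horizontal-neighbour choice, and the function names are all assumptions.

```python
def haar_detail_bands(image):
    """One-level 2D Haar transform; returns the three high-frequency sub-bands."""
    row_detail, col_detail, diag_detail = [], [], []
    for r in range(0, len(image) - 1, 2):
        rd, cd, dd = [], [], []
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            e, f = image[r + 1][c], image[r + 1][c + 1]
            rd.append((a + b - e - f) / 2)  # difference between the two rows
            cd.append((a - b + e - f) / 2)  # difference between the two columns
            dd.append((a - b - e + f) / 2)  # diagonal difference
        row_detail.append(rd)
        col_detail.append(cd)
        diag_detail.append(dd)
    return row_detail, col_detail, diag_detail


def cooccurrence(band, threshold=2):
    """Count horizontally adjacent pairs of coefficients clipped to [-threshold, threshold]."""
    size = 2 * threshold + 1
    matrix = [[0] * size for _ in range(size)]
    for row in band:
        # Round, clip, then shift so that indices start at 0.
        q = [max(-threshold, min(threshold, round(v))) + threshold for v in row]
        for left, right in zip(q, q[1:]):
            matrix[left][right] += 1
    return matrix


# A flat image has no high frequencies; a sharp step produces large coefficients.
flat = [[7] * 4 for _ in range(4)]
step = [[0, 0, 0, 10] for _ in range(4)]

_, flat_cols, _ = haar_detail_bands(flat)
_, step_cols, _ = haar_detail_bands(step)
print(cooccurrence(flat_cols))
print(cooccurrence(step_cols))
```

For the flat image every count falls in the centre cell (zero next to zero); the sharp step moves the counts to an off-centre cell, which is the kind of difference a classifier could learn from once the matrix is flattened into a feature vector.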
In information security, fingerprint verification is one of the most common recent approaches for verifying human identity through a distinctive pattern. The verification process works by comparing a pair of fingerprint templates and identifying the similarity between them. Several research studies have used different techniques for the matching process, such as fuzzy vault and image filtering approaches. Yet these approaches still suffer from imprecise articulation of the biometrics’ interesting patterns. Deep learning architectures such as the Convolutional Neural Network (CNN) have been extensively used for image processing and object detection tasks and have shown outstanding performance compare...
A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant challenge in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover recently explored approaches to solving TC problems with FS, such as optimization techniques. ...
In many video and image processing applications, the frames are partitioned into blocks, which are extracted and processed sequentially. In this paper, we propose a fast algorithm for calculating the features of overlapping image blocks. We assume the features are projections of the block on separable 2D basis functions (usually orthogonal polynomials), where we benefit from the symmetry with respect to the spatial variables. The main idea is based on constructing auxiliary matrices that virtually extend the original image and make it possible to avoid time-consuming computation in loops. These matrices can be pre-calculated, stored, and used repeatedly, since they are independent of the image itself. We validated experimentally th...
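For orientation, the sketch below shows the straightforward loop-based computation that such a fast algorithm is meant to avoid; it is not the proposed method and does not build the auxiliary matrices. Each block is projected on a separable basis by applying the 1D basis along one axis and then along the other. The choice of Legendre polynomials of degree 0 to 2 sampled on [-1, 1], the step of one pixel between blocks, and all function names are assumptions.

```python
def legendre_basis(block_size):
    """Legendre polynomials P0..P2 sampled at block_size points on [-1, 1] (needs size >= 2)."""
    xs = [-1 + 2 * i / (block_size - 1) for i in range(block_size)]
    return [
        [1.0 for _ in xs],
        [x for x in xs],
        [(3 * x * x - 1) / 2 for x in xs],
    ]


def project_block(block, basis):
    """Separable projection: F[m][n] = sum over x, y of basis[m][x] * block[x][y] * basis[n][y]."""
    size = len(block)
    # First pass along the rows index x, second pass along the column index y.
    partial = [[sum(p[x] * block[x][y] for x in range(size)) for y in range(size)]
               for p in basis]
    return [[sum(row[y] * q[y] for y in range(size)) for q in basis] for row in partial]


def block_features(image, block_size):
    """Naive baseline: extract every overlapping block and project it in a loop."""
    basis = legendre_basis(block_size)  # independent of the image, so built once
    rows, cols = len(image), len(image[0])
    features = {}
    for r in range(rows - block_size + 1):
        for c in range(cols - block_size + 1):
            block = [line[c:c + block_size] for line in image[r:r + block_size]]
            features[(r, c)] = project_block(block, basis)
    return features


image = [[1.0] * 5 for _ in range(4)]  # constant 4x5 test image
features = block_features(image, 3)
print(len(features))
print(features[(0, 0)][0][0])
```

Because neighbouring blocks share most of their pixels, this baseline repeats a large amount of arithmetic, which is the redundancy a fast algorithm for overlapping blocks targets.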