Estimating the semantic similarity between short texts plays an increasingly prominent role in many text mining and natural language processing applications, especially given the large increase in the volume of textual data produced daily. Traditional approaches, which calculate the degree of similarity between two texts from the words they share, do not perform well on short texts, because two similar texts may be written in different terms by employing synonyms. Short texts should therefore be compared semantically. This paper presents a method for measuring semantic similarity between texts that combines knowledge-based and corpus-based semantic information to build a semantic network; the network represents the relationship between the compared texts and is used to extract the degree of similarity between them. Representing a text as a semantic network is the knowledge representation that comes closest to the human mind's understanding of texts, since the network reflects the sentence's semantic, syntactic, and structural knowledge. The network is a visual representation of knowledge objects, their qualities, and their relationships. The WordNet lexical database is used as the knowledge-based source, and the GloVe pre-trained word embedding vectors are used as the corpus-based source. The proposed method was tested on three datasets: DSCS, SICK, and MOHLER. Good results were obtained in terms of root mean squared error (RMSE) and mean absolute error (MAE).
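As an illustration of the two evaluation metrics named in this abstract, the following minimal Python sketch computes RMSE and MAE between predicted and reference similarity scores. The function names and the score values are hypothetical and are not taken from the paper; the sketch only shows how the two error measures are defined.

```python
import math


def rmse(predicted, gold):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = sum((p - g) ** 2 for p, g in zip(predicted, gold))
    return math.sqrt(squared / len(gold))


def mae(predicted, gold):
    """Mean absolute error between two equal-length score lists."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical similarity scores on a 0-1 scale (illustrative only).
predicted_scores = [0.80, 0.35, 0.60, 0.10]
gold_scores = [0.90, 0.25, 0.50, 0.20]

print(f"RMSE = {rmse(predicted_scores, gold_scores):.3f}")
print(f"MAE  = {mae(predicted_scores, gold_scores):.3f}")
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether the errors are uniform or dominated by a few outliers.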
Content-based image retrieval has been actively developed in numerous fields. It provides more effective management and retrieval of images than the keyword-based method, and has therefore become one of the most active research areas of the past few years. Given a set of objects, information retrieval offers ways to search for those objects that match a particular description. The objects considered can be documents, images, videos, or sounds. This paper proposes a method to retrieve a multi-view face from a large face database according to color and texture attributes. Some of the features used for retrieval are color attributes such as the mean, the variance, and the color image's bitmap. In add...
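The color attributes mentioned in this abstract (mean and variance) can be sketched as follows. This is a minimal illustration, assuming an image is given as a list of (R, G, B) tuples; the function name `color_moments` and the toy pixel values are hypothetical, and the paper's bitmap and texture features are not covered.

```python
def color_moments(pixels):
    """Return a (mean, variance) pair for each of the R, G, B channels.

    `pixels` is a list of (R, G, B) tuples; the variance is the population
    variance of each channel.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    moments = []
    for channel in zip(*pixels):  # regroup the tuples into R, G, B sequences
        mean = sum(channel) / len(channel)
        variance = sum((v - mean) ** 2 for v in channel) / len(channel)
        moments.append((mean, variance))
    return moments


# Hypothetical 2x2 image flattened into four pixels (illustrative only).
toy_pixels = [(255, 0, 0), (255, 128, 0), (0, 128, 0), (0, 0, 0)]

for name, (mean, variance) in zip("RGB", color_moments(toy_pixels)):
    print(f"{name}: mean={mean:.2f}, variance={variance:.2f}")
```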
Document clustering is the process of organizing a particular electronic corpus of documents into subgroups with similar text features. Previously, a number of conventional algorithms were applied to perform document clustering. Current efforts aim to enhance clustering performance by employing evolutionary algorithms, and this has become an emerging topic that has gained more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It first provides a comprehensive examination of the document clustering model, revealing its various components and related concepts. Then it shows and analyzes the principal research wor...
An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large image databases, a specified number of images similar in visual and semantic content to a so-called query image. The researchers developed a new mechanism for retrieval systems that is based mainly on two procedures. The first procedure extracts the statistical features of both the original and the traditional image by using the histogram and statistical characteristics (mean, standard deviation). The second procedure relies on the T-...
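The first procedure described above (histogram plus mean and standard deviation) can be sketched in Python as follows. This is an assumption-laden illustration, not the authors' implementation: the image is taken to be a flat list of 8-bit gray levels, and the function names, the bin count, and the toy pixel values are all hypothetical.

```python
import math


def gray_histogram(pixels, bins=4, max_value=255):
    """Normalized histogram of gray levels using equal-width bins."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    width = (max_value + 1) / bins
    counts = [0] * bins
    for value in pixels:
        index = min(int(value // width), bins - 1)  # clamp the top edge
        counts[index] += 1
    return [count / len(pixels) for count in counts]


def mean_and_std(pixels):
    """Mean and population standard deviation of the gray levels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    mean = sum(pixels) / len(pixels)
    variance = sum((v - mean) ** 2 for v in pixels) / len(pixels)
    return mean, math.sqrt(variance)


# Hypothetical gray-level image flattened into a list (illustrative only).
toy_image = [0, 10, 70, 130, 200, 255, 255, 64]

print("histogram:", gray_histogram(toy_image))
print("mean, std:", mean_and_std(toy_image))
```

A feature vector for retrieval could then concatenate the histogram bins with the mean and standard deviation, although the exact composition used in the paper is not stated in the abstract.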
Alzheimer’s disease (AD) is an age-related, progressive neurodegenerative disorder characterized by loss of memory and cognitive decline. It is the main cause of disability among older people. The rapid increase in the number of people living with AD and other forms of dementia due to the aging population represents a major challenge to health and social care systems worldwide. Degeneration of brain cells due to AD starts many years before the clinical manifestations become clear. Early diagnosis of AD will contribute to the development of effective treatments that could slow, stop, or prevent significant cognitive decline. Consequently, early diagnosis of AD may also be valuable in detecting patients with dementia who have n...
This study assessed the advantage of using earthworms in combination with punch waste and nutrients in remediating drill cuttings contaminated with hydrocarbons. Analyses were performed on days 0, 7, 14, 21, and 28 of the experiment. Two hydrocarbon concentrations were used (20000 mg/kg and 40000 mg/kg) with three earthworm group sizes: five, ten, and twenty earthworms. After 28 days, the total petroleum hydrocarbon (TPH) concentration in the 20000 mg/kg treatments was reduced to 13200 mg/kg, 9800 mg/kg, and 6300 mg/kg with five, ten, and twenty earthworms, respectively. Likewise, the TPH concentration in the 40000 mg/kg treatments was reduced to 22000 mg/kg, 10100 mg/kg, and 4200 mg/kg with five, ten, and twenty earthworms, respectively. The p...
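The reported concentrations can be converted into percentage reductions with a short calculation. The sketch below uses only the TPH values stated in this abstract; the function name and the data layout are illustrative choices, and the percentages are derived here rather than quoted from the study.

```python
def percent_reduction(initial, final):
    """Percentage decrease from an initial to a final concentration."""
    return 100.0 * (initial - final) / initial


# TPH after 28 days (mg/kg), keyed by initial concentration and then by
# number of earthworms, as reported in the abstract.
tph_after_28_days = {
    20000: {5: 13200, 10: 9800, 20: 6300},
    40000: {5: 22000, 10: 10100, 20: 4200},
}

for initial, by_worms in tph_after_28_days.items():
    for worms, final in by_worms.items():
        reduction = percent_reduction(initial, final)
        print(f"{initial} mg/kg, {worms:>2} earthworms: {reduction:.2f}% reduction")
```

Under this calculation the reduction grows with the number of earthworms at both starting concentrations, from 34% to 68.5% at 20000 mg/kg and from 45% to 89.5% at 40000 mg/kg.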
Image retrieval is used to search for images in an image database. In this paper, content-based image retrieval (CBIR) using four feature extraction techniques has been achieved. The four techniques are the color histogram features technique, the properties features technique, the gray-level co-occurrence matrix (GLCM) statistical features technique, and a hybrid technique. The features are extracted from the database images and the query (test) images in order to compute the similarity measure. Similarity-based matching is very important in CBIR, so three types of similarity measure are used: normalized Mahalanobis distance, Euclidean distance, and Manhattan distance. A comparison between them has been carried out. From the results, it is conclud...
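Two of the similarity measures named above, Euclidean and Manhattan distance, can be sketched as follows, together with a simple ranking of database images against a query. The feature vectors, image names, and the `rank_database` helper are hypothetical; the normalized Mahalanobis distance is omitted because the abstract does not specify how it is normalized.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_database(query, database, distance):
    """Return image names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query, database[name]))


# Hypothetical feature vectors (illustrative only).
database = {
    "img_a": [0.20, 0.50, 0.10],
    "img_b": [0.90, 0.10, 0.40],
    "img_c": [0.25, 0.45, 0.20],
}
query = [0.20, 0.50, 0.15]

print("Euclidean ranking:", rank_database(query, database, euclidean))
print("Manhattan ranking:", rank_database(query, database, manhattan))
```

A smaller distance means a closer match, so the first name in each ranking is the retrieved image most similar to the query under that measure.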
Plagiarism is becoming more of a problem in academia. It is made worse by the ease with which a wide range of resources can be found on the internet, as well as the ease with which they can be copied and pasted. It is academic theft, since the perpetrator has "taken" and presented the work of others as his or her own. Manual detection of plagiarism by a human being is difficult, imprecise, and time-consuming, because it is difficult for anyone to compare a given work against all existing data. Plagiarism is a big problem in higher education, and it can happen on any topic. Plagiarism detection has been studied in many scientific articles, and methods for recognition have been created utilizing plagiarism analysis, authorship identification, and...