Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques that support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional text representation methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents are not resolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure. The semantics of words are then modeled using the dependency graph, which is used in the cluster analysis. The K-means clustering technique was used in this study. The dependency-graph-based clustering results were compared with those of popular text representation methods, i.e., TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both TF-IDF and ontology-based text representation. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
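The limitation of the bag-of-words baseline described above can be illustrated with a minimal sketch. This is not the study's implementation: the toy sentences, the function names `tfidf_vectors` and `cosine`, and the unsmoothed weighting `tf * log(N / df)` are assumptions chosen for illustration. It shows that two sentences with opposite meanings receive identical TF-IDF vectors because word order and dependencies are discarded, and that each vector is mostly zeros even for a tiny vocabulary.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Relative term frequency times unsmoothed inverse document frequency.
        vec = [
            (counts[t] / len(doc)) * math.log(n_docs / df[t]) if counts[t] else 0.0
            for t in vocab
        ]
        vectors.append(vec)
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: the first two sentences differ only in word order.
docs = [
    "the dog chased the cat".split(),
    "the cat chased the dog".split(),
    "stocks fell sharply today".split(),
]
vocab, vectors = tfidf_vectors(docs)

# Who chased whom is lost: both sentences map to the same vector.
same_representation = vectors[0] == vectors[1]

# Sparsity: zero entries per vector, out of len(vocab) dimensions.
zeros_per_doc = [sum(1 for x in vec if x == 0.0) for vec in vectors]
```

A representation that keeps syntactic relations, such as the dependency graph the abstract proposes, would be able to distinguish the first two sentences, since their subject and object roles differ.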
Abstract
This study investigates the mechanical compression properties of tin-lead and lead-free alloy spherical balls, using more than 500 samples to identify the statistical variability of the properties of each alloy. Isothermal aging was performed to study and compare the effect of aging on the microstructure and properties.
The results showed significant elastic and plastic anisotropy of the tin phase in lead-free tin-based solder. These results were compared with simulations using a Crystal Plasticity Finite Element (CPFE) method that incorporates the anisotropy of Sn. The simulations and experiments were in good agreement, indicating the range of values expected with anisotropic properties.
The present article discusses the problems of understanding and translating the linguistic and cultural aspects of a foreign literary text. The article considers the translation process through the prism of cultural orientation. In the process of translation, the national cultural identity should be expressed to the maximum extent, through all means of expression, including imagery and intonation. In addition to the author's style, special attention should also be paid to tropes, phraseological units, colloquial words and dial...
The research aims to analyze the current financial crisis in Iraq by identifying its causes and then proposing solutions that help remedy the crisis on the level of both expenditures and revenues. It relies on the Federal General Budget Law of the Republic of Iraq for the fiscal year 2016 to obtain the data on current expenditures and revenues needed to achieve the research objective. The research reached a set of conclusions, the most important of which is that the current financial crisis in Iraq is mainly caused by increased expenditures, especially current ones, and by a lack of revenues, especially non-oil o...
Since the introduction of HTTP/3, research has focused on evaluating its influence on existing HTTP adaptive streaming (HAS). Within this research, the cross-protocol unfairness between HAS over HTTP/3 (HAS/3) and HAS over HTTP/2 (HAS/2), which arises from their different transport protocols, has attracted considerable attention. It has been found that HAS/3 clients tend to request higher bitrates than HAS/2 clients because the QUIC transport obtains more bandwidth for its HAS/3 clients than TCP does for its HAS/2 clients. As the problem originates in the transport layer, server-based unfairness solutions are likely to help clients overcome it. Therefore, in this paper, an experimental study of the se...
Gender classification is a critical task in computer vision. This task holds substantial importance in various domains, including surveillance, marketing, and human-computer interaction. In this work, the proposed face gender classification model consists of three main phases. The first phase applies the Viola-Jones algorithm to detect facial images, which includes four steps: 1) Haar-like features, 2) Integral Image, 3) AdaBoost Learning, and 4) Cascade Classifier. In the second phase, four pre-processing operations are employed, namely cropping, resizing, converting the image from the RGB color space to the LAB color space, and enhancing the images using histogram equalization (HE) and contrast-limited adaptive histogram equalization (CLAHE). The final phase involves utilizing Transfer lea...
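The first two Viola-Jones steps named in the abstract above, Haar-like features and the integral image, can be sketched as follows. This is an illustrative sketch, not the paper's code: the function names, the zero-padded table layout, and the left-minus-right sign convention of the two-rectangle feature are assumptions. The point it shows is that, once the integral image is built, the sum over any rectangle needs only four table lookups, which is what makes evaluating many Haar-like features cheap.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of img over rows < y and columns < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Horizontal two-rectangle feature: left half minus right half.

    The sign convention is an assumption; implementations differ.
    """
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical 3x4 grayscale patch.
patch = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(patch)
whole = rect_sum(ii, 0, 0, 3, 4)
inner = rect_sum(ii, 1, 1, 2, 2)
feature = two_rect_haar(ii, 0, 0, 3, 4)
```

In the full algorithm, AdaBoost selects a small set of such features and the cascade classifier applies them in stages; those steps are not covered by this sketch.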
Content-based image retrieval has been actively developed in numerous fields. It provides more effective management and retrieval of images than the keyword-based method, and it has become one of the most active research areas of the past few years. Given a set of objects, information retrieval provides ways to search for those that match a particular description. The objects considered may be documents, images, videos, or sounds. This paper proposes a method to retrieve a multi-view face from a large face database according to color and texture attributes. Some of the features used for retrieval are color attributes such as the mean, the variance, and the color image's bitmap. In add...