Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends heavily on the choice of text representation method. Traditional text representation methods model documents as bags of words using term frequency-inverse document frequency (TFIDF). This approach ignores the relationships among words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents are not resolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure. The semantics of the words are then modeled using a dependency graph, and the resulting graph is used in cluster analysis. The K-means clustering technique was used in this study. The dependency graph based clustering results were compared with those of popular text representation methods, i.e. TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation. The findings show that the proposed text representation method leads to more accurate document clustering results.
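As a rough illustration of the TFIDF baseline that the abstract compares against (not the proposed dependency graph method), the sketch below builds bag-of-words TFIDF vectors and clusters them with K-means. The function names `tfidf_vectors` and `kmeans`, the whitespace tokenizer, the toy documents, and the fixed initial centroids are all assumptions made for this example; the study's actual pre-processing and parameters are not given in the abstract.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalized TFIDF vector per document.

    Assumes every document is non-empty and that splitting on whitespace
    is an acceptable tokenizer (a simplification for this sketch).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [(counts[t] / len(toks)) * math.log(n_docs / df[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(vectors, init_centroids, n_iter=10):
    """Plain K-means with squared Euclidean distance and given start centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each vector to its nearest centroid.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centroids[k])),
            )
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy corpus (hypothetical, not the 20 Newsgroups data).
docs = [
    "the cat chased the mouse",
    "a cat and a mouse played",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
]
vectors = tfidf_vectors(docs)
labels = kmeans(vectors, init_centroids=[vectors[0], vectors[2]])
print(labels)  # [0, 0, 1, 1]
```

Each vector has one dimension per vocabulary term, and most entries are zero even in this tiny corpus, which is the sparsity problem the abstract refers to. Two documents are only considered similar when they share exact tokens, so word relationships and meanings play no role.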
The cartographic representation of geographical phenomena is considered an essential basis for geographical analysis, since it supports the study of vertical housing, that is, apartment buildings, which represent a civil phenomenon. The aim of this study is to highlight the role of cartographic representation methods in geographical analysis and to compare the economic and social aspects of two approaches: the vertical and the construction building methods.
The horizontal expansion of a city is a problem in itself because it leads to the loss of civil land and encroaches on agricultural land. As a result, the share of land used for housing is very large compared with the other land uses in cities; therefore, many co
Wireless sensor applications are subject to energy constraints. Most of the energy is consumed in communication between wireless nodes. Clustering and data aggregation are the two most widely used strategies for reducing energy usage and increasing the lifetime of wireless sensor networks. In target tracking applications, a large amount of redundant data is produced regularly. Hence, deploying effective data aggregation schemes is vital to eliminate data redundancy. This work aims to conduct a comparative study of various research approaches that employ clustering techniques for efficiently aggregating data in target tracking applications, since selecting an appropriate clustering algorithm may yield positive results in the data aggregati
Loanwords are words transferred from one language to another that become an essential part of the borrowing language. Loanwords pass from the source language to the recipient language for many reasons. Detecting them is a complicated task because there are no standard specifications for transferring words between languages, and detection accuracy is therefore low. This work tries to improve the accuracy of detecting loanwords between Turkish and Arabic as a case study. The proposed system finds all possible loanwords using any set of characters, arranged either alphabetically or randomly. It then processes the distortion in pronunciation and solves the problem of the missing lette
This paper is concerned with introducing and studying the o-space by using the out-degree system (resp. the i-space by using the in-degree system), which are the core concepts of this paper. In addition, the m-lower approximations, the m-upper approximations and o-space and i-space. Furthermore, we introduce near supraopen (near supraclosed) d. g.'s. Finally, the supra-lower approximation, supra-upper approximation and supra-accuracy are defined, and some of their properties are investigated.
Identity is an influential and flexible concept in the social sciences and political studies. The basic sense of identity is the search for uniqueness. In one sense, it is a sign of identification with those we assume to be similar to us, at least in some significant ways. Globalization, migration, modern technologies, media and political conflicts are argued to have a crucial effect on identity representation from a political perspective, specifically in the United States of America. This paper investigates how American politicians represent their identities in speeches delivered in different periods, namely from 2015 to 2018, in terms of the pragmatic paradigm. Three randomly selected speeches by fa
The need for steganography methods that hide secret messages in images has been growing. This study therefore develops a practical steganography procedure for hiding text in an image. The procedure lets the user supply both the text and a cover image, and produces a resulting image that contains the hidden text. The suggested technique hides the text inside the header formats of a digital image. The Least Significant Bit (LSB) method is used to hide the message or text in order to preserve the features and characteristics of the original image. A new method is applied that uses the whole image (header formats) to hide the image. From the experimental results, the suggested technique gives a higher embe
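To make the LSB idea concrete, here is a minimal sketch of generic least-significant-bit embedding over a raw byte buffer. It is not the header-based placement described in the abstract, whose details are not given. The names `embed_lsb` and `extract_lsb`, the 4-byte length prefix, and the synthetic `cover` buffer are assumptions introduced only for this illustration.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytearray:
    """Hide `message` in the least significant bits of `cover`, one bit per byte."""
    # Assumed convention: a 32-bit big-endian length prefix tells the
    # extractor how many message bytes follow.
    payload = len(message).to_bytes(4, "big") + message
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the cover byte, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(stego: bytes) -> bytes:
    """Recover a message written by `embed_lsb`."""

    def read_bytes(start_bit: int, n_bytes: int) -> bytes:
        out = bytearray()
        for b in range(n_bytes):
            value = 0
            for j in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + j] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


# Synthetic stand-in for image bytes (hypothetical, not a real image file).
cover = bytes(range(256)) * 2
secret = b"hidden text"
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego)
print(recovered)  # b'hidden text'
```

Because only the lowest bit of each carrier byte can change, every byte of `stego` differs from the corresponding byte of `cover` by at most 1, which is why LSB embedding tends to preserve the visible characteristics of the cover.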