Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth has raised awareness of the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional text representation methods model documents as bags of words using term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents remain unresolved. In this study, these sparsity and semantic problems are reduced by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure. The semantics of the words are then modeled using a dependency graph, and the resulting graph is used in the cluster analysis. The K-means clustering technique was used in this study. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation. The findings show that the proposed text representation method leads to more accurate document clustering results.
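The bag-of-words limitation described above can be made concrete with a minimal sketch of TF-IDF weighting in plain Python. This is not the study's implementation: the function name `tfidf_vectors`, the whitespace tokenisation, the unsmoothed logarithmic IDF, and the example sentences are all assumptions chosen for illustration. It shows that two sentences with opposite meanings but the same words receive identical vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for non-empty, whitespace-tokenised docs.

    Hypothetical helper for illustration only; real pipelines usually add
    stop-word removal, stemming, and smoothed IDF.
    """
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        # Term frequency (relative count) times inverse document frequency.
        vec = [
            (counts[term] / len(toks)) * math.log(n_docs / df[term])
            for term in vocab
        ]
        vectors.append(vec)
    return vocab, vectors


docs = ["dog bites man", "man bites dog", "stocks fall sharply"]
vocab, vectors = tfidf_vectors(docs)

# Word order, and therefore who bites whom, is lost in this representation.
same_vector = vectors[0] == vectors[1]
print(same_vector)
```

A dependency-based representation would keep the subject and object relations that distinguish the first two sentences, which is the kind of information the abstract says the bag-of-words model discards.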
Abstract
The curriculum is a major and effective tool for achieving the goals of
education and society.
Many countries that want to reach the forefront of developed countries
through their curricula have realized this fact. Assessing a school textbook
is a way of determining how far the textbook succeeds or fails
in achieving its general aims. Therefore, this study aims to assess the
principles and techniques of the geography textbook for the fourth secondary class
(literary branch) from the teachers' point of view according to the fields of the
book, style of material, technical arrangement of the material, ethnical
arrangement the language of the book, style of the material, technical
arrang
Abstract
This research aims to study how accounting for contingent assets, contingent liabilities, and provisions is reflected in the Faithful Representation characteristic of accounting information. To achieve this goal, a questionnaire was designed and distributed to the research sample, which consists of (50) li
Let be a non-trivial simple graph. A dominating set in a graph is a set of vertices such that every vertex not in the set is adjacent to at least one vertex in the set. A subset is a minimum neighborhood dominating set if is a dominating set and if for every holds. The minimum cardinality of the minimum neighborhood dominating set of a graph is called the minimum neighborhood dominating number and is denoted by . A minimum neighborhood dominating set is a dominating set where the intersection of the neighborhoods of all vertices in the set is as small as possible (i.e., ). The minimum neighborhood dominating number, denoted by , is the minimum cardinality of a minimum neighborhood dominating set. In other words, it is the
In this paper, some commonly used clustering techniques are compared. The comparison covers the agglomerative hierarchical clustering technique and the K-means family, which includes the standard K-means technique, the variant K-means technique, and the bisecting K-means technique. Although hierarchical clustering is considered one of the best clustering methods, its use is limited by its time complexity. The results, which are based on an analysis of the characteristics of the clustering algorithms and the nature of the data, showed that the bisecting K-means technique performs best among the methods compared.
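As an illustration of the bisecting K-means idea named in this abstract, the following is a minimal sketch in plain Python, not the paper's implementation. It starts with all points in one cluster and repeatedly splits the cluster with the largest sum of squared errors (SSE) using 2-means. The deterministic farthest-point initialisation, the SSE-based choice of which cluster to split, the assumption of distinct input points, and all function names are assumptions made for this example.

```python
def _mean(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _sse(points):
    centre = _mean(points)
    return sum(_sq_dist(p, centre) for p in points)


def two_means(points, iterations=20):
    """Split points into two groups with plain K-means (k = 2).

    Assumed initialisation: the point farthest from the mean, and the
    point farthest from that one. Points are assumed to be distinct.
    """
    first = max(points, key=lambda p: _sq_dist(p, _mean(points)))
    second = max(points, key=lambda p: _sq_dist(p, first))
    centres = [first, second]
    groups = [list(points), []]
    for _ in range(iterations):
        new_groups = [[], []]
        for p in points:
            nearer = 0 if _sq_dist(p, centres[0]) <= _sq_dist(p, centres[1]) else 1
            new_groups[nearer].append(p)
        # Stop on an empty group or when the assignment no longer changes.
        if not new_groups[0] or not new_groups[1] or new_groups == groups:
            break
        groups = new_groups
        centres = [_mean(g) for g in groups]
    return groups


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break
        worst = max(splittable, key=lambda i: _sse(clusters[i]))
        target = clusters.pop(worst)
        clusters.extend(two_means(target))
    return clusters


# Three well-separated toy groups of 2-D points (made-up data).
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
clusters = bisecting_kmeans(points, k=3)
for cluster in clusters:
    print(sorted(cluster))
```

Each bisection only runs 2-means on one cluster, which avoids the pairwise merging cost of agglomerative clustering; this is consistent with the time-complexity concern the abstract raises, although the sketch itself makes no performance claim.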
With the widespread exchange of private information in various communication applications, securing it has become an urgent issue. In this research, a new approach to encrypting text messages based on genetic algorithm operators is proposed. The proposed approach follows a new algorithm that generates an 8-bit chromosome to encrypt plain text after randomly selecting a crossover point. The resulting child code is then flipped by one bit using the mutation operation. Two simulations are conducted to evaluate the performance of the proposed approach, covering the execution time of encryption/decryption and throughput computations. Simulation results prove the robustness of the proposed approach to produce better performance for all evaluation metrics with res
Doubts arise about the originality of a document when a change in its writing style is noticed. This evidence of plagiarism is the basis of the intrinsic approach to plagiarism detection, which uncovers plagiarized passages by analyzing the writing style of the suspicious document when no reference corpus is available for comparison. The proposed work aims to discover deviations in a document's writing style through several steps. First, the entire document is split into disjoint segments, each corresponding to a paragraph in the original document. For the entire document and for each segment, center vectors comprising the average weights of their words are constructed. Second, the degree of cl
The purpose of this research is to determine the effectiveness of implementing cooperative learning for the floral material in calligraphy and ornamentation. To achieve this aim, the researcher formulated two null hypotheses. In light of the findings of the present research, the researcher reached a number of conclusions, including:
1 - Sum strategy helps the learner to be positive in handling all the information, regulation, monitoring, and evaluation during the learning process.
2 - This strategy helps the learner to apply information and knowledge in various educational situations and to achieve better learning, increasing the learner's ability to develop thinking skills and positive attitudes towards the subject.
In light of this, the
The cartographic representation of geographical phenomena is considered an essential basis of geographical analysis. It supports the study of vertical housing, that is, apartment buildings made up of apartments, which represent an urban phenomenon. The aim of this study is to show the role of cartographic representation methods in geographical analysis and to compare the economic and social aspects of two approaches, the vertical and construction building methods.
The horizontal expansion of the city is a problem in itself because it leads to the loss of urban land and encroachment on agricultural land. As a result, the share of residential land use is very large in comparison with other land uses in cities; therefore, many co