Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth has raised awareness of the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods of text representation model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This method ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation scheme is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure. The semantics of words are then modeled using a dependency graph, and the resulting graph is used in cluster analysis with the K-means clustering technique. The dependency graph-based clustering results were compared with those of popular text representation methods, namely TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both TF-IDF and ontology-based text representation. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
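The limitation of the bag-of-words model described above can be illustrated with a minimal sketch. It assumes whitespace tokenization and the plain weighting tf × log(N / df); the function name `tfidf_vectors` and the three toy sentences are hypothetical and are not taken from the study. Two sentences with opposite meanings but the same words receive identical vectors, which is the kind of information a dependency graph is meant to retain.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


# Hypothetical toy corpus: the first two sentences differ only in word order.
docs = [
    "dog bites man",
    "man bites dog",
    "stocks fall sharply",
]
vecs = tfidf_vectors(docs)

# Word order is discarded, so the first two documents are indistinguishable.
same_representation = vecs[0] == vecs[1]
print(same_representation)
```

Any clustering algorithm fed these vectors would necessarily place the first two sentences in the same cluster, regardless of who bites whom.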
Intrusion detection systems (IDS) are useful tools that help security administrators in the developing task of securing the network and alert them to any potentially harmful event. An IDS can be classified as either misuse-based or anomaly-based, depending on the detection methodology. A misuse IDS recognizes known attacks based on their signatures; its main disadvantage is that it cannot detect new attacks. An anomaly IDS, by contrast, relies on a model of normal behaviour, and its main advantage is its ability to discover new attacks. Its main drawback, however, is a high false alarm rate. Therefore, a hybrid IDS, which combines misuse and anomaly detection, acts as a solution to overcome the dis
This study was conducted to determine the relationship between test anxiety and cognitive representation among university students. To this end, 152 students (male and female) were chosen randomly from scientific and social departments to fill out questionnaires on test anxiety and cognitive representation. The researcher used the independent-samples t-test, the Pearson product-moment correlation coefficient, Cronbach's alpha, and the t-test. The results revealed a weak negative correlation between test anxiety and cognitive representation among university students.
An automatic text summarization system mimics how humans summarize by picking the most significant sentences in a source text. However, the complexities of the Arabic language make it challenging to obtain information quickly and effectively. The main disadvantage of the traditional approaches is that they are strictly constrained (especially for the Arabic language) by the accuracy of sentence feature functions, weighting schemes, and similarity calculations. On the other hand, meta-heuristic search approaches have a feature tha
Scientific progress has led to a proliferation of information, which accumulates in large databases. It is therefore important to review and organize this vast amount of data in order to extract hidden information, or to classify data according to their relationships with one another, so that they can be exploited for technical purposes.
Data mining (DM) is well suited to this task. This research concerns the K-means algorithm for clustering data, whose effect in practice can be observed by changing the sample size (n) and the number of clusters (K)
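The effect of the sample size (n) and the number of clusters (K) on K-means can be explored with a small sketch. This is not the study's experiment: it assumes Lloyd's algorithm on synthetic 2-D data drawn from two Gaussian blobs, and the names `make_sample`, `kmeans` and `results` are hypothetical. It records the within-cluster sum of squared distances for each (n, K) pair.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids are k data points
    for _ in range(iters):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster keeps its centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


def make_sample(n, seed=0):
    """Assumed synthetic data: n points alternating between two Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    return [(rng.gauss(centers[i % 2][0], 1.0), rng.gauss(centers[i % 2][1], 1.0))
            for i in range(n)]


# Vary the sample size n and the number of clusters K.
results = {}
for n in (20, 200):
    sample = make_sample(n)
    for k in (1, 2, 3):
        results[(n, k)] = kmeans(sample, k)[2]

for (n, k), inertia in sorted(results.items()):
    print(f"n={n:4d}  K={k}  within-cluster SS={inertia:.2f}")
```

Because the result depends on the random initial centroids, a fuller experiment would repeat each (n, K) setting over several seeds before comparing values.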
In this thesis, we study topological structures in graph theory and various related results. Chapter one contains the fundamental concepts of topology and basic definitions of near open sets, gives an account of uncertainty and rough set theories, and introduces the concepts of graph theory. Chapter two deals with the main concepts of topological structures built using mixed degree systems in graph theory, namely the M-space. In addition, m-derived graphs, m-open graphs, m-closed graphs, m-interior operators, m-closure operators and M-subspaces are defined and studied. In chapter three we study supra-approximation spaces using mixed degree systems; the primary objects in this chapter are two topological
Drawing on the theoretical and practical aspects of the German language department and its educational programs, the present study discusses semantic relations between text sentences and their role in the science of translation. By clarifying the semantic relationships between text sentences and the methods used to express a news item, a situation or an occurrence, and by setting out the multiple theoretical semantic structures of a text's construction and interrelation, a translator can more easily translate a text into the target language.
It is known that language learners face multiple difficulties in writing and creating an inte
Let be a non-trivial simple graph. A dominating set in a graph is a set of vertices such that every vertex not in the set is adjacent to at least one vertex in the set. A subset is a minimum neighborhood dominating set if is a dominating set and if for every holds. The minimum cardinality of a minimum neighborhood dominating set of a graph is called the minimum neighborhood dominating number and is denoted by . A minimum neighborhood dominating set is a dominating set in which the intersection of the neighborhoods of all vertices in the set is as small as possible (i.e., ). The minimum neighborhood dominating number, denoted by , is the minimum cardinality of a minimum neighborhood dominating set. In other words, it is the
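The definition of a dominating set given above (every vertex outside the set is adjacent to at least one vertex inside it) can be checked directly on small graphs. The sketch below covers only that ordinary domination condition; the additional neighborhood condition is not fully legible in the text, so it is deliberately left out. The adjacency-dictionary representation, the function names and the five-vertex path are assumptions made for illustration.

```python
from itertools import combinations


def is_dominating(adj, subset):
    """True if every vertex outside `subset` has at least one neighbour in it."""
    chosen = set(subset)
    return all(v in chosen or adj[v] & chosen for v in adj)


def smallest_dominating_set(adj):
    """Return a dominating set of minimum size by exhaustive search.

    The search is exponential in the number of vertices, so it is only
    suitable for small graphs.
    """
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_dominating(adj, candidate):
                return set(candidate)
    return set(vertices)


# Hypothetical example graph: the path a - b - c - d - e.
path = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

best = smallest_dominating_set(path)
print(len(best), is_dominating(path, best))
```

On this path no single vertex dominates the graph, since each vertex covers at most itself and two neighbours, so the smallest dominating set has two vertices.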