Advances in digital technology and the World Wide Web has led to the increase of digital documents that are used for various purposes such as publishing and digital library. This phenomenon raises awareness for the requirement of effective techniques that can help during the search and retrieval of text. One of the most needed tasks is clustering, which categorizes documents automatically into meaningful groups. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends tightly on the selection of the text representation method. Traditional methods of text representation model documents as bags of words using term-frequency index document frequency (TFIDF). This method ignores the relationship and meanings of words in the document. As a result the sparsity and semantic problem that is prevalent in textual document are not resolved. In this study, the problem of sparsity and semantic is reduced by proposing a graph based text representation method, namely dependency graph with the aim of improving the accuracy of document clustering. The dependency graph representation scheme is created through an accumulation of syntactic and semantic analysis. A sample of 20 news groups, dataset was used in this study. The text documents undergo pre-processing and syntactic parsing in order to identify the sentence structure. Then the semantic of words are modeled using dependency graph. The produced dependency graph is then used in the process of cluster analysis. K-means clustering technique was used in this study. The dependency graph based clustering result were compared with the popular text representation method, i.e. TFIDF and Ontology based text representation. The result shows that the dependency graph outperforms both TFIDF and Ontology based text representation. The findings proved that the proposed text representation method leads to more accurate document clustering results.
In this work, a deep computational study has been conducted to assign several qualities for the graph . Furthermore, determine the amount of the dihedral subgroups in the Held simple group He through utilizing the attributes of gamma.
Steganography can be defined as the art and science of hiding information in the data that could be read by computer. This science cannot recognize stego-cover and the original one whether by eye or by computer when seeing the statistical samples. This paper presents a new method to hide text in text characters. The systematic method uses the structure of invisible character to hide and extract secret texts. The creation of secret message comprises four main stages such using the letter from the original message, selecting the suitable cover text, dividing the cover text into blocks, hiding the secret text using the invisible character and comparing the cover-text and stego-object. This study uses an invisible character (white space
... Show MoreIn this thesis, we study the topological structure in graph theory and various related results. Chapter one, contains fundamental concept of topology and basic definitions about near open sets and give an account of uncertainty rough sets theories also, we introduce the concepts of graph theory. Chapter two, deals with main concepts concerning topological structures using mixed degree systems in graph theory, which is M-space by using the mixed degree systems. In addition, the m-derived graphs, m-open graphs, m-closed graphs, m-interior operators, m-closure operators and M-subspace are defined and studied. In chapter three we study supra-approximation spaces using mixed degree systems and primary object in this chapter are two topological
... Show MoreThe data communication has been growing in present day. Therefore, the data encryption became very essential in secured data transmission and storage and protecting data contents from intruder and unauthorized persons. In this paper, a fast technique for text encryption depending on genetic algorithm is presented. The encryption approach is achieved by the genetic operators Crossover and mutation. The encryption proposal technique based on dividing the plain text characters into pairs, and applying the crossover operation between them, followed by the mutation operation to get the encrypted text. The experimental results show that the proposal provides an important improvement in encryption rate with comparatively high-speed Process
... Show MoreThis research attempted to take advantage of modern techniques in the study of the superstructural phonetic features of spoken text in language using phonetic programs to achieve more accurate and objective results, far from being limited to self-perception and personal judgment, which varies from person to person.
It should be noted that these phonological features (Nabr, waqf, toning) are performance controls that determine the fate of the meaning of the word or sentence, but in the modern era has received little attention and attention, and that little attention to some of them came to study issues related to the composition or style Therefore, we recommend that more attention should be given to the study of
Building a system to identify individuals through their speech recording can find its application in diverse areas, such as telephone shopping, voice mail and security control. However, building such systems is a tricky task because of the vast range of differences in the human voice. Thus, selecting strong features becomes very crucial for the recognition system. Therefore, a speaker recognition system based on new spin-image descriptors (SISR) is proposed in this paper. In the proposed system, circular windows (spins) are extracted from the frequency domain of the spectrogram image of the sound, and then a run length matrix is built for each spin, to work as a base for feature extraction tasks. Five different descriptors are generated fro
... Show MoreLet be a non-trivial simple graph. A dominating set in a graph is a set of vertices such that every vertex not in the set is adjacent to at least one vertex in the set. A subset is a minimum neighborhood dominating set if is a dominating set and if for every holds. The minimum cardinality of the minimum neighborhood dominating set of a graph is called as minimum neighborhood dominating number and it is denoted by . A minimum neighborhood dominating set is a dominating set where the intersection of the neighborhoods of all vertices in the set is as small as possible, (i.e., ). The minimum neighborhood dominating number, denoted by , is the minimum cardinality of a minimum neighborhood dominating set. In other words, it is the
... Show MoreDoubts arise about the originality of a document when noticing a change in its writing style. This evidence to plagiarism has made the intrinsic approach for detecting plagiarism uncover the plagiarized passages through the analysis of the writing style for the suspicious document where a reference corpus to compare with is absent. The proposed work aims at discovering the deviations in document writing style through applying several steps: Firstly, the entire document is segmented into disjointed segments wherein each corresponds to a paragraph in the original document. For the entire document and for each segment, center vectors comprising average weight of their word are constructed. Second, the degree of cl
... Show More