Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth creates a need for effective techniques that support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of the words are then modeled using a dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of two popular text representation methods, TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both. The findings indicate that the proposed text representation method leads to more accurate document clustering.
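As a rough illustration of why a bag-of-words model loses the relationships between words, the sketch below compares TF-IDF-weighted cosine similarity over plain tokens with the same measure over dependency edges. It is not the authors' implementation: the (head, relation, dependent) triples are hand-written stand-ins for parser output, the smoothed IDF formula is one common variant, and the names `tfidf_vectors`, `cosine`, and `docs` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(feature_lists):
    """Weight each feature by term frequency times a smoothed inverse document frequency."""
    n_docs = len(feature_lists)
    doc_freq = Counter()
    for features in feature_lists:
        doc_freq.update(set(features))  # count each feature once per document
    vectors = []
    for features in feature_lists:
        counts = Counter(features)
        total = len(features)
        vectors.append({
            feat: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[feat])) + 1.0)
            for feat, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(feat, 0.0) for feat, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Assumption: toy documents with hand-written dependency triples
# (head, relation, dependent) standing in for real parser output.
docs = [
    {"tokens": ["dog", "bites", "man"],
     "edges": [("bites", "nsubj", "dog"), ("bites", "obj", "man")]},
    {"tokens": ["man", "bites", "dog"],
     "edges": [("bites", "nsubj", "man"), ("bites", "obj", "dog")]},
    {"tokens": ["court", "rules", "case"],
     "edges": [("rules", "nsubj", "court"), ("rules", "obj", "case")]},
]

bow_vecs = tfidf_vectors([d["tokens"] for d in docs])
edge_vecs = tfidf_vectors([d["edges"] for d in docs])

# Same words, opposite meaning: the token vectors coincide, the edge vectors do not.
bow_sim = cosine(bow_vecs[0], bow_vecs[1])
edge_sim = cosine(edge_vecs[0], edge_vecs[1])
print("bag-of-words similarity:", round(bow_sim, 3))
print("dependency-edge similarity:", round(edge_sim, 3))
```

In this toy case the first two documents are indistinguishable as bags of words but share no dependency edges. Under the same assumptions, the edge-weighted vectors could be handed to a K-means routine in place of the token vectors.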

Publication Date

Wed Aug 30 2023

Journal Name

Mathematical Modelling Of Engineering Problems

Dynamic Low Power Clustering Strategy in MWSN

Iman Ameer

Muna Mohammed Jawad

Ali M.

Publication Date

Sat Dec 31 2022

Journal Name

Wasit Journal Of Computer And Mathematics Science

An Improved Method for Hiding Text in Image Using Header Image

Nada abdul aziz

The need for steganography methods that hide secret messages in images has grown. This study therefore develops a practical steganography procedure for hiding text in an image. The user provides the system with both the text and a cover image and obtains a resulting image that contains the hidden text. The suggested technique hides the text inside the header formats of a digital image. The Least Significant Bit (LSB) method is used to hide the message or text in order to preserve the features and characteristics of the original image. A new method is applied that uses the whole image (header formats) to hide the image. From the experimental results, suggested technique that gives a higher embe

Publication Date

Wed May 03 2017

Journal Name

Arab Conferences Network / American Research Foundation

Understanding the nature of science among chemistry teachers according to the AAAS document for the Educational Reform Project 2061

Basama

Publication Date

Sat Sep 30 2023

Journal Name

Journal Of The College Of Education For Women

The Prophetic Speeches (Hadith) on Sciences and Scientists: Application of the "Text from Text and D+" Theory

Discourse analysis

Prophetic Hadith

Symmetry line

Text-from-text and D Theory.

Aiman Eid Al-Rawajfeh

Basem Jawabreh

Manal Ahmad

This study aims to apply the theory of "Text from Text and the Plus Dimension" in the analysis of the Prophetic discourse found in the section on the virtues of knowledge and scholars from Imam Sahih al-Bukhari's book. This section covers several topics, including the virtue of gathering for the sake of learning, the superiority of a scholar over a worshipper, the excellence of jurisprudence in the religion of Allah, the acquisition of knowledge through the passing away of scholars, the merit of inviting people to Allah, the continuing benefit of beneficial knowledge after a scholar's demise, the warning against seeking knowledge for purposes other than Allah, and the Prophet seeking refuge from knowledge tha

Publication Date

Wed Jun 11 2025

Journal Name

Iraqi Journal For Computer Science And Mathematics

Topological Indices for the Resize Graph of G2(3)

Manar Musab

Ali Abd

Topological indices play a crucial role in mathematical chemistry and network theory, providing valuable insights into the structural properties of graphs. In this study, we investigate the Resize graph of G2(3), a significant algebraic structure arising from the exceptional Lie group (G2) over the finite field F3. We compute several well-known topological indices, including the Zagreb indices, Wiener index, and Randić index, to analyze the graph's connectivity and complexity. Our results reveal intricate relationships between the algebraic structure of G2(3) and its graphical properties, offering a deeper understanding of its combinatorial and spectral characteristics. These findings contribute to the broader study of algebraic graph t

Publication Date

Tue Jan 01 2019

Journal Name

The International Journal Of Literary Humanities

The Stereotypical Representation of Black Women in Caryl Phillips’ "Cambridge"

Azhar Noori

Publication Date

Thu Mar 31 2022

Journal Name

Journal Of The College Of Education For Women

A Pragmatic Study of Identity Representation in American Political Speeches

American politics

identity

political speeches

pragmatic strategies

Baidaa Hasan

Wafaa Sahib

Identity is an influential and flexible concept in the social sciences and political studies. The basic sense of identity is the search for uniqueness. In one sense, it is a sign of identification with those we assume are similar to us, at least in some significant ways. Globalization, migration, modern technologies, media, and political conflicts are argued to have a crucial effect on identity representation in political terms, specifically in the United States of America. This paper investigates how American politicians represent their identities, from a pragmatic perspective, in speeches delivered between 2015 and 2018. Three randomly selected speeches by fa

Publication Date

Sun Dec 02 2018

Journal Name

Journal Of The College Of Education For Women

Structuralism and the Problem of Text

Structuralism

Text

Critical approaches

Structural approach.

Hayder Fadhil

All modern critical approaches attempt to cover the meanings and overtones of the text, each claiming to be better than the others at analyzing the text and reaching its intended meanings. The structural approach claims to do so more than any other modern critical approach, as it claims that what is read can be separated from the reader, on the presumed belief that it is possible to read the text with a zero-memory. However, studies in the criticism of criticism state that each of these approaches succeeds in dealing with the text in one or more aspects while failing in others. Consequently, the criticism whether the approach possesses the text, or that the text rejects this possession, r

Publication Date

Wed Aug 27 2025

Journal Name

Baghdad Science Journal

A Clustering Technique Based on the Hard K-Means (H.KM.) Method to Determine the Governorate That Have More Influence for Spreading COVID-19 in the Kingdom of Saudi Arabia

Rand Muhaned

Wurood R. Abd

Iden Hassan

Publication Date

Tue Mar 19 2013

Journal Name

Journal Of Semitic Studies

An Aramaic Incantation Text

Bahaa
