Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth creates a need for effective techniques that support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of two popular text representation methods, TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
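The abstract's point that a bag-of-words model ignores relationships between words can be illustrated with a small sketch. This is not the authors' pipeline: the three toy documents, the hand-written dependency triples (standing in for the output of a syntactic parser), and the helper names `tfidf_vectors` and `cosine` are all assumptions made for illustration. It shows only the word-relationship issue and does not reproduce the paper's treatment of sparsity or its K-means experiments.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn a list of feature lists into sparse TF-IDF dicts.

    Features can be any hashable items: plain words, or dependency
    triples such as (head, relation, dependent).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each feature.
    df = Counter(feat for doc in docs for feat in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            feat: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[feat])) + 1)  # smoothed IDF
            for feat, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(feat, 0.0) for feat, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents (assumed): the first two use the same words with
# opposite meanings.
word_docs = [
    ["dog", "bites", "man"],
    ["man", "bites", "dog"],
    ["stocks", "fall", "sharply"],
]

# Hand-written dependency triples (assumed parser output).
edge_docs = [
    [("bites", "nsubj", "dog"), ("bites", "obj", "man")],
    [("bites", "nsubj", "man"), ("bites", "obj", "dog")],
    [("fall", "nsubj", "stocks"), ("fall", "advmod", "sharply")],
]

bow = tfidf_vectors(word_docs)
dep = tfidf_vectors(edge_docs)

bow_sim = cosine(bow[0], bow[1])   # word order is invisible to bag of words
edge_sim = cosine(dep[0], dep[1])  # the two documents share no dependency edge

print("bag-of-words similarity:", round(bow_sim, 6))
print("dependency-edge similarity:", round(edge_sim, 6))
```

Under these assumptions the two sentences are indistinguishable as bags of words but share no features once each word is tied to its grammatical role, which is the kind of relational information a dependency graph is meant to capture.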

Publication Date

Fri Jan 01 2010

Journal Name

Thesis

Design and Implementation proposed Encoding and Hiding Text in an Image

Cryptography

RSA

Steganography

Digital signature

Nada Abdul Aziz

NAA Mustafa, University of Sulaimani, M.Sc. Thesis, 2010 - Cited by 4

Publication Date

Sat Sep 06 2025

Journal Name

Mesopotamian Journal Of Cybersecurity

Multilevel Text Protection System Using AES and DWT-DCT-SVD Techniques

AES-GCM DWT DCT SVD Zigzag

Nada

In the digital age, protecting intellectual property and sensitive information against unauthorized access is of paramount importance. While encryption helps keep data private and steganography hides the fact that data are present, using both together makes the security much stronger. This paper introduces a new way to hide encrypted text inside color images by integrating discrete wavelet transform (DWT), discrete cosine transform (DCT), and singular value decomposition (SVD), along with AES-GCM encryption, to guarantee data integrity and authenticity. The proposed method operates in the YCbCr color space, targeting the luminance (Y) channel to preserve perceptual quality. Embedding is performed within the HL subband obtained from DWT deco

Publication Date

Tue Jun 16 2026

Journal Name

Journal Of The College Of Education

A New Text from Ur III dynasty on Ba’aga, the fattener

Ur III dynasty

Sumerian texts

Ba’aga

Iri-sag̃rig

Iraq museum

وفاء هادي

This text is one of the confiscated cuneiform texts held by the Iraq Museum. It bears the museum number (235869) and measures (12.7 x 6 x 2.5 cm). It records receipts of quantities of barley. The text is dated to the Ur III period (2012-2004 BC) and belongs to the third year of the reign of King Ibbi-Sin (2028-2004 BC). The main figure in this text is Ba’aga, the animal fattener, from the city of Iri-sag̃rig. The text is compared with the published cuneiform texts belonging to his archive, which number (196) texts and record his activities

Publication Date

Wed Dec 14 2022

Journal Name

Nasaq Journal

The Effect of Co-text on the Comprehensibility of World Englishes

Majid Rasim

MR Younus, Nasaq Journal, 2022

Publication Date

Mon Aug 01 2016

Journal Name

IEEE Transactions on Neural Systems and Rehabilitation Engineering

Transradial Amputee Gesture Classification Using an Optimal Number of sEMG Sensors: An Approach Using ICA Clustering

Ganesh R.

Ali H.

Hung T.

Publication Date

Sun Feb 01 2015

Journal Name

The European Physical Journal A

Analytic view at alpha clustering in even-even heavy nuclei near magic numbers 82 and 126

Saad M. Saleh

Redzuwan

Shahidan

Muhamad Samudi

Hasan Abu

Mayeen Uddin

Publication Date

Fri Jan 01 2016

Journal Name

Machine Learning And Data Mining In Pattern Recognition

A New Strategy for Case-Based Reasoning Retrieval Using Classification Based on Association

Ahmed

Publication Date

Sun Apr 17 2016

Journal Name

Al-academy

The optical text in the cinema film Between the presence and absence Play for free mark: حسام الدين محمد عبد المنعم

HussamulDeen

Life in the universe depends on light, and from it humans derived concepts and meanings such as the fear of darkness and evil and the comfort and goodness of light. This duality has stayed with humankind to this day across many aspects of life. Light was therefore reflected as an aesthetic form in the visual arts, such as fine art and photography, up to the emergence of cinema. The use of lighting in cinema has produced great energy in composing its expressive and symbolic values, and darkness and light became the visual space in which many elements move within the structure of the artwork. The research is divided into five chapters, the first chapter (the methodological framework) that included an introduction the

Publication Date

Tue Jun 16 2026

Journal Name

Journal Of Language Studies

‘Opening the Box of Suffering, Unleashing the Evils of the World’: Pandora and her Representation in Nineteenth-Century American Poetry

Zaid

Sabah

Publication Date

Tue Jun 30 2015

Journal Name

International Journal Of Computer Techniques

Multifractal-Based Features for Medical Images Classification

Saad

Loay E.

Raid Kamil

This paper presents a method to classify colored textural images of skin tissues. Since medical images are highly heterogeneous, developing a reliable skin-cancer detection process is difficult, and a monofractal dimension is not sufficient to classify images of this nature. Multifractal-based feature vectors are suggested here as an alternative and more effective tool. At the same time, multiple color channels are used to obtain more descriptive features. Two multifractal-based sets of features are suggested. The first set measures the local roughness property, while the second set measures the local contrast property. A combination of all the extracted features from the three color models gives the highest classification accuracy with 99.4
