Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically categorizes documents into meaningful groups. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends heavily on the choice of text representation method. Traditional text representation methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents are not resolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation scheme is built through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure. The semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis, with K-means as the clustering technique. The dependency graph based clustering results were compared with those of popular text representation methods, i.e. TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation. The findings show that the proposed text representation method leads to more accurate document clustering results.
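The claim that a bag-of-words model ignores relationships between words can be illustrated with a small sketch. It is not the authors' implementation. In particular, it assumes a crude stand-in for dependency parsing: directed edges between adjacent tokens, where a real system would use head-dependent edges from a syntactic parser. The function names (`tokenize`, `bag_of_words`, `edge_features`, `cosine`) and the two example sentences are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def bag_of_words(text):
    """Term counts only; word order and relations are discarded."""
    return Counter(tokenize(text))


def edge_features(text):
    """ASSUMPTION: adjacent-token edges stand in for dependency edges.

    A real dependency graph would come from a syntactic parser; this
    simplification only keeps the idea that features are directed
    word-to-word relations rather than isolated words.
    """
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Two hypothetical sentences with the same words but opposite meaning.
doc_a = "Dog bites man."
doc_b = "Man bites dog."

bow_similarity = cosine(bag_of_words(doc_a), bag_of_words(doc_b))
edge_similarity = cosine(edge_features(doc_a), edge_features(doc_b))

print(f"bag-of-words cosine: {bow_similarity:.2f}")
print(f"edge-based cosine:   {edge_similarity:.2f}")
```

Under these assumptions the two sentences look identical to the bag-of-words view, since they contain the same terms, while the edge view finds no shared relation. Feature vectors of either kind could then be passed to a clustering algorithm such as K-means.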

Publication Date

Wed Jul 01 2020

Journal Name

2020 42nd Annual International Conference Of The IEEE Engineering In Medicine & Biology Society (EMBC)

Recurrent Fusion of Time-Domain Descriptors Improves EMG-based Hand Movement Recognition

Ahmed A.

Rami N.

Ali H.

Adel

Publication Date

Wed Sep 07 2022

Journal Name

2022 Iraqi International Conference On Communication And Information Technologies (IICCIT)

Construct an Efficient DDoS Attack Detection System Based on RF-C4.5-GridSearchCV

Dhurgham

Amer

Publication Date

Sat Oct 01 2022

Journal Name

Baghdad Science Journal

Human Face Recognition Based on Local Ternary Pattern and Singular Value Decomposition

Face Recognition

Image Processing

Local Ternary Pattern

Neural Network

Singular Values Decomposition

Ali Nadhim

Rozaida

Nidhal K.

Hussein Ali Hussein

Various human biometrics are in use nowadays, and one of the most important is the face. Many techniques have been suggested for face recognition, but they still face a variety of challenges in recognizing faces in images captured in uncontrolled environments and in real-life applications. Some of these challenges are pose variation, occlusion, facial expression, illumination, bad lighting, and image quality. New techniques are continuously being developed. In this paper, singular value decomposition is used to extract the feature matrix for face recognition and classification. The input color image is converted into a grayscale image and then transformed into a local ternary pattern before splitting the image into

Publication Date

Mon Aug 01 2022

Journal Name

Journal Of Molecular Liquids

Study to amino acid-based inhibitors as an effective anti-corrosion material

Nassar M.

Haider

The inhibitory behavior of L-Cysteine (Cys) and its derivatives towards iron corrosion was investigated through density functional theory (DFT). The study undertakes a rigorous evaluation of global as well as local reactivity descriptors of Cys in its protonated and neutral forms, and of the changes in reactivity after the combination of Cys into di- and tri-peptides. The inhibitory effect of di- and tri-peptides increases because the number of reaction centers in the molecular structure increases. We computed the adsorption energies (Eads) and the most stable low-energy complexes for the adsorption of small peptides and Cys amino acids onto Fe (1 1 1) surfaces. We found that the adsorption of tri-peptides onto

Publication Date

Sun Dec 31 2023

Journal Name

Sumer Journal For Pure Science

COVID-19 Disease Diagnosis using Artificial Intelligence based on Gene Expression: A Review

Qusay

Sanaa

Ekhlas

Wasan A.Wahab

Publication Date

Thu May 18 2023

Journal Name

Journal Of Engineering

A Modified Strength Pareto Evolutionary Algorithm 2 based Environmental /Economic Power Dispatch

Genetic algorithm

multi-objectives optimization

power generation dispatch

power generation economic

pareto distributions.

Hassan Abdullah

Saif Sabah

A Strength Pareto Evolutionary Algorithm 2 (SPEA 2) approach for solving the multi-objective Environmental / Economic Power Dispatch (EEPD) problem is presented in this paper. In the past, minimizing fuel cost was the sole aim (a single objective function) of the economic power dispatch problem. Since the Clean Air Act amendments were applied to reduce SO2 and NOX emissions from power plants, utilities have changed their strategies to reduce pollution and atmospheric emissions as well. Adding emission minimization as another objective function made economic power dispatch (EPD) a multi-objective problem with conflicting objectives. SPEA2 is the improved version of SPEA with better fitness assignment, density estimation, an

Publication Date

Sun Dec 31 2023

Journal Name

Iraqi Journal Of Information And Communication Technology

EEG Signal Classification Based on Orthogonal Polynomials, Sparse Filter and SVM Classifier

EEG

Orthogonal Polynomials

Sparse Filter

SVM

Tchebichef

Krawtchouk

Hayder S. Radeaf

Mohammed Z.

This work implements an Electroencephalogram (EEG) signal classifier. The implemented method uses Orthogonal Polynomials (OP) to convert the EEG signal samples to moments. A Sparse Filter (SF) reduces the number of converted moments to increase the classification accuracy. A Support Vector Machine (SVM) is used to classify the reduced moments into two classes. The proposed method's performance is tested on two datasets and compared with two other methods. The datasets are divided into 80% for training and 20% for testing, with 5-fold cross-validation. The results show that this method exceeds the accuracy of the other methods. The proposed method's best accuracy is 95.6% and 99.5% on the two datasets, respectively. Finally, from the results, it

Publication Date

Tue Dec 01 2009

Journal Name

Journal Of Lightwave Technology

A Random Number Generator Based on Single-Photon Avalanche Photodiode Dark Counts

Dark counts

quantum cryptography

random number.

Shelan

Publication Date

Fri Apr 30 2021

Journal Name

International Journal Of Intelligent Engineering And Systems

SMS Spam Detection Based on Fuzzy Rules and Binary Particle Swarm Optimization

Sarab

Publication Date

Thu Oct 01 2020

Journal Name

Engineering Science And Technology, An International Journal

Thermal performance improvement based on the hybrid design of a heat sink

jasim H h
