Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques that support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional text representation methods model documents as bags of words using term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure. The semantics of the words are then modeled using a dependency graph, and the resulting graph is used in cluster analysis. The K-means clustering technique was used in this study. The dependency-graph-based clustering results were compared with popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
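As a rough illustration of the bag-of-words baseline the abstract compares against, the sketch below builds TFIDF vectors and clusters them with K-means. It is not the authors' implementation and does not build a dependency graph. The function names (`tfidf_vectors`, `squared_distance`, `kmeans`), the toy sentences, the whitespace tokenizer, the `tf * log(N / df)` weighting variant, and the fixed seed documents are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def kmeans(vectors, seed_indices, max_iter=20):
    """Plain k-means; centroids start at the documents in seed_indices."""
    centroids = [dict(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            total = defaultdict(float)
            for v in members:
                for term, weight in v.items():
                    total[term] += weight
            centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels


# Toy corpus (hypothetical): two animal sentences, two market sentences.
docs = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as markets closed",
    "markets rallied as stocks rose",
]
labels = kmeans(tfidf_vectors(docs), seed_indices=[0, 2])
print(labels)

# Word order is invisible to a bag of words: these two sentences
# receive identical vectors although their meanings differ.
a, b, _ = tfidf_vectors(["dog bites man", "man bites dog", "stocks fell"])
same_vector = a == b
print(same_vector)
```

The final comparison shows the limitation the abstract points to: "dog bites man" and "man bites dog" map to the same vector, because only term counts are kept. A representation built from dependency relations between words would be able to tell them apart.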

Publication Date

Sat Oct 01 2022

Journal Name

Therapeutic Delivery

Particles-based Medicated Wound Dressings: A Comprehensive Review

Kawther K

Amaraporn

Publication Date

Sun Jun 01 2014

Journal Name

Baghdad Science Journal

Classification of fetal abnormalities based on CTG signal

fetal heart rate monitoring

heart rate analysis by neural network

fuzzy classification

FHR wavelet transform.

Safa'a S.

Israa R.

Fetal heart rate (FHR) signal processing based on Artificial Neural Networks (ANN), Fuzzy Logic (FL), and the frequency-domain Discrete Wavelet Transform (DWT) was analyzed in order to perform automatic analysis using personal computers. Cardiotocography (CTG) is a primary biophysical method of fetal monitoring. The assessment of printed CTG traces was based on visual analysis of patterns that describe the variability of the fetal heart rate signal. Fetal heart rate data from pregnant women between 38 and 40 weeks of gestation were studied. The first stage in the system was to convert the cardiotocography (CTG) tracing into a digital series so that it can be analyzed, while in the second stage, the FHR time series was t

Publication Date

Fri Mar 01 2019

Journal Name

Al-khwarizmi Engineering Journal

COMPUTER-BASED ECG SIGNAL ANALYSIS AND MONITORING SYSTEM

Hadeel Kassim

Nasser N.

This paper deals with the design and implementation of an ECG system. The proposed system introduces a new concept for manipulating, storing, and editing ECG signals. It consists mainly of hardware circuits and the related software. The hardware includes the ECG signal capture circuits and the system interfaces. The software is written in Visual Basic and performs the task of identifying the ECG signal. The main advantage of the system is that it provides a reported ECG recording on a personal computer, so that it can be stored and processed at any time as required. The system was tested on different ECG signals, some abnormal and others normal, and the results show that the system has a good quality of diagno

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Engineering

GNSS Baseline Configuration Based on First Order Design

configuration baselines

FOD

GNSS network

A-optimality

E-optimality

Oday Yaseen

Muayed Yaseen

Zahraa Azeldeen

The quality of Global Navigation Satellite Systems (GNSS) networks is considerably influenced by the configuration of the observed baselines. This study aims to find an optimal configuration for GNSS baselines, in terms of the number and distribution of baselines, to improve the quality criteria of GNSS networks. The first order design (FOD) problem was applied in this research to optimize the GNSS network baseline configuration, with a sequential adjustment method used to solve its objective functions.

FOD for optimum precision (FOD-p) was the proposed model, which is based on the design criteria of A-optimality and E-optimality. These design criteria were selected as objective functions of precision, whic

Publication Date

Wed Sep 01 2021

Journal Name

Baghdad Science Journal

Optimum Median Filter Based on Crow Optimization Algorithm

Image processing

Impulse noise

Noise removal

Optimum median filter

Crow optimization algorithm.

Basma Jumaa

Ahmed Yousif Falih

Ali Talib Qasim

Lamees abdalhasan

A novel optimum median filter (OMF) based on the crow optimization algorithm is suggested to reduce random salt-and-pepper noise and improve the quality of RGB color and grayscale images. The fundamental idea of the approach is that the crow optimization algorithm first detects noise pixels and then replaces them with an optimum median value, chosen by maximizing a fitness function. Finally, the standard measures of peak signal-to-noise ratio (PSNR), structural similarity, absolute square error, and mean square error have been used to test the performance of the suggested filters (original and improved median filter) used to remove noise from images. The simulation is carried out in MATLAB R2019b and the resul

Publication Date

Thu Aug 01 2019

Journal Name

2019 2nd International Conference On Engineering Technology And Its Applications (IICETA)

Human Gait Identification System Based on Average Silhouette

Mohanad Hazim Nsaif

Nawaf Hazim

Sinan Sameer Mahmood

Publication Date

Mon Mar 01 2021

Journal Name

IOP Conference Series: Materials Science And Engineering

Speech Enhancement Algorithm Based on a Hybrid Estimator

Basheera M.

Sadiq H.

Marwah A.

Muntadher

Jamila

Speech is the essential means of interaction between humans or between humans and machines. However, it is always contaminated with different types of environmental noise. Therefore, speech enhancement algorithms (SEA) have emerged as a significant approach in the speech processing field to suppress background noise and recover the original speech signal. In this paper, a new efficient two-stage SEA with low distortion is proposed based on the minimum mean square error sense. The clean signal is estimated by taking advantage of Laplacian speech and noise modeling based on the distribution of orthogonal transform (Discrete Krawtchouk-Tchebichef transform) coefficients. The Discrete Kra

Publication Date

Thu Dec 01 2022

Journal Name

Al-khwarizmi Engineering Journal

BCI-Based Smart Room Control using EEG Signals

Oger Zaya

Yarub

In this paper, we implement and examine a Simulink model that uses electroencephalography (EEG) to control multiple actuators based on brain waves. Such a system is expected to be in great demand, since it is useful for individuals who are unable to operate control units that require direct physical contact. Initially, ten volunteers across a wide age range (20-66) participated in this study, and statistical measurements were first calculated for all eight channels. The number of channels was then reduced by half according to the activation of brain regions within the utilized protocol, which also decreased the processing time. Consequently, four of the participants (three males and one female) were chosen to examine the Simulink model duri

Publication Date

Mon May 15 2017

Journal Name

International Journal Of Image And Data Fusion

Image edge detection operators based on orthogonal polynomials

Sadiq H.

Abd. Rahman

Basheera M.

S.A.R.

Wissam A.

Publication Date

Tue Jan 01 2013

Journal Name

International Journal Of Computer Applications

Content-based Image Retrieval (CBIR) using Hybrid Technique

CBIR

feature extraction

properties

color histogram

GLCM

hybrid

similarity measure

Zainab

Israa

Nabeel

Image retrieval is used to search for images in an image database. In this paper, content-based image retrieval (CBIR) using four feature extraction techniques has been achieved. The four techniques are the color histogram features technique, the properties features technique, the gray level co-occurrence matrix (GLCM) statistical features technique, and a hybrid technique. The features are extracted from the database images and the query (test) images in order to compute the similarity measure. Similarity-based matching is very important in CBIR, so three types of similarity measure are used: normalized Mahalanobis distance, Euclidean distance, and Manhattan distance. A comparison between them has been implemented. From the results, it is conclud
