Advances in digital technology and the World Wide Web have led to a proliferation of digital documents used for purposes such as publishing and digital libraries. This phenomenon has raised the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically categorizes documents into meaningful groups. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods represent documents as bags of words using term frequency-inverse document frequency (TFIDF). This approach ignores the relationships and meanings of words in a document, so the sparsity and semantic problems prevalent in textual data remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is constructed through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in the cluster analysis, performed here with the k-means clustering technique. The dependency-graph-based clustering results were compared with two popular text representation methods, TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and the ontology-based representation, indicating that the proposed text representation method leads to more accurate document clustering results.
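The paper's dependency-graph construction is not reproduced here, but the TFIDF + k-means baseline it compares against can be sketched briefly. The snippet below is a minimal illustration using scikit-learn's 20 Newsgroups loader; the category subset, feature cap, and evaluation metric (adjusted Rand index) are choices of this sketch, not the paper's protocol.

```python
# Minimal sketch of the TFIDF + k-means baseline the paper compares against
# (the proposed dependency-graph representation itself is not reproduced here).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Load a small subset of 20 Newsgroups to keep the example fast.
cats = ["sci.space", "rec.sport.hockey", "talk.politics.misc"]
data = fetch_20newsgroups(subset="train", categories=cats,
                          remove=("headers", "footers", "quotes"))

# Bag-of-words TFIDF representation (the baseline criticized in the abstract).
X = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(data.data)

# Cluster with k-means, one cluster per category.
km = KMeans(n_clusters=len(cats), n_init=10, random_state=0).fit(X)

# Compare clusters against the true labels as one possible accuracy proxy.
print("Adjusted Rand index:", adjusted_rand_score(data.target, km.labels_))
```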
In this paper, Bayes estimators of the parameter of the Maxwell distribution have been derived along with the maximum likelihood estimator. Two non-informative priors, the Jeffreys prior and an extension of the Jeffreys prior, have been considered under two different loss functions, the squared error loss function and the modified squared error loss function, for comparison purposes. A simulation study was developed to gain insight into estimator performance on small, moderate, and large samples. The performance of these estimators has been explored numerically under different conditions, and their efficiency was compared according to the mean squared error (MSE). The results of the comparison by MSE show that the efficiency of B
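As a rough companion to the simulation study, the sketch below compares the MSE of the maximum likelihood estimator and a Jeffreys-prior Bayes estimator under squared error loss. It assumes the common parameterization theta = a^2 with likelihood proportional to theta^(-3n/2) exp(-sum(x^2)/(2 theta)), under which the MLE is sum(x^2)/(3n) and the posterior mean under the Jeffreys prior 1/theta is sum(x^2)/(3n-2); the paper's exact parameterization and extended prior may differ.

```python
# Monte Carlo sketch comparing the MLE and a Jeffreys-prior Bayes estimator
# of theta = a^2 for the Maxwell distribution under squared error loss.
# Assumed forms (not taken from the paper):
#   likelihood ∝ theta^(-3n/2) exp(-sum(x^2)/(2 theta))
#   MLE:   theta_hat = sum(x^2) / (3n)
#   Bayes: posterior mean = sum(x^2) / (3n - 2)   (Jeffreys prior 1/theta)
import numpy as np
from scipy.stats import maxwell

rng = np.random.default_rng(0)
theta_true = 2.0                 # true theta = a^2
a = np.sqrt(theta_true)
reps = 5000

for n in (10, 50, 200):          # small, moderate, and large samples
    mse_mle = mse_bayes = 0.0
    for _ in range(reps):
        x = maxwell.rvs(scale=a, size=n, random_state=rng)
        t = np.sum(x**2)
        mse_mle += (t / (3 * n) - theta_true) ** 2
        mse_bayes += (t / (3 * n - 2) - theta_true) ** 2
    print(f"n={n:4d}  MSE(MLE)={mse_mle/reps:.4f}  MSE(Bayes)={mse_bayes/reps:.4f}")
```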
The current research aims to prepare a proposed programme based on sensory integration theory for remediating some developmental learning disabilities among children. The researchers prepared the programme by reviewing studies related to the research topic; it can be practiced through some active teaching strategies (cooperative learning, peer learning, role-playing, and educational stories). The final format consists of (39) training sessions.
The development of information systems in recent years has contributed various methods of gathering information to evaluate IS performance. The most common approach used to collect information is the survey. This method, however, suffers one major drawback: decision makers spend considerable time transferring data from survey sheets into analytical programs. As such, this paper proposes a method called 'survey algorithm based on R programming language', or SABR, for transforming data from survey sheets inside the R environment by treating the arrangement of data as a relational format. R and the relational data format provide an excellent opportunity to manage and analyse the accumulated data. Moreover, a survey syste
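SABR itself is implemented in R; as a language-neutral illustration of the underlying idea of treating survey data as a relational format, the hypothetical sketch below reshapes a wide survey sheet into a long relational table with Python's pandas. The column names and toy data are invented for illustration.

```python
# Hypothetical illustration of the relational arrangement SABR relies on:
# reshaping a wide survey sheet (one column per question) into a long,
# relational table (respondent, question, answer). Column names and data
# are invented; this is not the paper's R implementation.
import pandas as pd

# A toy survey sheet: one row per respondent, one column per question.
sheet = pd.DataFrame({
    "respondent": [1, 2, 3],
    "Q1": [5, 4, 3],
    "Q2": [2, 5, 4],
})

# Melt into relational (long) form, which is easy to filter and aggregate.
relational = sheet.melt(id_vars="respondent", var_name="question",
                        value_name="answer")

print(relational)
print(relational.groupby("question")["answer"].mean())  # per-question summary
```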
The research seeks to characterize the comprehensive electronic banking system and the role of the auditor in light of the customer's application of electronic systems that depend on the Internet to provide their services. A proposed audit program has been prepared in accordance with international auditing controls and standards, based on a study of the customer's environment and an analysis of external and internal risks in light of financial and non-financial indicators. The research reached a set of conclusions, most notably the increasing dependence of banks on the comprehensive banking system for its ability to provide new and diverse banking services. The researcher suggested several recommendations, the most important of whi
This paper presents an enhanced technique for tracking and regulating the blood glucose level of diabetic patients using an intelligent auto-tuning proportional-integral-derivative (PID) controller. The proposed controller aims to generate the best insulin control action for regulating the blood glucose level precisely, accurately, and quickly. The tuning algorithm uses the Dolphin Echolocation Optimization (DEO) algorithm to obtain near-optimal PID controller parameters with a proposed time-domain specification performance index. MATLAB simulation results for three different patients showed the effectiveness and robustness of the proposed control algorithm in terms of fast gene
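Neither the patient glucose model nor the DEO search is reproduced here; the sketch below shows a generic discrete PID update of the kind such a tuner would parameterize, with placeholder gains and hypothetical units.

```python
# Generic discrete PID update of the kind the DEO tuner would parameterize
# (the patient glucose model and the DEO search itself are not reproduced).
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        # Error between desired and measured blood glucose level.
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # Control action, e.g. an insulin infusion rate (units hypothetical).
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Gains here are placeholders; DEO would search for near-optimal values
# minimizing a time-domain cost (e.g. overshoot, settling time, steady error).
pid = PID(kp=0.5, ki=0.05, kd=0.1, dt=1.0)
u = pid.step(setpoint=110.0, measurement=160.0)  # mg/dL target vs reading
print("control action:", u)
```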
In this work, a fiber-optic biomedical sensor was fabricated to detect hemoglobin percentages in the blood. SPR-based coreless optical fibers were developed and implemented using single and multiple optical fibers, and were used to calculate the refractive indices and hemoglobin concentrations of blood samples. A gold film with a thickness of 40 nanometers was deposited on the sensing area of the fiber to increase the sensitivity of the sensor. The optical fiber used in this work has a diameter of 125 μm, has no core, and consists of a pure silica glass rod with an acrylate coating. The buffer was removed over a 4 cm length of the fiber and the splicing process was carried out. It is found in practice that when the sensitive refractive i
Learning the vocabulary of a language has a great impact on acquiring that language. Many scholars in the field of language learning emphasize the importance of vocabulary as part of the learner's communicative competence, considering it the heart of language. One of the best methods of learning vocabulary is to focus on words of high frequency. The present article is a corpus-based approach to the study of vocabulary, whereby the research data are analyzed quantitatively using the software program AntWordProfiler. This program analyses new input research data against already stored reference corpora. The aim of this article is to find out whether the vocabularies used in the English textbook for Intermediate Schools in Iraq are con
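AntWordProfiler is a GUI tool, but the kind of frequency-band coverage computation it performs can be sketched directly. In the hypothetical snippet below, the file names and the high-frequency word list are invented; it simply measures what share of a textbook's tokens fall within a given list.

```python
# Hypothetical sketch of the frequency-band coverage check AntWordProfiler
# performs: what share of a textbook's tokens fall in a high-frequency list?
# File names and the word list are invented for illustration.
import re
from collections import Counter
from pathlib import Path

def tokens(text):
    return re.findall(r"[a-z']+", text.lower())

textbook = tokens(Path("textbook.txt").read_text(encoding="utf-8"))
high_freq = set(tokens(Path("high_frequency_list.txt").read_text(encoding="utf-8")))

counts = Counter(textbook)
covered = sum(c for w, c in counts.items() if w in high_freq)
print(f"token coverage by high-frequency list: {covered / len(textbook):.1%}")
```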