Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces both problems by proposing a graph-based text representation, the dependency graph, with the aim of improving document clustering accuracy. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled with a dependency graph. The resulting dependency graph is used for cluster analysis with the k-means technique. The dependency-graph-based clustering results were compared with two popular text representations, TF-IDF and ontology-based representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation leads to more accurate document clustering.
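The limitation of bag-of-words weighting described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy corpus, the function name `tfidf_vectors`, the whitespace tokenizer, and the hand-written dependency triples (which a syntactic parser would normally produce) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


# Toy corpus (illustrative, not taken from the study's dataset).
docs = ["dog bites man", "man bites dog", "stocks fell sharply"]
vecs = tfidf_vectors(docs)

# Hand-written (head, relation, dependent) triples standing in for parser output.
dep_edges = [
    {("bites", "nsubj", "dog"), ("bites", "obj", "man")},
    {("bites", "nsubj", "man"), ("bites", "obj", "dog")},
]

bag_of_words_identical = vecs[0] == vecs[1]   # word order is lost
graphs_identical = dep_edges[0] == dep_edges[1]  # relations are kept
print(bag_of_words_identical, graphs_identical)
```

In this sketch the first two documents receive identical bag-of-words vectors even though they describe opposite events, while their dependency edge sets differ. That gap is the kind of relational information a graph-based representation is meant to retain.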

Publication Date

Fri Jul 14 2023

Journal Name

International Journal Of Information Technology & Decision Making

A Decision Modeling Approach for Data Acquisition Systems of the Vehicle Industry Based on Interval-Valued Linear Diophantine Fuzzy Set

Data acquisition system FDOSM FWZI Intelligent transportation system interval-valued linear Diophantine fuzzy set multicriteria decision making

Iraq

H. A.

Sarah

A. A.

Mohd Azri Mohd

Iraq T

Modeling data acquisition systems (DASs) can support the vehicle industry in the development and design of sophisticated driver assistance systems. Modeling DASs on the basis of multiple criteria is considered as a multicriteria decision-making (MCDM) problem. Although literature reviews have provided models for DASs, the issue of imprecise, unclear, and ambiguous information remains unresolved. Compared with existing MCDM methods, the robustness of the fuzzy decision by opinion score method II (FDOSM II) and fuzzy weighted with zero inconsistency II (FWZIC II) is demonstrated for modeling the DASs. However, these methods are implemented in an intuitionistic fuzzy set environment that restricts the ability of experts to provide mem

Publication Date

Mon Feb 01 2021

Journal Name

Journal Of Physics: Conference Series

Graphical password based mouse behavior technique

Zahraa A.

Omar Z.

Firas A.

A.

Phaklen

This paper proposes a new password generation technique based on mouse motion and a special-case location recognized by the number of clicks, to protect sensitive data for different companies. Two or three special click-point locations per user are proposed to increase password complexity. Unlike other currently available random password generators, the path and number of clicks are added by the admin, and authorized users have to be trained on them.

This method aims to increase the number of combinations for graphical password generation using mouse motion for a limited number of users. A mathematical model is developed to calculate the performance ...

Publication Date

Sun Oct 01 2023

Journal Name

Baghdad Science Journal

Watermark Based on Singular Value Decomposition

Cover

Norm

Ownership

Singular value decomposition

Watermark

Ali Abdulazeez Mohammed Baqer

Neamah Enad

Watermarking can be defined as the process of embedding special, wanted, and reversible information in important secure files to protect the ownership or information of the cover file, here based on the proposed singular value decomposition (SVD) watermark. The proposed digital watermarking method has a very large domain for constructing the final number, which protects the watermark from conflict. The cover file is the important image that needs to be protected. A hidden watermark is a unique number extracted from the cover file by performing the proposed related and successive operations, starting by dividing the original image into four parts of unequal size. Each of these four parts is treated as a separate matrix and applying SVD

Publication Date

Sat Dec 18 2021

Journal Name

Egyptian Journal Of Chemistry

Coumarin-based histone deacetylase (HDAC) inhibitors

coumarin

Histone deacetylase

anti-proliferative activity

Histone deacetylase inhibitors.

sarah

mohammed

Coumarins have been recognized as anticancer candidates, and HDAC inhibitors (HDACis) are an active topic in antitumor research. To achieve increased anticancer efficacy, a series of hybrid compounds bearing coumarin scaffolds have been designed and synthesized as novel HDACis. In this review we present a series of novel HDAC inhibitors comprising coumarin as a core of the cap group that have been designed, synthesized, and assessed for their enzyme inhibitory activity as well as their antiproliferative activity. Most of them exhibited potent HDAC inhibitory activity and significant cytotoxicity

Publication Date

Fri Jun 29 2018

Journal Name

Journal Of The College Of Education For Women

Audio Classification Based on Content Features

Multimedia

Audio classification

Feature extraction

Short time energy

Local Roughness features

First Order Gradient Feature.

اياد عبدالقهار عبدالسلام

Audio classification is the process of classifying different audio types according to their content. It is applied to a large variety of real-world problems. All classification applications allow the target subjects to be viewed as a specific type of audio; hence there is variety in the audio types, and every type has to be treated carefully according to its significant properties. Feature extraction is an important process for audio classification. This work introduces several sets of features according to the type; two types of audio (datasets) were studied. Two different feature sets are proposed: (i) a first-order gradient feature vector and (ii) a local roughness feature vector. The experiments showed that the results are competitive to

Publication Date

Fri Mar 24 2017

Journal Name

Journal Of Engineering

Composite Techniques Based Color Image Compression

image compression

color images

composite techniques

composite transforms

compression parameters.

Zainab

Compression of color images is now necessary for transmission and storage in databases, since color gives any object a pleasing and natural appearance. Three composite techniques for color image compression are therefore implemented to achieve high compression, no loss in the original image, better performance, and good image quality. These techniques are the composite stationary wavelet technique (S), the composite wavelet technique (W), and the composite multi-wavelet technique (M). For the high-energy sub-band of the 3rd level of each composite transform in each composite technique, the compression parameters are calculated. The best composite transform among the 27 types is the three levels of multi-wavelet transform (MMM) in M technique wh

Publication Date

Tue Feb 28 2017

Journal Name

Journal Of Engineering

Composite Techniques Based Color Image Compression

image compression

color images

composite techniques

composite transforms

compression parameters.

Zainab Ibrahim

Compression of color images is now necessary for transmission and storage in databases, since color gives any object a pleasing and natural appearance. Three composite techniques for color image compression are therefore implemented to achieve high compression, no loss in the original image, better performance, and good image quality. These techniques are the composite stationary wavelet technique (S), the composite wavelet technique (W), and the composite multi-wavelet technique (M). For the high-energy sub-band of the 3rd level of each composite transform in each composite technique, the compression parameters are calculated. The best composite transform among the 27 types is the three levels of multi-wavelet

Publication Date

Sun Dec 01 2002

Journal Name

Iraqi Journal Of Physics

New DCT-Based Image Hiding Technique

DCT

image

S. M.

A new technique for embedding image data into another BMP image is presented. The image data to be embedded is referred to as the signature image, while the image into which it is embedded is referred to as the host image. The host and signature images are first partitioned into 8x8 blocks and transformed with the discrete cosine transform (DCT). Only the significant coefficients are retained, and these are then inserted into the transformed block in forward and backward zigzag scan directions. The result is then inversely transformed and presented as a BMP image file. The peak signal-to-noise ratio (PSNR) is used to evaluate the objective visual quality of the host image compared with the original image.
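The PSNR evaluation mentioned in the abstract can be sketched as follows. This uses the standard definition, PSNR = 10 log10(MAX^2 / MSE), and assumes 8-bit pixel values (MAX = 255) stored as nested lists; the function name `psnr` and the tiny 2x2 sample images are hypothetical and are not taken from the paper.

```python
import math


def psnr(original, modified, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    flat_o = [p for row in original for p in row]
    flat_m = [p for row in modified for p in row]
    if len(flat_o) != len(flat_m):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_m)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 host image before and after embedding.
host = [[100, 100], [100, 100]]
embedded = [[101, 99], [100, 100]]
quality_db = psnr(host, embedded)
```

A higher PSNR means the embedded host image is closer to the original, so the hidden signature is less visible.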

Publication Date

Mon Apr 15 2019

Journal Name

Proceedings Of The International Conference On Information And Communication Technology

A steganography based on orthogonal moments

Hayder S.

Basheera M.

Sadiq H.

Dhyia

Publication Date

Sun Dec 30 2018

Journal Name

Journal Of Engineering

Knowledge-Based Urban Development: The Impact of Knowledge-Based Urban Development in the Growth of Contemporary Cities

knowledge-based urban development

knowledge

knowledge workers

knowledge-based economy

Knowledge City.

Safaa Aldeen H.

Shatha Saleem

Urban development refers to many topics, such as increased population density, city size, individuals' production, the distribution of technology, and the growth of commercial, industrial, and service professions. Such development is linked to the coordination of social and cultural trends in order to achieve social progress and economic prosperity. Knowledge as a topic is now known as intellectual capital, which led to upgrading the concept of urban development so that it extends into many fields of knowledge, for example cultural, social, and human development, to move the level of community culture to a new, better standard.

The research adopted the urban transformation based on knowledge as an important factor in gr
