Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with two popular text representation methods, TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
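The limitation of bag-of-words representations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tfidf_vectors`, `cosine`, `edge_vector`), the smoothed IDF formula, and the hand-written dependency triples with Universal Dependencies style labels (`nsubj`, `obj`) are all assumptions made for illustration, and a real pipeline would obtain the triples from a syntactic parser.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption), so terms found in every document
        # keep a nonzero weight.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_vector(edges):
    """Represent a dependency graph as counts of (head, relation, dependent)."""
    return dict(Counter(edges))


# Two sentences with the same words but opposite meanings.
docs = [["dog", "bites", "man"], ["man", "bites", "dog"]]

# Hypothetical, hand-written dependency triples (not parser output).
graphs = [
    [("bites", "nsubj", "dog"), ("bites", "obj", "man")],
    [("bites", "nsubj", "man"), ("bites", "obj", "dog")],
]

bow_sim = cosine(*tfidf_vectors(docs))
graph_sim = cosine(edge_vector(graphs[0]), edge_vector(graphs[1]))

print(f"TF-IDF similarity: {bow_sim:.2f}")
print(f"Dependency-edge similarity: {graph_sim:.2f}")
```

In this toy case the TF-IDF vectors are identical because word order and grammatical roles are discarded, while the edge-based vectors share no features. Either kind of vector could then be passed to a clustering algorithm such as K-means; how the study itself turns dependency graphs into clustering input is not specified in the abstract.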

Publication Date

Thu Nov 17 2022

Journal Name

Journal Of Information And Optimization Sciences

Hybrid deep learning model for Arabic text classification based on mutual information

Farah A.

Nada A. Z.

Publication Date

Thu Jan 20 2022

Journal Name

Webology

Hybrid Intrusion Detection System based on DNA Encoding, Teiresias Algorithm and Clustering Method

Intrusion Detection System

DNA Encoding

Clustering Algorithm

UNSW-NB15 Database.

Omar Fitian

Mazin S.

To date, researchers have applied various techniques to intrusion detection systems (IDS), including DNA encoding and clustering, both of which are widely used for this purpose. The two other major detection techniques are anomaly detection, which is based on user behavior, and misuse detection, which is based on known attack signatures. However, both techniques have drawbacks, such as a high false alarm rate. A hybrid IDS therefore combines the strengths of both techniques to overcome their limitations. In this paper, a hybrid IDS is proposed based on the DNA encoding and clustering method. The proposed DNA encoding is done based on the UNSW-NB15

Publication Date

Mon Dec 14 2020

Journal Name

2020 13th International Conference On Developments In Esystems Engineering (dese)

Anomaly Based Intrusion Detection System Using Hierarchical Classification and Clustering Techniques

H.

Suhaila N.

With the rapid development of computer and network technologies, the security of information on the internet has become compromised, and many threats may affect the integrity of such information. Much research has focused on providing solutions to these threats. Machine learning and data mining are widely used in anomaly-detection schemes to decide whether or not malicious activity is taking place on a network. In this paper, a hierarchical classification for an anomaly-based intrusion detection system is proposed. Two levels of feature selection and classification are used. In the first level, the global feature vector for detecting the basic attacks (DoS, U2R, R2L and Probe) is selected. In the second level, four local feature vect

Publication Date

Wed Apr 01 2015

Journal Name

2015 Annual Ieee Systems Conference (syscon) Proceedings

Automatic generation of fuzzy classification rules using granulation-based adaptive clustering

M.

Publication Date

Wed Apr 10 2019

Journal Name

Engineering, Technology & Applied Science Research

Content Based Image Clustering Technique Using Statistical Features and Genetic Algorithm

Alsaidi B.K.

Text-based image clustering (TBIC) is an insufficient approach for clustering related web images. It is a challenging task to abstract the visual features of images with the support of textual information in a database. In content-based image clustering (CBIC), image data are clustered on the basis of specific features such as texture, colors, boundaries, and shapes. In this paper, an effective CBIC technique is presented, which uses texture and statistical features of the images. The statistical features, or color moments (mean, skewness, standard deviation, kurtosis, and variance), are extracted from the images. These features are collected in a one-dimensional array, and a genetic algorithm (GA) is then applied for image clustering.

Publication Date

Sun Mar 01 2026

Journal Name

Journal Of Information Hiding And Multimedia Signal Processing

Designing a New Text Encryption Approach Based on Genetic Algorithm

Genetic Algorithms

Text Encryption

Encryption Algorithm

Cryptography

Riyadh Bassil

Ahmed O.

Hayder S.

Today, data security is a major problem concerning organizations and individuals. The confidentiality of information is associated with using reliable and robust encryption algorithms in systems. Cyber-attacks on data and systems have become prevalent and sophisticated, and are increasing rapidly; hence, the need for developing robust encryption algorithms is crucial nowadays. This paper proposes a new encryption algorithm using dynamic symmetric key cryptography to encrypt text files. It utilizes a secret key for encryption and decryption processes, where the key’s length is varied depending on the text size. This presented approach gives a trade-off between speed and security, making it suitable for various applications, such as secur

Publication Date

Wed Jan 01 2014

Journal Name

Journal Of Engineering Research And Applications

Tenser Product of Representation for the Group Cn

Ahmed E. Abdul-Nabi

Suha Talib

Niran Sabah

Publication Date

Sat Mar 04 2023

Journal Name

Baghdad Science Journal

Exploration of CPCD number for power graph

Corona domination number

Cycle

Path

Pendent vertex

Perfect matching

Support vertex

S.

G.

C.

Recently, complementary perfect corona domination in graphs was introduced. A dominating set S of a graph G is said to be a complementary perfect corona dominating set (CPCD-set) if each vertex in is either a pendent vertex or a support vertex and has a perfect matching. The minimum cardinality of a complementary perfect corona dominating set is called the complementary perfect corona domination number and is denoted by . In this paper, this parameter is discussed for power graphs of paths and cycles.

Publication Date

Wed Dec 31 2025

Journal Name

Enquiry The Arcc Journal For Architectural Research

Graph-Theoretic Analysis for Sustainable Urban Structure

Tamara

Zainab

A significant challenge arises in the characterization of urban systems, especially regarding the intricate structures of Central Business Districts (CBDs). Conventional models seem insufficient, failing to comprehend the non-linear, network-oriented structure of the city's economic and social dynamics. This creates a disparity between the city's physical, geographical structure and the unseen processes occurring within it. The fundamental inquiry is thus configurational: how can we systematically examine the inherent spatial logic of the CBD to develop a more efficient and predictive planning model? This paper presents a theoretical and methodological model to explore this inquiry, which focuses on Lower Manhattan as the primary su

Publication Date

Sun May 01 2016

Journal Name

Iraqi Journal Of Science

Efficient text in image hiding method based on LSB method principle

steganography

secret text

text in image hiding

password.

Wejdan A. Amer

Steganography (text-in-image hiding) methods are still considered an important issue by researchers. These methods vary in their hiding styles, from simple techniques to complex ones that resist potential attacks. The current research does not consider the problem of attacks on the host's secret text; instead, an improved, highly confidential method for hiding text within an image was proposed and implemented, accompanied by a strong password method, so as to ensure that no change is made to the pixel values of the host image after text hiding. The phrase “highly confidential” refers to the low suspicion that hiding has been performed in the cover image. The experimental results show that the covere
