Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal of Theoretical and Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem

Advances in digital technology and the World Wide Web have led to a rapid increase in digital documents used for purposes such as publishing and digital libraries. This growth creates a need for effective techniques that support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems that are prevalent in text documents remain unresolved. This study reduces both problems by proposing a graph-based text representation, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of the words are then modeled using a dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with two popular text representation methods, TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation leads to more accurate document clustering.
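The contrast the abstract draws between bag-of-words and graph-based representations can be illustrated with a minimal sketch. It is not the authors' implementation: the dependency edges are written by hand rather than produced by a parser, the smoothed IDF formula is one common variant, and the helper names (`tfidf_vectors`, `cosine`, `edge_jaccard`) are hypothetical. It only shows that two sentences with the same words but opposite meanings are indistinguishable under TF-IDF yet differ once word relations are kept as graph edges.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document.

    Assumption: IDF is log(N / df) + 1, one of several common variants.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def edge_jaccard(edges_a, edges_b):
    """Jaccard overlap between two sets of (head, relation, dependent) edges."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


# Two toy "documents" with identical words but opposite meaning.
docs = [["dog", "bites", "man"], ["man", "bites", "dog"]]

# Hypothetical, hand-written dependency edges; a real pipeline would
# obtain them from a syntactic parser.
graphs = [
    [("bites", "nsubj", "dog"), ("bites", "obj", "man")],
    [("bites", "nsubj", "man"), ("bites", "obj", "dog")],
]

vectors = tfidf_vectors(docs)
bow_similarity = cosine(vectors[0], vectors[1])
graph_similarity = edge_jaccard(graphs[0], graphs[1])

print(f"TF-IDF cosine similarity: {bow_similarity:.2f}")
print(f"Dependency-edge Jaccard:  {graph_similarity:.2f}")
```

In a clustering setting, a similarity of this kind (or a feature vector derived from the graph) would stand in for the TF-IDF vectors passed to K-means; how the study actually encodes the graph for K-means is not stated in the abstract.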

Publication Date

Thu Dec 14 2023

Journal Name

IETE Journal of Research

Performance Enhancement of VLC-NOMA Employing Beamforming Function based vehicle-to-multivehicle Communication system

Lwaa Faisal

Publication Date

Tue Oct 01 2024

Journal Name

Separation and Purification Technology

A comprehensive review on the use of Ti3C2Tx MXene in membrane-based water treatment

2D materials

Membrane-based processes

MXenes

Ti3C2Tx

Water treatment

Albayati N.

Zainab A.

Hind Abdul

Mohammed

Peter

Publication Date

Tue Sep 30 2025

Journal Name

Iraqi Journal of Chemical and Petroleum Engineering

Geomechanical properties evaluation of Mauddud formation based on experimental measurements and well log data

Mohammed Almojahed F.

Nagham J.

The Mauddud formation is one of the most prominent formations in Northeastern Iraq due to its significant hydrocarbon reserves, making accurate geomechanical characterization essential for safe drilling operations and informed development planning. This study constructs a calibrated post-drill one-dimensional mechanical earth model (1D-MEM) for selected wells, leveraging Techlog software to integrate rock mechanical data, image logs, multi-arm caliper measurements, conventional well logs, drilling reports, and core analyses. The methodology provides a detailed workflow for estimating geomechanical properties, from log and image analysis to model calibration. Validation of the 1D-MEM performed through cross-comparison with direct me

Publication Date

Tue Dec 01 2020

Journal Name

Results in Physics

Alpha clustering preformation probability in even-even and odd-A $^{270-317}$(116 and 117) using cluster formation model and the mass formulae: KTUY05 and WS4

Norah A.M.

Saad M.

H.

Publication Date

Fri Mar 29 2024

Journal Name

Iraqi Journal of Science

Biological versus Topological Domains in Improving the Reliability of Evolutionary-Based Protein Complex Detection Algorithms

Isra H.

Bara'a Ali

Dhia A.

By definition, the detection of protein complexes that form protein-protein interaction networks (PPINs) is an NP-hard problem. Evolutionary algorithms (EAs), as global search methods, have been shown in the literature to be more successful than greedy methods in detecting protein complexes. However, the design of most of these EA-based approaches relies on the topological information of the proteins in the PPIN. Biological information, a key resource for molecular profiles, has by contrast received little attention in the design of the components of these EA-based methods. The main aim of this paper is to redesign two operators in the EA based on the functional domain rather than the graph topological domain. The perturb

Publication Date

Tue Dec 05 2017

Journal Name

International Journal of Science and Research (IJSR)

Multi Response Optimization of Submerged Arc Welding Using Taguchi Fuzzy Logic Based on Utility Theory

Ali

Publication Date

Sat Jan 01 2022

Journal Name

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)

Increasing validation accuracy of a face mask detection by new deep learning model-based classification

Mohanad

Muna

Dheyaa

During COVID-19, wearing a mask was mandated globally in various workplaces, departments, and offices. New deep learning convolutional neural network (CNN) based classifications were proposed to increase the validation accuracy of face mask detection. This work introduces a face mask model that recognizes whether a person is wearing a mask or not. The proposed model detects and recognizes the face mask in two stages: in the first stage, a Haar cascade detector is used to detect the face, and in the second stage, the proposed CNN, built from scratch, is used as the classification model. The experiment was run on the masked faces (MAFA) dataset using RGB images of 160x160 pixels. The model achieve

Publication Date

Tue Jun 20 2023

Journal Name

Baghdad Science Journal

Visible-Light-driven Photocatalytic Properties of Copper(I) Oxide (Cu2O) and Its Graphene-based Nanocomposites

Cu2O

Graphene-based nanocomposites

Nanocrystals

Photocatalytic evaluation

Visible-Light

Abdul Waheed

Gul

Elyor

Basant

Aigul Baimagambetova

Ahmad Hosseini

In this study, an improved process was proposed for the synthesis of structure-controlled Cu2O nanoparticles, using a simplified wet chemical method at room temperature. A chemical solution route was established to synthesize Cu2O crystals with various sizes and morphologies. The structure, morphology, and optical properties of the Cu2O nanoparticles were analyzed by X-ray diffraction, scanning electron microscopy (SEM), and UV-Vis spectroscopy. By adjusting the aqueous mixture solutions of NaOH and NH2OH•HCl, Cu2O crystals with different morphologies and sizes could be synthesized. Unexpectedly, it was found that the change in the ratio of de-ionized water and NaOH aqueous solution led to the synthesis of Cu2O crystals of differen

Publication Date

Fri Sep 01 2023

Journal Name

Journal of Engineering

Impact of Sulfate in the Sand on the Compressive Strength of Metakaolin-Based Geopolymer Mortar

Flexible pavement

Interface bond strength

Destructive tests

Non-Destructive tests

Sara Y.

Layth

The advancement of cement alternatives in the construction materials industry is fundamental to sustainable development. Geopolymer, which produces 80% less CO₂ emissions than ordinary Portland cement, is the optimal substitute for it. Metakaolin was used as one of the raw materials in the geopolymerization process. This research examines the influence of three different percentages of sulfate in sand (0.00038%, 1.532%, and 16.24%) per molarity of NaOH on the compressive strength of metakaolin-based geopolymer mortar (MK-GPM). Samples were prepared with two different molarities (8M and 12M) and cured at room temperature. The best compressive strength value (56.98 MPa) was recorded with 12M w

Publication Date

Tue Jan 01 2019

Journal Name

Journal of Southwest Jiaotong University

Multi-Focus Image Fusion Based on Stationary Wavelet Transform and PCA on YCbCr Color Space

Alaa A.

Firas A.

Amna

Multi-focus image fusion combines several images of the same scene, each focused on a different region, into a single image that describes the scene more accurately. In this paper, a multi-focus image fusion method is proposed that operates at the pixel level in a hybrid of the spatial and transform domains. The proposed method is applied to multi-focus source images in the YCbCr color space. As the first step, a two-level stationary wavelet transform is applied to the Y channel of the two source images. The fused Y channel is obtained using several fusion rules. The Cb and Cr channels of the source images are fused using principal component analysis (PCA).
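The PCA step for the chroma channels can be sketched as follows. This is a minimal illustration of a commonly used PCA weighting scheme, not necessarily the exact rule in the paper: each channel is flattened to a list, the principal eigenvector of the 2x2 covariance matrix of the two channels is computed in closed form, and its normalized absolute components serve as fusion weights. The function names and the sample values are hypothetical.

```python
import math


def pca_fusion_weights(chan_a, chan_b):
    """Return (w_a, w_b), with w_a + w_b == 1, from the principal eigenvector
    of the 2x2 covariance matrix of two equally sized, flattened channels."""
    n = len(chan_a)
    mean_a = sum(chan_a) / n
    mean_b = sum(chan_b) / n
    var_a = sum((x - mean_a) ** 2 for x in chan_a) / n
    var_b = sum((y - mean_b) ** 2 for y in chan_b) / n
    cov_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(chan_a, chan_b)) / n

    if cov_ab == 0.0:
        # Uncorrelated channels: the principal axis is the higher-variance one.
        if var_a == var_b:
            return 0.5, 0.5
        return (1.0, 0.0) if var_a > var_b else (0.0, 1.0)

    # Largest eigenvalue of [[var_a, cov_ab], [cov_ab, var_b]].
    largest = 0.5 * ((var_a + var_b) + math.sqrt((var_a - var_b) ** 2 + 4.0 * cov_ab ** 2))
    # A matching eigenvector is (largest - var_b, cov_ab).
    v_a, v_b = abs(largest - var_b), abs(cov_ab)
    # Assumption: absolute values keep both weights non-negative.
    return v_a / (v_a + v_b), v_b / (v_a + v_b)


def pca_fuse(chan_a, chan_b):
    """Fuse two flattened channels as a PCA-weighted pixel-wise average."""
    w_a, w_b = pca_fusion_weights(chan_a, chan_b)
    return [w_a * x + w_b * y for x, y in zip(chan_a, chan_b)]


# Hypothetical Cb values from two source images of the same scene.
cb_first = [10.0, 20.0, 30.0, 40.0]
cb_second = [12.0, 18.0, 33.0, 37.0]

weight_first, weight_second = pca_fusion_weights(cb_first, cb_second)
fused_cb = pca_fuse(cb_first, cb_second)
print(weight_first, weight_second)
print(fused_cb)
```

Because the weights are non-negative and sum to one, every fused value lies between the two source values at that pixel, and two identical channels receive equal weights.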
