Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to a growing number of digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of the words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed text representation method leads to more accurate document clustering.
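To make the limitation of the bag-of-words baseline concrete, the following minimal sketch computes TFIDF weights and cosine similarity using only the Python standard library. It is an illustration, not the authors' implementation: the toy sentences, the whitespace tokenizer, the `tf * log(N / df)` weighting variant, and the function names `tfidf_vectors` and `cosine` are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document (tf * log(N / df))."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy documents: the first two describe the same event with
# different words; the first and third share surface vocabulary.
docs = [
    "the car engine failed on the road",
    "an automobile motor broke down",
    "the car engine needs repair",
]
vecs = tfidf_vectors(docs)
sim_synonyms = cosine(vecs[0], vecs[1])  # no shared terms
sim_shared = cosine(vecs[0], vecs[2])    # shared terms: the, car, engine
print(sim_synonyms, sim_shared)
```

Because the first two documents share no surface terms, their similarity is exactly zero even though they describe the same event, while the third document scores higher purely through word overlap. This is the kind of semantic gap that a representation encoding relations between words is meant to narrow.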

Publication Date

Tue Jun 23 2020

Journal Name

Baghdad Science Journal

Content Based Image Retrieval (CBIR) by Statistical Methods

Content Based Image Retrieval

Histogram statistical characteristics

T-Test

Trademark Image Retrieval

Fathala

An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large image databases, a specified number of images similar in visual and semantic content to a so-called query image. The researchers developed a new retrieval mechanism based mainly on two procedures. The first procedure relies on extracting the statistical features of both the original and the traditional image using the histogram and statistical characteristics (mean, standard deviation). The second procedure relies on the T-

Publication Date

Sat Oct 19 2024

Journal Name

Iraqi Statisticians Journal

Forecasting Gold prices by hybrid ANFIS-based algorithm

Munaf Yousif

Ahmed A.

In this article, the high accuracy and effectiveness of forecasting global gold prices are verified using a hybrid machine learning algorithm incorporating an Adaptive Neuro-Fuzzy Inference System (ANFIS) model with Particle Swarm Optimization (PSO) and Gray Wolf Optimizer (GWO). The hybrid approach proved successful enough to be a good strategy for practical use. The ARIMA-ANFIS hybrid methodology was used to forecast global gold prices. The ARIMA model is fitted to real data, and its nonlinear residuals are then predicted by ANFIS, ANFIS-PSO, and ANFIS-GWO. The results indicate that hybrid models improve on the accuracy of single ARIMA and ANFIS models in forecasting. Finally, a comparison was made between the hybrid foreca

Publication Date

Mon Jan 01 2024

Journal Name

Journal Of Engineering

Face-based Gender Classification Using Deep Learning Model

Alex-Net

CLAHE

Deep learning

Gender Classification

Buraq Abed Ruda

Faten Abed Ali

Gender classification is a critical task in computer vision. This task holds substantial importance in various domains, including surveillance, marketing, and human-computer interaction. In this work, the proposed face gender classification model consists of three main phases. The first phase applies the Viola-Jones algorithm to detect facial images, which includes four steps: 1) Haar-like features, 2) Integral Image, 3) Adaboost Learning, and 4) Cascade Classifier. In the second phase, four pre-processing operations are employed, namely cropping, resizing, converting the image from the RGB color space to the LAB color space, and enhancing the images using HE and CLAHE. The final phase involves utilizing Transfer lea

Publication Date

Sun Sep 24 2023

Journal Name

Journal Of Al-qadisiyah For Computer Science And Mathematics

Iris Data Compression Based on Hexa-Data Coding

Ghadah

Haider Hameed

Mohammed M.

Marcos. A.

Iris research is focused on developing techniques for identifying and locating relevant biometric features, accurate segmentation, and efficient computation while lending themselves to compression methods. Most iris segmentation methods are based on complex modelling of traits and characteristics, which in turn reduces the effectiveness of the system when used in real time. This paper introduces a novel parameterized technique for iris segmentation. The method is based on a number of steps, starting with converting a grayscale eye image to a bit-plane representation and selecting the most significant bit planes, followed by a parameterization of the iris location resulting in an accurate segmentation of the iris from the origin

Publication Date

Sat Jun 26 2021

Journal Name

2021 Ieee International Conference On Automatic Control & Intelligent Systems (i2cacis)

Vulnerability Assessment on Ethereum Based Smart Contract Applications

Nurul Aida

Md Gapar Md

Mohammed Hazim

Asif

Mohammed S. H.

Publication Date

Wed Jun 01 2011

Journal Name

Journal Of Al-nahrain University Science

Breaking Knapsack Cipher Using Population Based Incremental Learning

Nasreen J.

Publication Date

Sun Nov 01 2020

Journal Name

Journal Of Physics: Conference Series

Improve topic modeling algorithms based on Twitter hashtags

Hayder M.

Today, with the increasing use of social media, many researchers have become interested in topic extraction from Twitter. Tweets are short, unstructured, and messy, which makes finding topics in them challenging. Topic modeling algorithms such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) were originally designed to derive topics from large documents such as articles and books, and they are often less efficient when applied to short text content like tweets. Luckily, Twitter has many features that represent the interaction between users. Tweets have rich user-generated hashtags as keywords. In this paper, we exploit the hashtags feature to improve topics learned

Publication Date

Sun Sep 01 2013

Journal Name

International Journal Of Computer Applications

Concise Architecture of a Remote Network based Controller

Acquisition

Monitoring

Scalability

Reusability

Economical

microcontroller.

Hayder

Sadiq H.

Basheera M.

Microcontrollers have recently come into use for monitoring and data acquisition. This development has given rise to various architectures for deploying and interfacing microcontrollers in network environments. Some existing architectures suffer from redundant resources, extra processing, high cost, and delayed response. This paper presents a flexible, concise architecture for building a distributed, networked microcontroller system. The system consists of only one server, which works through the internet, and a set of microcontrollers distributed across different sites. Each microcontroller is connected to the internet through Ethernet. In this system, a client's request for data from a certain site is accomplished through just one server that is in

Publication Date

Sun Sep 11 2022

Journal Name

Electronics

IoT-Based Motorbike Ambulance: Secure and Efficient Transportation

Halah Hasan

Abed Saif

Marwan Kadhim Mohammed

Gehad Abdullah

Khaled H.

Mohammed A. A.

The predilection for 5G telemedicine networks has piqued the interest of industry researchers and academics. The most significant barrier to global telemedicine adoption is to achieve a secure and efficient transport of patients, which has two critical responsibilities. The first is to get the patient to the nearest hospital as quickly as possible, and the second is to keep the connection secure while traveling to the hospital. As a result, a new network scheme has been suggested to expand the medical delivery system, which is an agile network scheme to securely redirect ambulance motorbikes to the nearest hospital in emergency cases. This research provides a secured and efficient telemedicine transport strategy compatible with the

Publication Date

Tue Feb 28 2023

Journal Name

International Journal Of Intelligent Engineering And Systems

Design and Implementation of EEG-Based Smart Structure

Oger Zaya

Yarub
