Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal Of Theoretical And Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings in the document, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is created through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed text representation method leads to more accurate document clustering results.
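The contrast the abstract draws between bag-of-words weighting and a dependency-graph representation can be illustrated with a minimal sketch. It is not the authors' pipeline. The weighting used here (raw term count times log(N/df)) is one common TFIDF variant chosen as an assumption, and the dependency edges are written by hand as hypothetical parser output with illustrative relation labels (`nsubj`, `obj`), since the abstract does not name a parser. The helper names `tfidf_vectors` and `jaccard` are also hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def jaccard(a, b):
    """Jaccard similarity between two sets of graph edges."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


docs = ["dog bites man", "man bites dog", "stocks fall sharply"]
vecs = tfidf_vectors(docs)

# Hand-written (head, relation, dependent) edges standing in for the output
# of a syntactic parser; these are assumptions made for illustration only.
graph_dog_bites = {("bites", "nsubj", "dog"), ("bites", "obj", "man")}
graph_man_bites = {("bites", "nsubj", "man"), ("bites", "obj", "dog")}

# The bag-of-words vectors cannot tell the two sentences apart...
same_bag = vecs[0] == vecs[1]
# ...while the edge sets, which encode who did what to whom, share no edge.
edge_overlap = jaccard(graph_dog_bites, graph_man_bites)
print(same_bag, edge_overlap)  # prints: True 0.0
```

In this toy setting the two sentences with opposite meanings receive identical TFIDF vectors, whereas their edge sets are disjoint. A clustering step such as K-means would then operate on features derived from the graphs rather than on term weights alone; how the study derives those features is not described in the abstract.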

Publication Date

Sun Jun 30 2024

Journal Name

Iraqi Journal Of Science

Gray-Scale Image Compression Method Based on a Pixel-Based Adaptive Technique

Zahraa.H.

Ghadah K.

Today, images make up a massive share of social media content but suffer from two issues, size and transmission, for which compression is the ideal solution. Pixel-based techniques are among the modern spatially optimized modeling techniques with deterministic and probabilistic bases that involve mean, index, and residual. This paper introduces adaptive pixel-based coding techniques for the probabilistic part of a lossy scheme by incorporating the MMSA of the C321 base along with the utilization of the deterministic part losslessly. The tested results achieved higher size reduction performance compared to the traditional pixel-based techniques and the standard JPEG by about 40% and 50%,

Publication Date

Wed Mar 29 2023

Journal Name

Journal Of Robotics

Real-Time SLAM Mobile Robot and Navigation Based on Cloud-Based Implementation

Jaafar

Dheyaa

This study investigates the feasibility of a mobile robot navigating and discovering its location in unknown environments, followed by the creation of maps of these navigated environments for future use. First, a real mobile robot named TurtleBot3 Burger was used to perform simultaneous localization and mapping (SLAM) in a complex environment with 12 obstacles of different sizes, based on the Rviz library, which is built on the Robot Operating System (ROS) running on Linux. The robot can be controlled and this process performed remotely by using an Amazon Elastic Compute Cloud (Amazon EC2) instance. Then, the map was uploaded to the Amazon Simple Storage Service (Amazon S3) cloud. This provides a database

Publication Date

Tue Dec 07 2021

Journal Name

2021 14th International Conference on Developments in eSystems Engineering (DeSE)

Content Based Image Retrieval Based on Feature Fusion and Support Vector Machine

Ibtihaal M.

Sadiq H.

Basheera M.

Abir

Publication Date

Sun Apr 01 2018

Journal Name

Journal Of Engineering And Applied Sciences

New Data Security Method Based on Biometrics

Cryptosystem

ciphering

fingerprint minutiae and random text

represent original

discovered

extracting proper

Sally

Merging biometrics with cryptography has become more common, and a major scientific field has emerged for researchers. Biometrics adds a distinctive property to security systems because biometric features are unique to each person. In this study, a new method is presented for ciphering data based on fingerprint features. The method places a plaintext message into a generated random text file at positions given by the minutiae extracted from a fingerprint, regardless of the size of the data. The proposed method can be explained in three scenarios. In the first scenario, the message was inserted directly into the random text at the minutiae positions. In the second scenario, the message was encrypted with a chosen word before ciphering

Publication Date

Sun Nov 01 2020

Journal Name

IOP Conference Series: Materials Science and Engineering

SDN-RA: An Optimized Reschedule Algorithm of SDN Load Balancer for Data Center Networks Based on QoS

Kadim U.N.

imad j. mohammed

With the development of cloud computing in recent years, data center networks have become a major topic in both industry and academia. Nevertheless, traditional manual, hardware-based methods are burdensome, expensive, and cannot fully utilize the physical network infrastructure. Thus, Software-Defined Networking (SDN) has been promoted as one of the most promising solutions for future Internet performance. SDN is notable for two features: the separation of the control plane from the data plane, and network development through programmable capabilities instead of hardware solutions. The current paper introduces an SDN-based optimized Resch

Publication Date

Tue Jul 01 2014

Journal Name

IEEE Transactions on Circuits and Systems I: Regular Papers

Crosstalk-Aware Multiple Error Detection Scheme Based on Two-Dimensional Parities for Energy Efficient Network on Chip

Wameedh N.

K.

S. J.

Fakhrul Z.

Yehea I.

Achieving reliable operation under the influence of deep-submicrometer noise sources, including crosstalk noise, at low-voltage operation is a major challenge for network-on-chip links. In this paper, we propose a coding scheme that simultaneously addresses crosstalk effects on signal delay and detects up to seven random errors through wire duplication and simple parity checks calculated over the rows and columns of the two-dimensional data. This high error detection capability enables a reduction of the operating voltage on the wire, leading to energy savings. The results show that the proposed scheme reduces energy consumption by up to 53% compared to other schemes at iso-reliability performance despite the increase in the overhead number o

Publication Date

Mon Jan 01 2024

Journal Name

2nd International Conference for Engineering Sciences and Information Technology (ESIT 2022): ESIT2022 Conference Proceedings

Room temperature flexible sensor based on F-MWCNT modified by polypyrrole conductive polymer for NO2 gas detection

Gas sensor

flexible network

f-MWCNTs

polypyrrole

filtration from suspension.

Aqeel Y.

Wasan

This project sought to fabricate a flexible gas sensor based on a network of short functionalized multi-walled carbon nanotubes (f-MWCNTs) for nitrogen dioxide gas detection. The network was prepared by the filtration from suspension (FFS) method and modified by coating with a layer of polypyrrole conductive polymer (PPy), prepared by oxidative chemical polymerization, to improve the properties of the network. The structural, optical, and morphological properties of the f-MWCNTs and f-MWCNTs/PPy networks were studied using X-ray diffraction (XRD), Fourier-transform infrared (FTIR) spectroscopy, and atomic force microscopy (AFM). XRD showed that the structure of f-MWCNTs is unaffected by the synthesis procedure. The FTIR spectra verified the existence o

Publication Date

Tue Feb 12 2019

Journal Name

Iraqi Journal Of Laser

Generation of True Random TTL Signals for Quantum Key-Distribution Systems Based on True Random Binary Sequences

Salwa M.

Shelan K.

Ahmed I.

A true random TTL pulse generator was implemented and investigated for quantum key distribution systems. The random TTL signals are generated by low-cost components available in local markets. The TTL signals are obtained from true random binary sequences based on the photon arrival time difference registered in coincidence windows between two single-photon detectors. The performance of the true random TTL pulse generator was tested using time-to-digital converters, which give accurate readings of the photon arrival time. The proposed true random TTL pulse generator can be used in any quantum key distribution system for random operation of the transmitters in these systems.

Publication Date

Thu Feb 07 2019

Journal Name

Iraqi Journal Of Laser

Tapered Splicing Points SMF-PCF-SMF Structure based on Mach-Zehnder interferometer for Enhanced Refractive Index Sensing

Nawras Ali.

Hanan J.

Saif A.

Photonic crystal fiber interferometers (PCFIs) are widely used for sensing applications. This work presents solid-core PCFs based on a Mach-Zehnder modal interferometer for sensing refractive index. The general structure of the sensor was formed by splicing short lengths of PCF on both sides with conventional single-mode fiber (SMF-28). To apply modal interferometer theory, a collapsing technique based on fusion splicing was used to excite higher-order modes (LP01 and LP11). A highly sensitive optical spectrum analyzer (OSA) was used to monitor and record the transmitted wavelength. This work studied a Mach-Zehnder interferometer refractive index sensor based on splicing-point-tapered SMF-PCF-SMF. Relation between refractive index sensitivity and tape

Publication Date

Sat May 01 2021

Journal Name

Journal Of Physics: Conference Series

An Efficient Shrinkage Estimators For Generalized Inverse Rayleigh Distribution Based On Bounded And Series Stress-Strength Models

Iman Ghaji

Bayda Atiya

Abbas N.

In this paper, we investigate two stress-strength models (bounded and series) in systems reliability based on the Generalized Inverse Rayleigh distribution. To obtain shrinkage estimators, Bayesian methods under informative and non-informative assumptions are used. To compare the presented methods, Monte Carlo simulations based on the mean squared error criterion are applied.
