Graph based text representation for document clustering

Asma Khazaal Abdulsahib; Siti Sakira Kamaruddin

Details

Publication Date

Thu Jan 01 2015

Journal Name

Journal of Theoretical and Applied Information Technology

Volume

76

Issue Number

1

Text Representation Schemes

Dependency Graph

Document Clustering

Sparsity Problem

Semantic Problem.

Advances in digital technology and the World Wide Web have led to a growing number of digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques that support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify the sentence structure, and the semantics of the words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency graph based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed text representation method leads to more accurate document clustering.
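The contrast between a bag-of-words representation and a dependency-based one can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three toy sentences, the hand-written (head, relation, dependent) triples standing in for parser output, the smoothed IDF formula, and the helper names `tfidf`, `cosine`, and `words` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each document (a list of hashable features) to a sparse TF-IDF dict."""
    n = len(docs)
    # Document frequency: number of documents containing each feature.
    df = Counter(f for doc in docs for f in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption) keeps features shared by every document nonzero.
        vectors.append({f: (c / len(doc)) * (math.log((1 + n) / (1 + df[f])) + 1.0)
                        for f, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

# Hypothetical parser output for "dog bites man", "man bites dog",
# and "stocks fall sharply": (head, relation, dependent) triples.
parsed = [
    [("bites", "nsubj", "dog"), ("bites", "obj", "man")],
    [("bites", "nsubj", "man"), ("bites", "obj", "dog")],
    [("fall", "nsubj", "stocks"), ("fall", "advmod", "sharply")],
]

def words(triples):
    """Bag-of-words view: the distinct words of a sentence, structure discarded."""
    return sorted({w for head, _, dep in triples for w in (head, dep)})

bow = tfidf([words(t) for t in parsed])   # word features
dep = tfidf(parsed)                       # dependency-edge features

bow_sim = cosine(bow[0], bow[1])
dep_sim = cosine(dep[0], dep[1])
print(f"bag of words: {bow_sim:.2f}, dependency edges: {dep_sim:.2f}")
```

In this toy setting the first two sentences contain the same words, so their bag-of-words vectors coincide, while their dependency edges differ and share no features. Either set of vectors could then be passed to a clustering routine such as K-means.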

Publication Date

Thu Jun 06 2024

Journal Name

Journal of Applied Engineering and Technological Science (JAETS)

Deep Learning and Its Role in Diagnosing Heart Diseases Based on Electrocardiography (ECG)

Diagnosing

Heart

CNN

Signal

Qaswaa Khaled

Diagnosing heart disease has become a very important topic for researchers specializing in artificial intelligence, because artificial intelligence is now applied to most diseases, especially after the coronavirus pandemic, which forced the world to turn to it. The basic idea of this research is therefore to shed light on the diagnosis of heart diseases using a pre-trained deep learning model (Efficient b3) applied to the electrical signals of the electrocardiogram. The signal is resampled before being fed to the neural network, with only trimming operations applied, because it is an electrical signal whose parameters cannot be changed. The data set (China Physiological Signal Challenge -cspsc2018) was ad

Publication Date

Fri Mar 01 2019

Journal Name

Al-Khwarizmi Engineering Journal

A Digital-Based Optimal AVR Design of Synchronous Generator Exciter Using LQR Technique

Ibraheem Kasim

In this paper, a new structure for the AVR of the power system exciter is proposed and designed using digital-based LQR. With two weighting matrices, Q and R, this method produces an optimal regulator that is used to generate the feedback control law. These matrices are called the state and control weighting matrices, respectively, and are used to balance the relative importance of the states and the input in the cost function being optimized. A sample power system composed of a single machine connected to an infinite bus bar (SMIB), with both a conventional AVR and the proposed digital AVR (DAVR), is simulated. Evaluation results show that the DAVR damps the oscillations of the terminal voltage well and presents a faster respo

Publication Date

Thu Jun 16 2022

Journal Name

Al-Khwarizmi Engineering Journal

Path Planning and Obstacle Avoidance of a Mobile Robot based on GWO Algorithm

Tahseen Fadhil

Alaa Hassan

Path planning is among the most significant topics in robotics research, as it is linked to finding a safe and efficient route in a cluttered environment for wheeled mobile robots and is considered a significant prerequisite for the success of any such mobile robot project. This paper proposes optimal path planning for a wheeled mobile robot with collision avoidance, using the grey wolf optimization (GWO) algorithm to find the shortest safe path. The research goal of this study is to identify the best path while taking into account the effect of the number of obstacles and the design parameters on the performance of the algorithm. The simulations are run in the MATLAB environment to test the

Publication Date

Thu Aug 01 2024

Journal Name

IOP Conference Series: Earth and Environmental Science

Smart Irrigation Technique in the Fixed Irrigation System Based on Soil Moisture Content

Dana

Ali

The growing water demand has raised serious concerns about the future of irrigated agriculture in many parts of the world. Changing environmental conditions and water shortages (especially in Iraq) have led to the need for a new system that efficiently manages the irrigation of crops. With the population growing at a rapid pace, traditional agriculture will have a tough time meeting future food demands. Water availability and conservation are major concerns for farmers. The configuration of the smart irrigation system was designed based on data specific to the parameters concerning the characteristics of the plant and the properties of soil which are measured once i

Publication Date

Wed Aug 28 2024

Journal Name

Mesopotamian Journal of Cybersecurity

A Novel Anomaly Intrusion Detection Method based on RNA Encoding and ResNet50 Model

Mohammed

Omar Fitian

Safa Ahmed

Mohammed Khaleel

Saleh Mahdi

Cybersecurity refers to the actions that are used by people and companies to protect themselves and their information from cyber threats. Different security methods have been proposed for detecting network abnormal behavior, but some effective attacks are still a major concern in the computer community. Many security gaps, like Denial of Service, spam, phishing, and other types of attacks, are reported daily, and the attack numbers are growing. Intrusion detection is a security protection method that is used to detect and report any abnormal traffic automatically that may affect network security, such as internal attacks, external attacks, and maloperations. This paper proposed an anomaly intrusion detection system method based on a

Publication Date

Tue Jun 01 2021

Journal Name

Baghdad Science Journal

Synthesis, Characterization and Gas Sensor Application of New Composite Based on MWCNTs:CoPc:Metal Oxide

MWCNTs

Cobalt phthalocyanine

nanocomposite

gas sensing

Mohanad Mousa

Burak Yahya

Emman J.

Abbas Jassim

The synthesis of new substituted cobalt phthalocyanine (CoPc) was carried out using starting materials Naphthalene-1,4,5, tetracarbonic acid dianhydride (NDI) employing a dry process method. A metal oxide (MO) alloy of (60% Ni₃O₄-40% Co₃O₄) was functionalized with multiwall carbon nanotubes (F-MWCNTs) to produce the (F-MWCNTs/MO) nanocomposite (E2) and mixed with CoPc to yield (F-MWCNT/CoPc/MO) (E3). These composites were investigated using different analytical and spectrophotometric methods such as ¹H-NMR (0-18 ppm), FTIR spectroscopy in the range of (400-4000 cm^-1), powder X-ray diffraction (PXRD, 2θ = 10-80°), Raman spectroscopy (0-4000 cm^-1), and UV-Visib

Publication Date

Wed Mar 24 2021

Journal Name

IEEE Access

Smart IoT Network Based Convolutional Recurrent Neural Network With Element-Wise Prediction System

Nadia Adnan Shiltagh

Hamed S.

An Intelligent Internet of Things network based on an Artificial Intelligent System can substantially control and reduce the congestion effects in the network. In this paper, an artificial intelligent system is proposed for eliminating the congestion effects of traffic load in an Intelligent Internet of Things network, based on a deep learning Convolutional Recurrent Neural Network with a modified Element-wise Attention Gate. The invisible layer of the modified Element-wise Attention Gate structure has self-feedback to increase its long short-term memory. The artificial intelligent system is implemented for next-step-ahead traffic estimation and for clustering the network. In the proposed architecture, each sensing node is adaptive and able to

Publication Date

Tue Feb 01 2022

Journal Name

Civil Engineering Journal

Calibration of a New Concrete Damage Plasticity Theoretical Model Based on Experimental Parameters

Alaa Hussein

Ali H.

Ali A.

Ammar N.

The introduction of concrete damage plasticity material models has significantly improved the accuracy with which the concrete structural elements can be predicted in terms of their structural response. Research into this method's accuracy in analyzing complex concrete forms has been limited. A damage model combined with a plasticity model, based on continuum damage mechanics, is recommended for effectively predicting and simulating concrete behaviour. The damage parameters, such as compressive and tensile damages, can be defined to simulate concrete behavior in a damaged-plasticity model accurately. This research aims to propose an analytical model for assessing concrete compressive damage based on stiffness deterioration. The prop

Publication Date

Sun Feb 25 2024

Journal Name

Baghdad Science Journal

An exploratory study of history-based test case prioritization techniques on different datasets

Average Percentage of Fault Detected

Equal Priority

History Based

Random

Regression Testing

Test Case Prioritization

Syed Muhammad Junaid

Dayang N. A.

Johanna

In regression testing, Test case prioritization (TCP) is a technique to arrange all the available test cases. TCP techniques can improve fault detection performance which is measured by the average percentage of fault detection (APFD). History-based TCP is one of the TCP techniques that consider the history of past data to prioritize test cases. The issue of equal priority allocation to test cases is a common problem for most TCP techniques. However, this problem has not been explored in history-based TCP techniques. To solve this problem in regression testing, most of the researchers resort to random sorting of test cases. This study aims to investigate equal priority in history-based TCP techniques. The first objective is to implement

Publication Date

Wed Nov 01 2023

Journal Name

Journal of Dentistry

The in-vitro development of novel enzyme-based chemo-mechanical caries removal agents

Huda

Shatha A.

Avijit

Lamis A.

Objectives: Bromelain is a potent proteolytic enzyme whose unique functionality makes it valuable for various therapeutic purposes. This study aimed to develop three novel formulations based on bromelain to be used as chemomechanical caries removal agents. Methods: The novel agents were prepared using different concentrations of bromelain (10–40 wt. %), with and without 0.1–0.3 wt. % chloramine T or 0.5–1.5 wt. % chlorhexidine (CHX). Based on the enzymatic activity test, three formulations were selected: 30 % bromelain (F1), 30 % bromelain-0.1 % chloramine (F2) and 30 % bromelain-1.5 % CHX (F3). The assessments included molecular docking, Fourier-transform infrared spectroscopy (FTIR), viscosity and pH measurements. The efficiency
