Graph based text representation for document clustering

Advances in digital technology and the World Wide Web have led to a growing number of digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TF-IDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces both problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using the dependency graph. The resulting dependency graph is used in cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with two popular text representation methods, TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
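The limitation of the bag-of-words model described above can be illustrated with a minimal sketch. The toy sentences, the whitespace tokenizer, and the helper names `tfidf_vectors` and `cosine` are assumptions made for illustration; they are not taken from the study, which used the 20 Newsgroups data and a dependency parser.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus: documents 0 and 1 are paraphrases with no shared
# content words; documents 0 and 2 share the words "car" and "engine".
docs = [
    "the car engine failed",
    "the automobile motor broke",
    "the car engine overheated",
]
vectors = tfidf_vectors(docs)
sim_synonyms = cosine(vectors[0], vectors[1])
sim_shared = cosine(vectors[0], vectors[2])
print(sim_synonyms, sim_shared)
```

The two paraphrased sentences receive zero similarity because they share no content words, which is the kind of semantic gap a representation that models word relationships is meant to narrow.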

Scopus
Publication Date
Fri Oct 02 2015
Journal Name
American Journal Of Applied Sciences
Advances in Document Clustering with Evolutionary-Based Algorithms

Document clustering is the process of organizing an electronic corpus of documents into subgroups with similar text features. A number of conventional algorithms have been applied to document clustering, and current efforts seek to enhance clustering performance by employing evolutionary algorithms, an emerging topic that has gained attention in recent years. The aim of this paper is to present an up-to-date, self-contained review devoted entirely to document clustering via evolutionary algorithms. It first provides a comprehensive examination of the document clustering model, revealing its components and related concepts. Then it shows and analyzes the principal research wor...

Scopus (2)
Crossref (2)
Scopus Crossref
Publication Date
Thu Oct 01 2015
Journal Name
Engineering And Technology Journal
Genetic Based Optimization Models for Enhancing Multi-Document Text Summarization

Crossref
Publication Date
Sun Jun 01 2008
Journal Name
Baghdad Science Journal
Tamper Detection in Text Document

Although authenticating text document images is difficult because of their binary nature and the clear separation between background and foreground, demand for it is growing in many applications. Most previous research in this field depends on inserting a watermark into the document; the drawback of these techniques is that changing pixel values in a binary document can introduce irregularities that are visually very noticeable. This paper proposes a new method for object-based text document authentication, in which a text document is signed by shifting individual words slightly left or right from their original positions to make the center of gravity for each line fall in with the m...

Crossref
Publication Date
Tue Feb 01 2022
Journal Name
Baghdad Science Journal
Securing Text Messages Using Graph Theory and Steganography

Data security is an important component of data communication and transmission systems. Its main role is to keep sensitive information safe and intact from the sender to the receiver. The proposed system aims to secure text messages through two security principles: encryption and steganography. The system introduces a novel encryption method based on graph theory properties: it forms a graph from a password to generate an encryption key as the weight matrix of that graph, and uses the Least Significant Bit (LSB) method to hide the encrypted message in the green component of a color image. Practical experiments on perceptibility, capacity, and robustness were evaluated using similarity measures such as PSNR, MSE, and...
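The LSB hiding step mentioned above can be sketched as follows. This is a minimal illustration only: it assumes the image is a plain list of `(r, g, b)` tuples, it omits the graph-based encryption stage entirely because the abstract does not specify it, and the function names `embed` and `extract` are hypothetical.

```python
def embed(pixels, message):
    """Hide the bits of `message` (bytes) in the green-channel LSBs."""
    # Most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        r, g, b = stego[i]
        # Clear the lowest green bit, then set it to the message bit.
        stego[i] = (r, (g & ~1) | bit, b)
    return stego


def extract(pixels, n_bytes):
    """Read `n_bytes` bytes back from the green-channel LSBs."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for _, g, _ in pixels[8 * i: 8 * i + 8]:
            byte = (byte << 1) | (g & 1)
        out.append(byte)
    return bytes(out)


# Hypothetical 16-pixel cover image, just large enough for a 2-byte message.
cover = [(10, 20, 30)] * 16
stego = embed(cover, b"hi")
recovered = extract(stego, 2)
print(recovered)
```

Because only the lowest bit of the green value changes, each modified pixel differs from the cover by at most one intensity level, which is why measures such as MSE and PSNR are natural choices for evaluating perceptibility.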

Scopus (9)
Crossref (4)
Scopus Clarivate Crossref
Publication Date
Tue Mar 11 2025
Journal Name
International Journal Of Data And Network Science
Multi-objective of wind-driven optimization as feature selection and clustering to enhance text clustering

Text clustering consists of grouping objects of similar categories. The initial centroids influence the operation of the system, which can become trapped in local optima. A second issue is the impact of a huge number of features on the determination of optimal initial centroids. The dimensionality problem may be reduced by feature selection. Therefore, Wind Driven Optimization (WDO) was employed as a feature selection method to remove unimportant words from the text. In addition, the study integrated a novel clustering optimization technique called the WDO (Wasp Swarm Optimization) to effectively determine the most suitable initial centroids. The result showed the new meta-heuristic which is WDO was employed as t...
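The sensitivity to initial centroids can be seen in a small sketch. It assumes plain k-means (Lloyd's algorithm) as the centroid-based clusterer, which the abstract does not name, and it uses a hypothetical four-point dataset; the optimization method from the paper is not reproduced here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Run Lloyd's algorithm from the given initial centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep the old
        # centroid if a cluster ended up empty.
        new_centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Hypothetical data: two tight pairs of points, far apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# A poor start splits the data top/bottom and never recovers.
_, sse_poor = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
# A good start splits the data left/right.
_, sse_good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
print(sse_poor, sse_good)
```

Both runs converge immediately, but the poor start is stuck in a local optimum with a much larger SSE, which is the motivation for searching over initial centroids with a meta-heuristic.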

Crossref (1)
Scopus Crossref
Publication Date
Mon May 15 2017
Journal Name
Journal Of Theoretical And Applied Information Technology
Anomaly detection in text data that represented as a graph using dbscan algorithm

Anomaly detection is still a difficult task. To address this problem, we propose to strengthen the DBSCAN algorithm by converting all data to the graph concept frame (CFG). As is well known, DBSCAN groups data points of the same kind into clusters, while points that fall outside the clusters are treated as noise or anomalies. DBSCAN can therefore detect abnormal points that lie beyond a set threshold from any cluster (extreme values). However, not all anomalies are of this kind, that is, unusual points far from a specific group; some data do not occur repeatedly yet are still considered abnormal relative to the known group. The analysis showed DBSCAN using the...
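The baseline behavior described above, where DBSCAN labels points outside every dense region as noise, can be sketched in a few lines. This is plain DBSCAN on hypothetical 2-D points, not the graph-based variant the abstract proposes; the parameter values are chosen only for the example.

```python
import math
from collections import deque

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE for outliers."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point: keep expanding
    return labels


# Hypothetical data: two dense squares and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The isolated point is flagged as noise because it has too few neighbors within `eps`. Anomalies that sit inside a dense region would not be caught this way, which is the gap the abstract points to.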

Scopus (3)
Scopus
Publication Date
Thu Dec 29 2016
Journal Name
Ibn Al-haitham Journal For Pure And Applied Sciences
Proposal for Exchange Text message Based on Image

Messages are an ancient method of exchanging information between people, and there have been many ways to send them with some degree of security.

Encryption and steganography are among the oldest ways to secure messages, but many problems remain in key generation, key distribution, choosing a suitable cover image, and others. In this paper we present an algorithm for exchanging secure messages without any encryption or cover image for hiding. The proposed algorithm depends on two copies of the same collection image set (CIS), one on the sender side and the other on the receiver side, which always exchange messages between them.

To send a text message, the sender converts the message to ASCII c...

Publication Date
Mon Oct 28 2019
Journal Name
Journal Of Mechanics Of Continua And Mathematical Sciences
Heuristic Initialization And Similarity Integration Based Model for Improving Extractive Multi-Document Summarization

Clarivate Crossref
Publication Date
Sat Jan 02 2021
Journal Name
Journal Of The College Of Languages (jcl)
A Study of Feminist Stylistic Analysis of Language Issues of Gender Representation in Selected Literary text

Stylistics is the analysis of the language of literary texts, integrated within various approaches to create a framework of devices that describe and distinguish a particular work. Feminist stylistics, which relies on theories of feminist criticism, therefore tries to present a counter-image of women both in language use and in society, to draw attention, raise awareness, and change the ways gender is represented. Feminist stylistic analysis is concerned not only with describing sexism in a text, but also with analyzing the way that point of view, agency, metaphor, and transitivity choices are unexpectedly and closely connected to issues of gender (Mills, 1995: 1)...

Crossref
Publication Date
Mon Feb 21 2022
Journal Name
Iraqi Journal For Computer Science And Mathematics
Fuzzy C means Based Evaluation Algorithms For Cancer Gene Expression Data Clustering

The influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences, which places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that can be used to divide data with unknown group structure into clusters; the k-means and fuzzy c-means (FCM) algorithms are examples. Clustering is thus a common approach that divides an input space into several homogeneous zones, and it can be achieved using a variety of algorithms. This study used three models to cluster a brain tumor dataset. The first model uses FCM, whic...
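For readers unfamiliar with FCM, the standard update rules can be sketched on one-dimensional data. The values below are hypothetical stand-ins, not gene expression measurements from the study, and the function names, fuzzifier `m=2.0`, and fixed iteration count are assumptions made for the example.

```python
def memberships_for(data, centers, m):
    """Fuzzy membership of every data point in every cluster."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in data:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([
                1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                for d_k in dists
            ])
    return rows


def fuzzy_c_means(data, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = list(centers)
    for _ in range(n_iter):
        u = memberships_for(data, centers, m)
        # Each center is the mean of the data weighted by membership ** m.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(u, data))
            / sum(row[k] ** m for row in u)
            for k in range(len(centers))
        ]
    return centers, memberships_for(data, centers, m)


# Hypothetical 1-D values forming two groups, around 1 and around 9.
data = [0.8, 1.0, 1.2, 8.8, 9.0, 9.2]
centers, u = fuzzy_c_means(data, centers=[0.0, 10.0])
print(centers)
```

Unlike k-means, every point keeps a graded membership in each cluster, and the memberships of a point sum to one.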

Crossref (1)
Crossref