A Parallel Clustering Analysis Based on Hadoop Multi-Node and Apache Mahout

Conventional clustering algorithms cannot cope with managing and analyzing the rapidly growing volume of data generated from different sources. Parallel clustering is one robust solution to this problem. Apache Hadoop is one of several ecosystems that can store and process data in a distributed and parallel fashion. In this paper, a parallel model is designed to run the k-means clustering algorithm in the Apache Hadoop ecosystem on three connected nodes: one server (name) node and two client (data) nodes. The aim is to reduce the time needed to process a massive healthcare insurance dataset of 11 GB using the machine learning algorithms provided by the Mahout framework. The experimental results show that the proposed model can efficiently process large datasets. The parallel k-means algorithm outperforms the sequential k-means algorithm in execution time: processing the 11 GB dataset takes around 1.847 hours with the parallel algorithm, compared with 68.567 hours with the sequential algorithm (a speedup of roughly 37 times). We conclude that as the number of nodes in the parallel system increases, the computation time of the proposed algorithm decreases.
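
To make the idea of parallel k-means concrete, the following is a minimal sketch of a single k-means iteration written in a map/combine/reduce style. It is not the authors' code and not Mahout's implementation; it runs in one Python process and only imitates how data splits could be processed independently on data nodes. All function names (`map_assign`, `combine`, `kmeans_iteration`) and the toy data are assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Point = Tuple[float, ...]
Partial = Tuple[Point, int]  # (coordinate sum, point count)


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(points: Iterable[Point],
               centroids: Sequence[Point]) -> Iterator[Tuple[int, Partial]]:
    """Map phase: emit (index of nearest centroid, (point, 1)) for each point."""
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        yield nearest, (p, 1)


def combine(pairs: Iterable[Tuple[int, Partial]]) -> Dict[int, Partial]:
    """Combine/reduce phase: add up coordinate sums and counts per cluster."""
    sums: Dict[int, Partial] = {}
    for cid, (vec, n) in pairs:
        if cid in sums:
            old_vec, old_n = sums[cid]
            sums[cid] = (tuple(a + b for a, b in zip(old_vec, vec)), old_n + n)
        else:
            sums[cid] = (vec, n)
    return sums


def kmeans_iteration(splits: Sequence[Sequence[Point]],
                     centroids: Sequence[Point]) -> List[Point]:
    """One k-means iteration over several data splits.

    Each split stands in for the block of data held by one data node
    (an assumption of this sketch); here the splits are processed in a loop.
    """
    # Each split produces small partial sums instead of shipping raw points.
    partials = [combine(map_assign(split, centroids)) for split in splits]
    # Merge the partial sums from all splits.
    merged = combine((cid, val) for part in partials for cid, val in part.items())
    # Clusters that received no points keep their previous centroid.
    new_centroids = list(centroids)
    for cid, (vec, n) in merged.items():
        new_centroids[cid] = tuple(v / n for v in vec)
    return new_centroids
```

In this sketch only per-cluster sums and counts cross the boundary between splits, which is the usual reason a combiner step helps when the points themselves are too large to move. A full run would repeat `kmeans_iteration` until the centroids stop changing or an iteration limit is reached.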

Publication Date
Fri Aug 05 2016
Journal Name
Wireless Communications And Mobile Computing
A comparison study on node clustering techniques used in target tracking WSNs for efficient data aggregation

Wireless sensor applications are susceptible to energy constraints. Most of the energy is consumed in communication between wireless nodes. Clustering and data aggregation are the two widely used strategies for reducing energy usage and increasing the lifetime of wireless sensor networks. In target tracking applications, a large amount of redundant data is produced regularly. Hence, deploying effective data aggregation schemes is vital to eliminate data redundancy. This work aims to conduct a comparative study of various research approaches that employ clustering techniques for efficiently aggregating data in target tracking applications, as selection of an appropriate clustering algorithm may reflect positive results in the data aggregati

Publication Date
Mon Aug 01 2022
Journal Name
Baghdad Science Journal
New and Existing Approaches Reviewing of Big Data Analysis with Hadoop Tools

Nearly everyone is connected to social media (Facebook, Twitter, LinkedIn, Instagram, etc.), which generates a large quantity of data that traditional applications are inadequate to process. Social media are regarded as an important platform for sharing information, opinions, and knowledge among many subscribers. Big data at this scale also raises many issues, such as data collection, storage, moving, updating, reviewing, posting, scanning, visualization, and data protection. To deal with all these problems, there is a need for an adequate system that not only prepares the details but also provides meaningful analysis to take advantage of difficult situations relevant to business, proper decision, Health, social media, sc

Publication Date
Tue Nov 19 2024
Journal Name
International Journal Of Data And Network Science
Multi-objective of wind-driven optimization as feature selection and clustering to enhance text clustering

Text clustering consists of grouping objects of similar categories. The choice of initial centroids influences the operation of the system, which can become trapped in local optima. A second issue is the impact of a huge number of features on the determination of optimal initial centroids. The dimensionality problem may be reduced by feature selection. Therefore, Wind Driven Optimization (WDO) was employed as a feature selection method to remove unimportant words from the text. In addition, the current study has integrated a novel clustering optimization technique called the WDO (Wasp Swarm Optimization) to effectively determine the most suitable initial centroids. The result showed the new meta-heuristic which is WDO was employed as t

Publication Date
Sun Jul 04 2010
Journal Name
Journal Of The Faculty Of Medicine Baghdad
CT Image Segmentation Based on clustering Methods.

Background: Image processing of medical images is a major method for increasing the reliability of cancer diagnosis.
Methods: The proposed system proceeds in two stages. The first is an enhancement stage, performed using a median filter to reduce the noise and artifacts present in a CT image of a human lung with cancer. The second is the implementation of the k-means clustering algorithm.
Results: The result image of the k-means algorithm was compared with the image produced by the fuzzy c-means (FCM) algorithm.
Conclusion: We found that the time required for the k-means algorithm is less than that of the FCM algorithm. MATLAB package (version 7.3) was used in writing the programming code of our w

Publication Date
Sat Nov 01 2008
Journal Name
Digital Signal Processing
A high performance parallel Radon based OFDM transceiver design and simulation

A major goal of next-generation wireless communication systems is the development of a reliable high-speed wireless communication system that supports high user mobility. Such systems must focus on increasing the link throughput and the network capacity. In this paper, a novel, spectrally efficient system is proposed for generating and transmitting two-dimensional (2-D) orthogonal frequency division multiplexing (OFDM) symbols through a 2-D inter-symbol interference (ISI) channel. Instead of conventional data mapping techniques, the discrete finite Radon transform (FRAT) is used as a data mapping technique due to the increased orthogonality offered. As a result, the proposed structure gives a significant improvement in bit error rate (BER) performance. Th

Publication Date
Fri Jan 20 2023
Journal Name
Ibn Al-haitham Journal For Pure And Applied Sciences
Estimation of a Parallel Stress-strength Model Based on the Inverse Kumaraswamy Distribution

   

The reliability of the stress-strength model has attracted many statisticians for several years owing to its applicability in diverse fields such as engineering, quality control, and economics. In this paper, the system reliability in a stress-strength model containing Kth parallel components is estimated by four types of shrinkage methods: the Constant Shrinkage Estimation Method, the Shrinkage Function Estimator, the Modified Thompson Type Shrinkage Estimator, and the Squared Shrinkage Estimator. A Monte Carlo simulation study compares the proposed estimators using the mean squared error. The analysis of the results showed that the shrinkage function estimator was the best since

Publication Date
Sun Apr 23 2017
Journal Name
International Conference Of Reliable Information And Communication Technology
Classification of Arabic Writer Based on Clustering Techniques

Arabic text categorization for pattern recognition is challenging. We propose, for the first time, a novel holistic method based on clustering for classifying Arabic writers. The categorization is accomplished stage-wise. Firstly, the document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from the sectioned portions. Thirdly, the F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measure and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generatio

Publication Date
Sun Dec 01 2013
Journal Name
Diyala Journal Of Engineering Sciences
Design and Simulation of parallel CDMA System Based on 3D-Hadamard Transform

Future wireless systems aim to provide higher transmission data rates, improved spectral efficiency, and greater capacity. In this paper, a spectrally efficient two-dimensional (2-D) parallel code division multiple access (CDMA) system is proposed for generating and transmitting 2-D CDMA symbols through a 2-D inter-symbol interference (ISI) channel to increase the transmission speed. The 3D-Hadamard matrix is used to generate the 2-D spreading codes required to spread the two-dimensional data for each user row-wise and column-wise. Quadrature amplitude modulation (QAM) is used as a data mapping technique due to the increased spectral efficiency offered. The new structure was simulated using MATLAB and a comparison of performance for ser

Publication Date
Sat Jan 01 2022
Journal Name
Ssrn Electronic Journal
Publication Date
Mon Jan 10 2022
Journal Name
Iraqi Journal Of Science
Genetic Algorithm based Clustering for Intrusion Detection

Clustering algorithms have recently gained attention in the related literature since they can help current intrusion detection systems in several aspects. This paper proposes genetic algorithm (GA) based clustering, which serves to distinguish patterns arriving in network traffic packets as normal or attack. Two GA-based clustering models for solving the intrusion detection problem are introduced. The first model handles the numeric features of the network packet, whereas the second one concerns all features of the network packet. Moreover, a new mutation operator directed at binary and symbolic features is proposed. The basic concept of the proposed mutation operator depends on the most frequent value
