Article - ijs-2747 - Digital Repository

Details

Publication Date

Sat Jul 31 2021

Journal Name

Iraqi Journal Of Science

DOI

10.24996/ijs.2021.62.7.32

Keywords

Big Data

Hadoop

Mahout

Predictive Analytics

Parallel K-means

Statistics

Abstract Views

217

Galley Views

256

Authors (2)

Noor S.

Suhad A.

A Parallel Clustering Analysis Based on Hadoop Multi-Node and Apache Mahout

Conventional clustering algorithms cannot cope with the rapid growth of data generated from different sources. Parallel clustering is one robust solution to this problem. The Apache Hadoop architecture is an ecosystem that provides the capability to store and process data in a distributed and parallel fashion. In this paper, a parallel model is designed to run the k-means clustering algorithm in the Apache Hadoop ecosystem on three connected nodes: one server (name) node and two client (data) nodes. The aim is to reduce the time needed to process a massive healthcare insurance dataset of 11 GB, using the machine learning algorithms provided by the Mahout framework. The experimental results show that the proposed model can process large datasets efficiently. The parallel k-means algorithm outperforms the sequential k-means algorithm in execution time: processing the 11 GB dataset takes around 1.847 hours with the parallel k-means algorithm, compared with 68.567 hours with the sequential k-means algorithm. As a result, we deduce that as the number of nodes in the parallel system increases, the computation time of the proposed algorithm decreases.
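As a rough illustration of how k-means splits into parallel work, the sketch below runs one map step per data partition (standing in for a data node) and one reduce step that merges the partial results into new centroids. It is a single-process simulation in plain Python under stated assumptions, not the paper's implementation and not Mahout's API; all function names (`map_partition`, `reduce_partials`, `kmeans`, `speedup`) are hypothetical. The `speedup` helper simply divides the two execution times reported in the abstract, which gives a ratio of roughly 37.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
Partial = Dict[int, Tuple[List[float], int]]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_partition(points: Sequence[Point], centroids: Sequence[Point]) -> Partial:
    """Map step for one partition: per-cluster coordinate sums and counts."""
    partial: Partial = {}
    for point in points:
        idx = nearest(point, centroids)
        sums, count = partial.get(idx, ([0.0] * len(point), 0))
        partial[idx] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial


def reduce_partials(partials: Sequence[Partial], centroids: Sequence[Point]) -> List[Point]:
    """Reduce step: merge partial sums and compute the new centroids."""
    totals: Partial = {}
    for partial in partials:
        for idx, (sums, count) in partial.items():
            acc, acc_count = totals.get(idx, ([0.0] * len(sums), 0))
            totals[idx] = ([a + s for a, s in zip(acc, sums)], acc_count + count)
    new_centroids: List[Point] = []
    for i, old in enumerate(centroids):
        if i in totals:
            sums, count = totals[i]
            new_centroids.append(tuple(s / count for s in sums))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids


def kmeans(partitions: Sequence[Sequence[Point]], centroids: Sequence[Point],
           iterations: int = 10) -> List[Point]:
    """Run a fixed number of map/reduce rounds over the partitions."""
    current = list(centroids)
    for _ in range(iterations):
        # On a real cluster each map_partition call would run on its own node.
        partials = [map_partition(part, current) for part in partitions]
        current = reduce_partials(partials, current)
    return current


def speedup(sequential_hours: float, parallel_hours: float) -> float:
    """Ratio of sequential to parallel execution time."""
    return sequential_hours / parallel_hours
```

Only the per-cluster sums and counts travel from the map step to the reduce step, so the data sent between nodes grows with the number of clusters rather than with the number of points.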

Publication Date

Fri Aug 05 2016

Journal Name

Wireless Communications And Mobile Computing

A comparison study on node clustering techniques used in target tracking WSNs for efficient data aggregation

Wireless sensor applications are susceptible to energy constraints. Most of the energy is consumed in communication between wireless nodes. Clustering and data aggregation are two widely used strategies for reducing energy usage and increasing the lifetime of wireless sensor networks. In target tracking applications, a large amount of redundant data is produced regularly. Hence, deploying effective data aggregation schemes is vital to eliminate data redundancy. This work aims to conduct a comparative study of various research approaches that employ clustering techniques for efficiently aggregating data in target tracking applications as selection of an appropriate clustering algorithm may reflect positive results in the data aggregati

Authors (7)

Omar Adil

Ainuddin Wahid

Mohd. Yamani

Publication Date

Mon Aug 01 2022

Journal Name

Baghdad Science Journal

New and Existing Approaches Reviewing of Big Data Analysis with Hadoop Tools

Everybody is connected through social media (Facebook, Twitter, LinkedIn, Instagram, etc.), which generate a large quantity of data that traditional applications are inadequate to process. Social media are regarded as an important platform for sharing the information, opinions, and knowledge of many subscribers. Big Data arising from these media also raises many issues, such as data collection, storage, moving, updating, reviewing, posting, scanning, visualization, data protection, etc. To deal with all these problems, there is a need for an adequate system that not only prepares the details but also provides meaningful analysis to take advantage of the difficult situations, relevant to business, proper decision, Health, social media, sc

Authors (2)

Watheq Ghanim

Abbas Fadhil

Publication Date

Tue Nov 19 2024

Journal Name

International Journal Of Data And Network Science

Multi-objective of wind-driven optimization as feature selection and clustering to enhance text clustering

Text clustering consists of grouping objects of similar categories. The initial centroids influence the operation of the system, which may become trapped in local optima. A second issue is the impact of a huge number of features on the determination of optimal initial centroids. The dimensionality problem may be reduced by feature selection. Therefore, Wind Driven Optimization (WDO) was employed as feature selection to remove unimportant words from the text. In addition, the current study has integrated a novel clustering optimization technique called the WDO (Wasp Swarm Optimization) to effectively determine the most suitable initial centroids. The result showed the new meta-heuristic which is WDO was employed as t

Authors (3)

MEHDI G. DUAIMI

Bsoul,Q.

AL-Gburi, A.

Publication Date

Sun Jul 04 2010

Journal Name

Journal Of The Faculty Of Medicine Baghdad

CT Image Segmentation Based on clustering Methods.

Background: Image processing of medical images is a major method for increasing the reliability of cancer diagnosis.
Methods: The proposed system proceeded in two stages: first, an enhancement stage, performed using a median filter to reduce the noise and artifacts present in a CT image of a human lung with cancer; second, implementation of the k-means clustering algorithm.
Results: The result image of the k-means algorithm was compared with the image resulting from implementation of the fuzzy c-means (FCM) algorithm.
Conclusion: We found that the time required for the k-means algorithm implementation is less than that of the FCM algorithm. MATLAB package (version 7.3) was used in writing the programming code of our w

Authors (2)

Rand K.

Asmaa A.

Publication Date

Sat Nov 01 2008

Journal Name

Digital Signal Processing

A high performance parallel Radon based OFDM transceiver design and simulation

A major goal of next-generation wireless communication systems is the development of a reliable high-speed wireless communication system that supports high user mobility. They must focus on increasing the link throughput and the network capacity. In this paper, a novel, spectrally efficient system is proposed for generating and transmitting two-dimensional (2-D) orthogonal frequency division multiplexing (OFDM) symbols through a 2-D inter-symbol interference (ISI) channel. Instead of conventional data mapping techniques, the discrete finite Radon transform (FRAT) is used as a data mapping technique due to the increased orthogonality offered. As a result, the proposed structure gives a significant improvement in bit error rate (BER) performance. Th

Authors (4)

Waleed

Abbas Hasan

Sulaiman M.

Publication Date

Fri Jan 20 2023

Journal Name

Ibn Al-haitham Journal For Pure And Applied Sciences

Estimation of a Parallel Stress-strength Model Based on the Inverse Kumaraswamy Distribution

The reliability of the stress-strength model has attracted many statisticians for several years owing to its applicability in diverse areas such as engineering, quality control, and economics. In this paper, the system reliability in the stress-strength model containing Kth parallel components is estimated by four types of shrinkage methods: the Constant Shrinkage Estimation Method, the Shrinkage Function Estimator, the Modified Thompson Type Shrinkage Estimator, and the Squared Shrinkage Estimator. A Monte Carlo simulation study compares the proposed estimators using the mean squared error. The result analyses of the shrinkage estimation methods showed that the shrinkage functions estimator was the best since

Authors (4)

Bayda A.

Bsma

Abbas N. Salman

Publication Date

Sun Apr 23 2017

Journal Name

International Conference Of Reliable Information And Communication Technology

Classification of Arabic Writer Based on Clustering Techniques

Arabic text categorization for pattern recognition is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writers. The categorization is accomplished stage-wise. Firstly, the document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from the sectioned portions. Thirdly, F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measure and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generatio

Authors (1)

Mohammed

Publication Date

Sun Dec 01 2013

Journal Name

Diyala Journal Of Engineering Sciences

Design and Simulation of parallel CDMA System Based on 3D-Hadamard Transform

Future wireless systems aim to provide higher transmission data rates, improved spectral efficiency, and greater capacity. In this paper, a spectrally efficient two-dimensional (2-D) parallel code division multiple access (CDMA) system is proposed for generating and transmitting 2-D CDMA symbols through a 2-D inter-symbol interference (ISI) channel to increase the transmission speed. The 3D-Hadamard matrix is used to generate the 2-D spreading codes required to spread the two-dimensional data for each user row-wise and column-wise. Quadrature amplitude modulation (QAM) is used as a data mapping technique due to the increased spectral efficiency offered. The new structure is simulated using MATLAB and a comparison of performance for ser

Authors (1)

Ali

Publication Date

Sat Jan 01 2022

Journal Name

Ssrn Electronic Journal

Developing a Predictive Model and Multi-Objective Optimization of a Photovoltaic/Thermal System Based on Energy and Exergy Analysis Using Response Surface Methodology

Authors (5)

Koorosh

Hayder I.

Jasim M.

Publication Date

Mon Jan 10 2022

Journal Name

Iraqi Journal Of Science

Genetic Algorithm based Clustering for Intrusion Detection

Clustering algorithms have recently gained attention in the related literature since they can help current intrusion detection systems in several aspects. This paper proposes genetic algorithm (GA) based clustering, serving to distinguish patterns incoming from network traffic packets into normal and attack. Two GA-based clustering models for solving the intrusion detection problem are introduced. The first model coined as handles numeric features of the network packet, whereas the second one coined as concerns all features of the network packet. Moreover, a new mutation operator directed at binary and symbolic features is proposed. The basic concept of the proposed mutation operator depends on the most frequent value

Authors (2)

Noor

Sarab M.
