Conventional clustering algorithms cannot cope with managing and analyzing the rapidly growing volume of data generated from different sources. Parallel clustering is one robust solution to this problem. The Apache Hadoop architecture is one of several ecosystems that can store and process data in a distributed, parallel fashion. In this paper, a parallel model is designed to run the k-means clustering algorithm in the Apache Hadoop ecosystem on three connected nodes: one server (name) node and two client (data) nodes. The aim is to speed up the processing of a massive healthcare insurance dataset of 11 GB using the machine learning algorithms provided by the Mahout framework. The experimental results show that the proposed model can process large datasets efficiently. The parallel k-means algorithm outperforms the sequential k-means algorithm in execution time: processing the 11 GB dataset takes around 1.847 hours with the parallel algorithm, compared with 68.567 hours with the sequential algorithm. We therefore conclude that the computation time of the proposed algorithm decreases as the number of nodes in the parallel system increases.
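One k-means iteration in a parallel setting can be pictured as a map step (each node assigns its own points to the nearest centroid and emits partial sums) followed by a reduce step (the partial sums are merged into new centroids). The following is a minimal single-process sketch in plain Python, not the Hadoop or Mahout implementation used in the paper. The function names `nearest`, `map_phase`, `reduce_phase` and `kmeans` are hypothetical, and each list in `chunks` is assumed to stand in for the data held by one data node.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(chunk, centroids):
    """Mapper (assumed role of one data node): per-cluster partial sums and counts."""
    partial = {}
    for point in chunk:
        idx = nearest(point, centroids)
        sums, count = partial.get(idx, ([0.0] * len(point), 0))
        partial[idx] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial


def reduce_phase(partials, centroids):
    """Reducer: merge the partial sums from all mappers into new centroids."""
    totals = {}
    for partial in partials:
        for idx, (sums, count) in partial.items():
            acc, n = totals.get(idx, ([0.0] * len(sums), 0))
            totals[idx] = ([a + s for a, s in zip(acc, sums)], n + count)
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / totals[i][1] for s in totals[i][0]] if i in totals else list(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans(chunks, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds over the data chunks."""
    for _ in range(iterations):
        partials = [map_phase(chunk, centroids) for chunk in chunks]
        centroids = reduce_phase(partials, centroids)
    return centroids
```

Only the per-cluster sums and counts cross the node boundary in this sketch, which is why the map step can run independently on each chunk. For the timings reported in the abstract, 68.567 hours against 1.847 hours corresponds to a reduction of roughly 37 times.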
Objective: This research investigates real breast cancer data for Iraqi women, acquired manually from several Iraqi hospitals for the early detection of breast cancer. Data mining techniques are used to discover hidden knowledge, unexpected patterns, and new rules in the dataset, which has a large number of attributes. Methods: Data mining techniques handle redundant or simply irrelevant attributes to discover interesting patterns. The dataset is processed with the Weka (Waikato Environment for Knowledge Analysis) platform. The OneR technique is used as a machine learning classifier to evaluate the worth of each attribute according to the class value. Results: The evaluation is performed using
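The idea behind OneR, as used above to rank attributes, can be sketched in a few lines: for each attribute, build a one-level rule that maps every attribute value to its majority class, then prefer the attribute whose rule makes the fewest errors. The sketch below is plain Python and is not Weka's implementation. The function name `one_r` and the dictionary-per-row data layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def one_r(rows, attributes, target):
    """Return (best_attribute, rule, errors) for the one-rule classifier.

    rows: list of dicts mapping attribute names to nominal values (assumed layout).
    rule: maps each value of the best attribute to its majority class.
    """
    best = None
    for attr in attributes:
        # Count class frequencies for every value of this attribute.
        by_value = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[target]] += 1
        rule = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        # Errors are the rows that do not belong to the majority class of their value.
        errors = sum(sum(c.values()) - max(c.values()) for c in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best
```

Ranking all attributes by their error count, instead of keeping only the best one, gives a simple per-attribute worth score in the spirit of the evaluation described above.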
Machine learning-based techniques are widely used to classify images into various categories. Advances in Convolutional Neural Networks (CNNs) have affected the field of computer vision on a large scale. CNNs have been applied to classify and localize objects in images. Among their fields of application, CNNs have been applied to understand the huge volume of unstructured astronomical data being collected every second. Galaxies have diverse and complex shapes, and their morphology carries fundamental information about the whole universe. Studying these galaxies has been a tremendous task for researchers around the world. Researchers have already applied some basic CNN models to predict the morphological classes
In this paper, we define a cubic positive implicative-ideal, a cubic implicative-ideal and a cubic commutative-ideal of a semigroup in KU-algebra as a generalization of a fuzzy positive implicative-ideal, a fuzzy implicative-ideal and a fuzzy commutative-ideal of a semigroup in KU-algebra. Some relations between these types of cubic ideals are discussed. Also, some important properties of these ideals are studied. Finally, some important theories are discussed. It is proved that every cubic commutative-ideal, cubic positive implicative-ideal, and cubic implicative-ideal is a cubic ideal, but not conversely. Also, we show that if Θ is a cubic positive implicative-ideal and a cubic commutative-ideal, then Θ is a cubic implicative-ideal. Some exam
Multivariate non-parametric control charts were used to monitor data generated by simulation and to determine whether they are within the control limits, since non-parametric methods do not require any assumptions about the distribution of the data. This research aims to apply multivariate non-parametric quality control methods, namely Multivariate Wilcoxon Signed-Rank ( ), kernel principal component analysis (KPCA) and k-nearest neighbor ( −
This paper discusses estimating the two scale parameters of the Exponential-Rayleigh distribution for singly type-one censored data, which is one of the most important types of right-censored data. The maximum likelihood estimation method (MLEM), one of the most popular and widely used classical methods, is applied with an iterative procedure such as Newton-Raphson to find estimates of these two scale parameters, using real COVID-19 data taken from the Iraqi Ministry of Health and Environment, AL-Karkh General Hospital. The study ran from 4/5/2020 until 31/8/2020, equivalent to 120 days, and the sample consisted of the (n=785) patients who entered the hospital under study. The number o
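A Newton-Raphson iteration for a two-parameter likelihood under type-one censoring can be sketched as follows. The abstract does not give the density, so this sketch assumes a hazard of the form h(t) = a + b·t, that is, a survival function exp(−(a·t + b·t²/2)), which may differ from the paper's parameterization. The function names, the starting values and the step-halving safeguard are likewise assumptions. Observed failure times are `times`, and `n_censored` subjects are censored at the fixed time `t_cens`.

```python
import math


def log_likelihood(a, b, times, n_censored, t_cens):
    """Log-likelihood for the assumed hazard h(t) = a + b*t with type-one censoring."""
    if any(a + b * t <= 0 for t in times):
        return -math.inf  # outside the region where the hazard is positive
    ll = sum(math.log(a + b * t) - (a * t + b * t * t / 2) for t in times)
    return ll - n_censored * (a * t_cens + b * t_cens ** 2 / 2)


def gradient(a, b, times, n_censored, t_cens):
    """Partial derivatives of the log-likelihood with respect to a and b."""
    g_a = sum(1 / (a + b * t) - t for t in times) - n_censored * t_cens
    g_b = sum(t / (a + b * t) - t * t / 2 for t in times) - n_censored * t_cens ** 2 / 2
    return g_a, g_b


def newton_raphson(times, n_censored, t_cens, a=0.1, b=0.1, tol=1e-9, max_iter=100):
    """Maximize the log-likelihood with Newton steps and step halving."""
    for _ in range(max_iter):
        g_a, g_b = gradient(a, b, times, n_censored, t_cens)
        if max(abs(g_a), abs(g_b)) < tol:
            break
        # Entries of the 2x2 Hessian (all negative: the log-likelihood is concave).
        h_aa = -sum(1 / (a + b * t) ** 2 for t in times)
        h_ab = -sum(t / (a + b * t) ** 2 for t in times)
        h_bb = -sum(t * t / (a + b * t) ** 2 for t in times)
        det = h_aa * h_bb - h_ab ** 2
        # Newton direction: minus the inverse Hessian times the gradient.
        d_a = -(h_bb * g_a - h_ab * g_b) / det
        d_b = -(h_aa * g_b - h_ab * g_a) / det
        # Halve the step while it lowers the log-likelihood or leaves the valid region.
        current = log_likelihood(a, b, times, n_censored, t_cens)
        step = 1.0
        while (
            log_likelihood(a + step * d_a, b + step * d_b, times, n_censored, t_cens)
            < current - 1e-12
            and step > 1e-10
        ):
            step /= 2
        a, b = a + step * d_a, b + step * d_b
    return a, b
```

The step-halving loop is a safeguard rather than part of the textbook Newton-Raphson method: it keeps each iterate inside the region where the assumed hazard stays positive at every observed time.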
In this research, the robust (M) estimators for the cubic smoothing splines technique, which avoid the problem of abnormality in the data or contamination of the error, are compared with the traditional estimation method for the cubic smoothing splines technique. Two comparison criteria (MADE, WASE) are used for different sample sizes and disparity levels to estimate the time-varying coefficient functions for balanced longitudinal data, which are characterized by observations obtained from (n) independent subjects, each of which is measured repeatedly at a set of specific time points (m), since the repeated measurements within the subjects are almost connected an
This paper presents a method for organizing memory chips when they are used to build memory systems whose word size is wider than 8 bits. Most memory chips have an 8-bit word size. When the memory system has to be built from several memory chips of various sizes, this method gives all possible organizations of these chips in the memory system. This paper also suggests a precise definition of the term “memory bank”, which is commonly used in memory systems. Finally, an illustrative design problem is worked through to demonstrate the presented method in practice.
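The basic arithmetic of building a wide word from 8-bit chips can be illustrated with a small sketch. It is not the paper's method and does not enumerate all possible organizations. It assumes, as a working definition only, that a bank is a group of equally deep 8-bit chips placed side by side to supply one full word, and the function name `organize` is hypothetical.

```python
from collections import Counter

CHIP_WIDTH_BITS = 8  # word size of each chip, as stated in the abstract


def organize(chip_depths, word_bits):
    """Group 8-bit chips into banks that each supply one full word.

    chip_depths: depth (number of 8-bit words) of every available chip.
    Returns (banks, leftover): the depth of each bank formed, and the
    depths of chips that could not complete a bank.
    """
    if word_bits % CHIP_WIDTH_BITS:
        raise ValueError("word size must be a multiple of 8 bits")
    chips_per_bank = word_bits // CHIP_WIDTH_BITS
    banks, leftover = [], []
    # Only chips of equal depth are combined, so every bank is fully populated.
    for depth, count in sorted(Counter(chip_depths).items()):
        full, spare = divmod(count, chips_per_bank)
        banks.extend([depth] * full)
        leftover.extend([depth] * spare)
    return banks, leftover
```

For example, with four 1024-word chips and five 4096-word chips, a 32-bit system needs four chips per bank, so the sketch forms one 1024-word bank and one 4096-word bank and leaves one 4096-word chip unused.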
This article estimates and derives the three parameters, two scale parameters and one shape parameter, of a new mixture distribution for singly type-one censored data, which is a branch of right-censored sampling. Some special mathematical and statistical properties are then defined for this new mixture distribution, which is a continuous distribution characterized by its flexibility. Next, the maximum likelihood estimation method for singly type-one censored data, based on the Newton-Raphson matrix procedure, is used to estimate the values of these three parameters from real data taken from the National Center for Research and Treatment of Hematology/University of Mus
In this paper, we used four classification methods to classify objects and compared them: K-Nearest Neighbors (KNN), Stochastic Gradient Descent learning (SGD), Logistic Regression (LR), and Multi-Layer Perceptron (MLP). We used the MCOCO dataset for object classification and detection; the dataset images were randomly divided into training and testing sets at a ratio of 7:3, respectively. The randomly selected training and testing images were converted from color to gray level, enhanced using the histogram equalization method, and resized to 20 x 20. Principal component analysis (PCA) was used for feature extraction, and finally apply four classification metho
These days, it is crucial to discern between different types of human behavior, and artificial intelligence techniques play a big part in that. The characteristics of the feedforward artificial neural network (FANN) algorithm and the genetic algorithm have been combined to create an important working mechanism that aids in this field. The proposed system can be used for essential tasks in life, such as analysis, automation, control, recognition, and other tasks. Crossover and mutation are the two primary mechanisms used by the genetic algorithm in the proposed system to replace the back propagation process in ANN. While the feedforward artificial neural network technique is focused on input processing, this should be based on the proce