A Parallel Clustering Analysis Based on Hadoop Multi-Node and Apache Mahout

Conventional clustering procedures cannot cope with managing and analyzing the rapidly growing volume of data generated from different sources. Parallel clustering is one robust solution to this problem. The Apache Hadoop architecture is one of several ecosystems that can store and process data in a distributed and parallel fashion. In this paper, a parallel model is designed to run the k-means clustering algorithm in the Apache Hadoop ecosystem by connecting three nodes: one serves as the server (name) node and the other two as client (data) nodes. The aim is to reduce the time needed to manage a massive healthcare insurance dataset of 11 GB, using the machine learning algorithms provided by the Mahout framework. The experimental results show that the proposed model can process large datasets efficiently. The parallel k-means algorithm outperforms the sequential k-means algorithm in execution time: processing the 11 GB dataset takes around 1.847 hours with the parallel algorithm, compared with 68.567 hours with the sequential algorithm. As a result, we deduce that as the number of nodes in the parallel system increases, the computation time of the proposed algorithm decreases.
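
The abstract gives no implementation details, so the following is only a minimal sketch of how one k-means iteration can be split into a map step (assign each point to its nearest centroid) and a reduce step (average the points assigned to each centroid). It assumes small in-memory Python lists rather than Hadoop or Mahout; the function names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


def map_assign(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_update(pairs, centroids):
    """Reduce step: average the points assigned to each centroid."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    updated = list(centroids)  # a centroid with no points keeps its old position
    for idx, members in groups.items():
        updated[idx] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return updated


def kmeans(partitions, centroids, iterations=10):
    """Run k-means over data split into partitions (stand-ins for data nodes)."""
    for _ in range(iterations):
        pairs = [pair for part in partitions for pair in map_assign(part, centroids)]
        centroids = reduce_update(pairs, centroids)
    return centroids


# Two toy partitions, as if held by two data nodes (hypothetical values).
partitions = [[(1.0, 1.0), (1.0, 2.0)], [(9.0, 9.0), (10.0, 9.0)]]
final_centroids = kmeans(partitions, [(0.0, 0.0), (5.0, 5.0)])

# Ratio of the run times (in hours) reported in the abstract for the 11 GB dataset.
speedup = 68.567 / 1.847
```

In a real cluster the map step would run independently on each data node and only the per-centroid sums and counts would travel to the reducer. The ratio of the two reported run times corresponds to a speedup of roughly 37.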

Publication Date
Mon May 11 2020
Journal Name
Baghdad Science Journal
Proposing Robust LAD-Atan Penalty of Regression Model Estimation for High Dimensional Data

Penalized regression models have received considerable attention for variable selection, and they play an essential role in dealing with high-dimensional data. The arctangent (Atan) penalty has recently been used as an efficient method for both estimation and variable selection. However, the Atan penalty is very sensitive to outliers in the response variable and to heavy-tailed error distributions, whereas the least absolute deviation (LAD) method gives robust regression estimates. The specific objective of this research is to propose a robust Atan estimator by combining these two ideas. Simulation experiments and real data applications show that the p...
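
To make the combination concrete, here is a minimal sketch of an LAD loss plus an Atan penalty. It assumes the commonly cited form of the Atan penalty, λ(γ + 2/π)·arctan(|β|/γ), and an unscaled sum of absolute residuals; the exact objective and scaling used in the paper may differ, and the function names are hypothetical.

```python
from math import atan, pi


def atan_penalty(beta_j, lam, gamma):
    """Atan penalty for one coefficient (assumed form, see lead-in)."""
    return lam * (gamma + 2.0 / pi) * atan(abs(beta_j) / gamma)


def lad_atan_objective(X, y, beta, lam, gamma):
    """Sum of absolute residuals plus the Atan penalty over all coefficients."""
    lad_loss = 0.0
    for row, target in zip(X, y):
        fitted = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        lad_loss += abs(target - fitted)  # absolute, not squared, residual
    penalty = sum(atan_penalty(b_j, lam, gamma) for b_j in beta)
    return lad_loss + penalty


# Hypothetical toy data: the second coefficient is zero and adds no penalty.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.0]
value = lad_atan_objective(X, y, beta=[2.0, 0.0], lam=1.0, gamma=1.0)
```

Because the loss grows linearly rather than quadratically in each residual, a single outlying response contributes far less to the objective than it would under least squares, which is the robustness the abstract refers to.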
Publication Date
Sun Oct 01 2017
Journal Name
Diyala Journal For Pure Science
Employing difference technique in some Liu estimators to semiparametric regression model

Semiparametric methods combine parametric and nonparametric methods. They are important in many studies whose nature calls for more advanced procedures for accurate statistical analysis aimed at obtaining efficient estimators. The partial linear regression model is considered the most popular type of semiparametric model; it consists of a parametric component and a nonparametric component. Estimating the parametric component with certain properties depends on the assumptions made about that component. When these assumptions do not hold, the parametric component suffers from several problems, for example multicollinearity (the explanatory variables are interrelated). To treat this problem we use...
Publication Date
Sun May 17 2020
Journal Name
Iraqi Journal Of Science
Multicomponent Inverse Lomax Stress-Strength Reliability

In this article we derive two mathematical reliability expressions for two kinds of s-out-of-k stress-strength model systems; and . Both stress and strength are assumed to follow an Inverse Lomax distribution with unknown shape parameters and a common known scale parameter. The increase and decrease in the true values of the two reliabilities are studied as the distribution parameters increase and decrease. Two estimation methods, maximum likelihood and regression, are used to estimate the distribution parameters and the reliabilities. The estimators are compared in a simulation study using the mean squared error criterion, which showed that the maximum likelihood estimator performs best.
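
The two expressions derived in the article are not reproduced in this abstract. As a hedged illustration only, the sketch below evaluates one common s-out-of-k formulation: k independent strengths share one shape parameter, a single common stress has another, and the system works when at least s strengths exceed the stress. It assumes the Inverse Lomax CDF F(x) = (1 + β/x)^(-shape), under which the common scale β cancels and the reliability depends only on the ratio of the two shapes. The function and parameter names are hypothetical.

```python
from math import comb, gamma


def beta_fn(a, b):
    """Beta function B(a, b) expressed through the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)


def reliability_s_out_of_k(s, k, alpha, theta):
    """P(at least s of k strengths exceed the common stress).

    alpha: shape of the strength distribution (assumed naming)
    theta: shape of the stress distribution (assumed naming)
    """
    nu = theta / alpha  # the common scale parameter cancels out
    return sum(
        comb(k, i) * nu * beta_fn(k - i + nu, i + 1)
        for i in range(s, k + 1)
    )


# Single-component case: reduces to alpha / (alpha + theta).
single = reliability_s_out_of_k(1, 1, alpha=2.0, theta=3.0)

# A 2-out-of-3 system with hypothetical shape values.
system = reliability_s_out_of_k(2, 3, alpha=4.0, theta=1.0)
```

Under these assumptions the reliability rises as the strength shape grows relative to the stress shape, which matches the kind of parameter sensitivity the abstract describes.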

Publication Date
Fri Mar 01 2013
Journal Name
Journal Of Economics And Administrative Sciences
Stability testing of time series data for CT Large industrial establishments in Iraq

Cointegration is an important concept in applied macroeconomics. The idea is due to Granger (1981) and was explained in detail by Engle and Granger in Econometrica (1987). The introduction of cointegration analysis into econometrics in the mid-1980s is one of the most important developments in the empirical approach to modeling. Its advantage is that it is simple to compute, and using it requires only familiarity with ordinary least squares.

Cointegration concerns long-run equilibrium relations between time series, even if it contained all the sequences on t...
Publication Date
Thu Feb 01 2024
Journal Name
Baghdad Science Journal
Estimating the Parameters of Exponential-Rayleigh Distribution for Progressively Censoring Data with S- Function about COVID-19

The two parameters of the Exponential-Rayleigh (ER) distribution were estimated using maximum likelihood estimation (MLE) for progressively censored data. Estimates of these two scale parameters were obtained from real COVID-19 data taken from the Iraqi Ministry of Health and Environment, AL-Karkh General Hospital. The chi-square test was then used to determine whether the sample fits the ER distribution. A nonlinear membership function (s-function) was employed to obtain fuzzy numbers for the parameter estimators, and a ranking function was then used to transform the fuzzy numbers into crisp numbers. Finally, mean square error (MSE) was used to compare the outcomes of the survival...
Publication Date
Sat Jan 01 2022
Journal Name
The International Journal Of Nonlinear Analysis And Applications
Developing Bulk Arrival Queuing Models with Constant Batch Policy Under Uncertainty Data Using (0-1) Variables

This paper examines some significant performance measures (PMs) of a bulk arrival queueing system with constant batch size b, where the arrival rates and service rates are fuzzy parameters. In the bulk arrival queueing system, arrivals enter the system in groups of constant size before individual customers are admitted to service. This leads to a new tool obtained with the aid of generating function methods. The corresponding traditional bulk queueing system model is more convenient under an uncertain environment. The α-cut approach is applied with the conventional Zadeh's extension principle (ZEP) to transform the fuzzy queues with triangular membership functions (Mem. Fs) into a family of conventional b...
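
As a small illustration of the α-cut step, the sketch below turns a triangular fuzzy number (a, b, c) into the crisp interval it spans at a given membership level α. It then propagates the intervals of a fuzzy arrival rate and a fuzzy service rate through the utilization ρ = λb/μ, which is an assumed example measure and not necessarily one of the paper's PMs. All numbers and names are hypothetical.

```python
def alpha_cut(a, b, c, alpha):
    """Interval of the triangular fuzzy number (a, b, c) at level alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    lower = a + alpha * (b - a)  # left side rises from a to the peak b
    upper = c - alpha * (c - b)  # right side falls from c to the peak b
    return lower, upper


def utilization_interval(arrival, service, batch_size, alpha):
    """Bounds on rho = lambda * b / mu at level alpha.

    rho increases with the arrival rate and decreases with the service
    rate, so the bounds come from opposite ends of the two intervals.
    """
    lam_lo, lam_hi = alpha_cut(*arrival, alpha)
    mu_lo, mu_hi = alpha_cut(*service, alpha)
    return lam_lo * batch_size / mu_hi, lam_hi * batch_size / mu_lo


# Hypothetical fuzzy rates: arrivals "about 2", service "about 11", batches of 2.
rho_bounds = utilization_interval((1.0, 2.0, 3.0), (10.0, 11.0, 12.0), 2, alpha=0.5)
```

Sweeping α from 0 to 1 yields a nested family of intervals, each of which can be treated as a conventional crisp queueing problem; at α = 1 the interval collapses to the single most plausible value.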
Publication Date
Fri Sep 30 2022
Journal Name
Iraqi Journal Of Science
Educational Data Mining For Predicting Academic Student Performance Using Active Classification

The amount of educational data has increased rapidly in the last few years. Educational Data Mining (EDM) techniques are used to detect valuable patterns in order to improve the educational process and obtain high performance from all educational elements. The proposed work contains three stages: preprocessing, feature selection, and active classification. The dataset collected for EDM lacked labeled data; it contained 2050 records gathered through questionnaires and from the students' academic records. There are twenty-five features, drawn from five factors: curriculum, teacher, student, the educational environment, and the family. Active learning ha...
Publication Date
Wed May 17 2017
Journal Name
Ibn Al-haitham Journal For Pure And Applied Sciences
Evaluation of Thermal Reactor Fission Products Cross Sections

The production of fission products during reactor operation has a very important effect on reactor reactivity. Results of neutron cross section evaluations are presented for the main product nuclides considered most important for reactor calculations and burn-up considerations. Data from the main international libraries, considered to contain the most up-to-date nuclear data, and the latest experimental measurements are used in the evaluation process. We describe the evaluated cross sections of the fission product nuclides by intercomparing the data, and we point out the discrepancies among libraries.

Publication Date
Thu May 18 2023
Journal Name
Journal Of Engineering
Spatial Prediction of Monthly Precipitation in Sulaimani Governorate using Artificial Neural Network Models

ANN modeling is used here to predict missing monthly precipitation data at one station of the eight-station weather network in Sulaimani Governorate. Eight models were developed, one for prediction at each station. The prediction accuracy obtained is excellent, with correlation coefficients between the predicted and measured values of monthly precipitation ranging from 90% to 97.2%. The eight ANN models were found after many trials for each station, and those with the highest correlation coefficient were selected. All the ANN models have hyperbolic tangent and identity activation functions for the hidden and output layers respectively, with a learning rate of 0.4 and a momentum term of 0.9, but with different data...
Publication Date
Thu Aug 30 2018
Journal Name
Iraqi Journal Of Science
Seismic Data Processing of Subba Oil Field in South Iraq

An evaluation study was conducted for seismic interpretation using two-dimensional seismic data from the Subba oil field, which is located in southern Iraq. The Subba oil field was discovered in 1973 from the results of seismic surveys, and the first exploratory well, SU-1, was drilled in 1975 to the south of the field. The field is 35 km long and about 10 km wide. It contains 15 wells, most of them in the central part of the field.

This study deals with the field data and how to process it for the purpose of interpretation. The processing included conversion of the field data format, compensation for lost data, and noise removal, as well as the a...