doi:10.24996/ijs.2023.64.5.32

Details

Publication Date

Tue May 30 2023

Journal Name

Iraqi Journal Of Science

Volume

64

Issue Number

5

DOI

10.24996/ijs.2023.64.5.32

Application of Data Mining and Imputation Algorithms for Missing Value Handling: A Study Case Car Evaluation Dataset

C5.0

k-NNI

Data Mining

Missing Value Handling

R Studio

Wahyu

Muhammad Fauzan Edy

Muhammad

Panca

Sholeh Hadi

Data mining is the process of analyzing large amounts of data with software to find patterns or rules that can support decision making. However, missing values in the data often lead to a loss of information. The purpose of this study is to improve the accuracy of classification on data with missing values. Testing was carried out on the Car Evaluation dataset from the UCI Machine Learning Repository, using RStudio and RapidMiner to run the algorithms. The study analyzes the tested parameters to measure algorithm performance under four test variations: (1) C5.0, C4.5, and k-NN at a 0% missing rate; (2) C5.0, C4.5, and k-NN at 5–50% missing rates; (3) C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI at 5–50% missing rates; and (4) C5.0 + CMI, C4.5 + CMI, and k-NN + CMI at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested combinations of imputation and classification algorithms. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers faster processing times than the other methods.
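The abstract does not give implementation details for k-NNI, so the following is only a minimal sketch of k-nearest-neighbour imputation on categorical data, not the authors' RStudio or RapidMiner workflow. The function names, the Hamming-style distance, the majority vote, and the toy rows (shaped like Car Evaluation attributes) are all assumptions made for illustration.

```python
from collections import Counter

MISSING = None  # marker for a missing cell (an assumption of this sketch)


def distance(a, b):
    """Share of mismatching attributes among those observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not MISSING and y is not MISSING]
    if not shared:
        return float("inf")  # nothing to compare, treat as maximally distant
    return sum(x != y for x, y in shared) / len(shared)


def knn_impute(rows, k=3):
    """Return a copy of rows in which each missing cell is filled with the
    most common value among the k nearest rows that have that cell observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Candidate donors: every other row with attribute j observed.
            donors = [r for idx, r in enumerate(rows) if idx != i and r[j] is not MISSING]
            donors.sort(key=lambda r: distance(row, r))
            votes = Counter(r[j] for r in donors[:k])
            if votes:
                filled[i][j] = votes.most_common(1)[0][0]
    return filled


# Made-up rows: buying, maint, doors, persons, lug_boot, safety.
toy = [
    ["low", "low", "4", "4", "big", "high"],
    ["low", "low", "4", "4", "big", MISSING],
    ["low", "low", "4", "4", "med", "high"],
    ["vhigh", "vhigh", "2", "2", "small", "low"],
]
print(knn_impute(toy, k=2)[1])
```

In a pipeline like the one the abstract describes, the imputed table would then be passed to a classifier such as C5.0; that step is omitted here.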

Publication Date

Tue Jan 01 2019

Journal Name

Advances On Computational Intelligence In Energy

A Theoretical Framework for Big Data Analytics Based on Computational Intelligent Algorithms with the Potential to Reduce Energy Consumption

H.

U. A.

I. A. T. Hashem

Y.

R. D.

M. M.

G. E.

S.

Within the framework of big data, energy issues are highly significant. Despite the significance of energy, theoretical studies focusing primarily on the issue of energy within big data analytics in relation to computational intelligent algorithms are scarce. The purpose of this study is to explore the theoretical aspects of energy issues in big data analytics in relation to computational intelligent algorithms since this is critical in exploring the empirical aspects of big data. In this chapter, we present a theoretical study of energy issues related to applications of computational intelligent algorithms in big data analytics. This work highlights that big data analytics using computational intelligent algorithms generates a very high amo

Publication Date

Sun Sep 03 2023

Journal Name

Wireless Personal Communications

Application of Healthcare Management Technologies for COVID-19 Pandemic Using Internet of Things and Machine Learning Algorithms

Mohammed

Publication Date

Wed Jun 01 2022

Journal Name

Baghdad Science Journal

Variable Selection Using a Modified Gibbs Sampler Algorithm with Application on Rock Strength Dataset

Ghadeer Jasim

Othman Mahdi

Variable selection is an essential task in statistical modeling. Several studies have tried to develop and standardize the process of variable selection, but it is difficult to do so. The first question a researcher needs to ask is which variables are the most significant for describing a given dataset's response. In this paper, a new method for variable selection using Gibbs sampler techniques has been developed. First, the model is defined, and the posterior distributions for all the parameters are derived. The new variable selection method is tested using four simulation datasets. The new approach is compared with some existing techniques: Ordinary Least Squares (OLS), Least Absolute Shrinkage

Publication Date

Tue Oct 23 2018

Journal Name

Journal Of Economics And Administrative Sciences

Processing of missing values in survey data using Principal Component Analysis and probabilistic Principal Component Analysis methods

قتيبة نبيل

بشرى رحيم

The idea of carrying out research on incomplete data arose from the circumstances of our country and the horrors of war, which led to the loss of many important data in all aspects of economic, natural, health, and scientific life. The reasons for missing data vary: some are beyond the control of those concerned, while others are deliberate, planned because of cost or risk or because of a lack of means for inspection. The missing data in this study were processed using Principal Component Analysis and self-organizing map methods using simulation. The variables of child health and variables affecting children's health were taken into account: breastfeed

Publication Date

Sat Jan 01 2022

Journal Name

Iranian Journal Of Earth Sciences

Resistivity surveys application for detection of shallow caves in a case example from Western Iraq

Ali M.

Kamal Kareem

Asama H.

Publication Date

Wed May 11 2022

Journal Name

Journal Of Economics And Administrative Sciences

Comparing Some Methods For A single Imputed A missing Observation In Estimating Nonparametric Regression Function

Comparison of single-value compensation methods lost to the model of the regression

قتيبة

مناف

In this paper, we study a nonparametric model in which the response variable has missing data (non-response) in its observations under the MCAR missing mechanism. We propose Kernel-Based Non-Parametric Single Imputation to replace the missing values and compare it with Nearest Neighbor Imputation, using simulation over several different models and different cases of sample size, variance, and rate of missing data.
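The abstract does not specify the estimator, kernel, or bandwidth, so the following is only an illustrative sketch of the two kinds of single imputation it compares. It assumes a Nadaraya-Watson weighted average with a Gaussian kernel for the kernel-based method and a one-nearest-neighbour rule for the other; the function names, the default bandwidth, and the toy data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the weighted mean."""
    return math.exp(-0.5 * u * u)


def kernel_impute(x, y, bandwidth=0.5):
    """Fill each missing response (None) with a kernel-weighted average of the
    observed responses. Assumes the bandwidth is large enough that at least
    one weight is nonzero."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    filled = []
    for xi, yi in zip(x, y):
        if yi is not None:
            filled.append(yi)
            continue
        weights = [gaussian_kernel((xi - xo) / bandwidth) for xo, _ in observed]
        total = sum(weights)
        filled.append(sum(w * yo for w, (_, yo) in zip(weights, observed)) / total)
    return filled


def nearest_neighbor_impute(x, y):
    """Fill each missing response with the response of the closest observed x."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    filled = []
    for xi, yi in zip(x, y):
        if yi is not None:
            filled.append(yi)
        else:
            filled.append(min(observed, key=lambda p: abs(p[0] - xi))[1])
    return filled


# Made-up data: the response at x = 1.2 is missing.
x = [0.0, 1.2, 2.0]
y = [0.0, None, 2.0]
print(kernel_impute(x, y), nearest_neighbor_impute(x, y))
```

The kernel estimate blends all observed responses by distance, whereas the nearest-neighbour rule copies a single donor value; a simulation study would compare the two across sample sizes, variances, and missing rates.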

Publication Date

Sat Jan 20 2024

Journal Name

Ibn Al-haitham Journal For Pure And Applied Sciences

Enhanced Support Vector Machine Methods Using Stochastic Gradient Descent and Its Application to Heart Disease Dataset

SVM

classification

reduction of dimensions

variables selection

gradient descent

heart disease

Ghadeer

Seror Faeq Mohammed

Md Kamrul Hasan Khan

Support Vector Machines (SVMs) are supervised learning models used to examine data sets in order to classify or predict dependent variables. SVM is typically used for classification by determining the best hyperplane between two classes. However, working with huge datasets can lead to a number of problems, including time-consuming and inefficient solutions. This research updates the SVM by employing a stochastic gradient descent method. The new approach, the extended stochastic gradient descent SVM (ESGD-SVM), was tested on two simulation datasets. The proposed method was compared with other classification approaches such as logistic regression, naive model, K Nearest Neighbors and Random Forest. The results show that the ESGD-SVM has a

Publication Date

Thu Dec 01 2022

Journal Name

International Journal Of Electrical And Computer Engineering (ijece)

A self-balancing platform on a mobile car

Bushra Amer

Maher Yahya

Ahmed

In recent years, the self-balancing platform has become one of the most common candidates for use in many applications such as flight, biomedical fields, and industry. In this paper, a physical prototype of a proposed self-balancing platform is introduced that describes the self-balancing attitude in the X-axis, the Y-axis, or both axes under the influence of road disturbance. In the physical prototype, the inertial measurement unit (IMU) sensor senses the disturbance in the X-axis, the Y-axis, and both axes. With the determined error, the corresponding electronic circuit, DC servo motors, and the Arduino software, the platform overcame the tilt angle (disturbance). Optimization of the proportional-integral-

Publication Date

Fri Nov 20 2020

Journal Name

Solid State Technology

Comparative Study for Bi-Clustering Algorithms: Historical and Methodological Notes

Safa S

Hiba S

Saif S

Publication Date

Sat Jun 01 2019

Journal Name

Journal Of Economics And Administrative Sciences

Using Some Robust Methods For Handling the Problem of Multicollinearity

Multiple Linear Regression

Multicollinearity

outliers

ridge regression

LTS-estimator

Liu-estimator

غفران اسماعيل

سيف الامام سعدي

The multiple linear regression model is an important regression model that has attracted many researchers in different fields, including applied mathematics, business, medicine, and social sciences. Linear regression models involving a large number of independent variables perform poorly because of large variation and lead to inaccurate conclusions. One of the most important problems in regression analysis is multicollinearity, which has become well known to many researchers, along with its effects on the multiple linear regression model. In addition to multicollinearity, the problem of outliers in data is one of the difficulties in constructing the reg
