Details

Publication Date

Tue May 30 2023

Journal Name

Iraqi Journal Of Science

Volume

64

Issue Number

5

Application of Data Mining and Imputation Algorithms for Missing Value Handling: A Study Case Car Evaluation Dataset

C5.0

k-NNI

Data Mining

Missing Value Handling

R Studio

Wahyu Widyananda

Muhammad Fauzan Edy Purnomo

Muhammad Aswin

Panca Mudjirahardjo

Sholeh Hadi Pramono

Data mining is the process of analyzing large amounts of data with software to find patterns or rules that can support decision-making. Missing values, however, often cause a loss of information. This study aims to improve the accuracy of classification on data with missing values. The tests use the Car Evaluation dataset from the UCI Machine Learning Repository, and the algorithms were run in RStudio and RapidMiner. Four test variations were evaluated: (1) C5.0, C4.5, and k-NN at a 0% missing rate; (2) C5.0, C4.5, and k-NN at 5–50% missing rates; (3) C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI at 5–50% missing rates; and (4) C5.0 + CMI, C4.5 + CMI, and k-NN + CMI at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested combinations of imputation and classification algorithms. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers fast processing times compared with the other methods.
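To make the idea of k-NN imputation (k-NNI) on categorical data more concrete, the following is a minimal sketch in Python. It is not the authors' RStudio or RapidMiner workflow. The function names `inject_missing` and `knn_impute`, the `MISSING` marker, the mismatch-count distance, the majority vote, and the toy rows are all assumptions made for illustration. The attribute values only mimic those found in the Car Evaluation dataset.

```python
import random
from collections import Counter

MISSING = None  # marker for a missing attribute value (assumption of this sketch)


def inject_missing(rows, rate, seed=0):
    """Return a copy of rows with a fraction `rate` of cells set to MISSING."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    chosen = rng.sample(cells, int(rate * len(cells)))
    out = [list(row) for row in rows]
    for i, j in chosen:
        out[i][j] = MISSING
    return out


def distance(query, donor):
    """Count mismatches over the attributes observed in the query row.

    A value that is missing in the donor counts as a mismatch.
    """
    return sum(1 for q, d in zip(query, donor) if q is not MISSING and q != d)


def knn_impute(rows, k=3):
    """Fill each missing cell with the most common value among the k nearest rows."""
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Only rows that have attribute j observed can donate a value.
            donors = [r for idx, r in enumerate(rows)
                      if idx != i and r[j] is not MISSING]
            if not donors:
                continue  # nothing to borrow from; leave the cell missing
            donors.sort(key=lambda r: distance(row, r))
            votes = Counter(r[j] for r in donors[:k])
            imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


# Toy rows (buying, maint, safety); made up for illustration only.
toy_rows = [
    ("low", "low", "high"),
    ("low", "low", "high"),
    ("low", "med", "high"),
    ("vhigh", "high", "low"),
    ("vhigh", "high", "low"),
    ("low", "low", MISSING),
]

filled = knn_impute(toy_rows, k=3)
print(filled[5])
```

In this toy example the three rows closest to the incomplete row all have `safety` equal to `"high"`, so that value is imputed. A classifier such as C5.0 would then be trained on the completed rows. `inject_missing` shows one possible way to simulate a given missing rate, which may differ from how the study removed values.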
