Data mining is a data analysis process that uses software to find patterns or rules in large amounts of data, with the aim of providing knowledge to support decisions. However, missing values in data mining often lead to a loss of information. The purpose of this study is to improve the performance of data classification on data with missing values, precisely and accurately. Testing was carried out on the Car Evaluation dataset from the UCI Machine Learning Repository, using RStudio and RapidMiner as tools. The study produces an analysis of the tested parameters to measure algorithm performance under four test variations: the performance of C5.0, C4.5, and k-NN at a 0% missing rate; the performance of C5.0, C4.5, and k-NN at 5–50% missing rates; the performance of the C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI classifiers at 5–50% missing rates; and the performance of the C5.0 + CMI, C4.5 + CMI, and k-NN + CMI classifiers at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested imputation and classification combinations. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers fast processing times compared with the other methods.
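As a rough illustration of the best-performing combination, the sketch below pairs k-NN imputation with a decision-tree classifier. This is not the study's exact setup: scikit-learn has no C5.0 implementation, so DecisionTreeClassifier stands in for it, and the ordinally encoded Car Evaluation attributes are replaced by placeholder data.

```python
# Hedged sketch: k-NN imputation (k-NNI) followed by a decision-tree classifier.
# DecisionTreeClassifier is a stand-in for C5.0; the data below is a placeholder.
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)

# Placeholder for the 1728 ordinally encoded Car Evaluation records (6 attributes).
X = rng.integers(0, 4, size=(1728, 6)).astype(float)
y = rng.integers(0, 4, size=1728)

# Inject a 35% missing rate, mirroring one of the tested scenarios.
mask = rng.random(X.shape) < 0.35
X_missing = X.copy()
X_missing[mask] = np.nan

# k-NNI: each missing entry is filled from the k nearest complete rows.
X_imputed = KNNImputer(n_neighbors=5).fit_transform(X_missing)

X_train, X_test, y_train, y_test = train_test_split(
    X_imputed, y, test_size=0.2, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("validation accuracy:", cross_val_score(clf, X_train, y_train, cv=10).mean())
print("test accuracy:      ", clf.score(X_test, y_test))
```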
Rapid scientific progress has led to the widespread accumulation of information in large databases. It is therefore important to review and organize this vast amount of data, with the purpose of extracting hidden information or classifying data according to their relations with one another so that they can be exploited for technical purposes.
Data mining (DM) is well suited to this area, and this work focuses on the K-Means algorithm for clustering data; in an applied setting, the effect on the results can be observed by varying the sample size (n) and the number of clusters (K).
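The sketch below is only an illustration of that kind of experiment, not the study's data or code: it runs K-Means over synthetic blobs while varying the assumed parameters n and K and reports the within-cluster inertia.

```python
# Illustrative sketch: how K-Means results shift as the sample size n and the
# number of clusters K are varied. Synthetic data, not the paper's dataset.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

for n in (100, 500, 2000):            # sample size n
    X, _ = make_blobs(n_samples=n, centers=4, random_state=0)
    for K in (2, 4, 6):               # number of clusters K
        km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)
        print(f"n={n:4d}  K={K}  inertia={km.inertia_:.1f}")
```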
An intrusion detection system plays an imperative role in increasing security and decreasing harm to computer security systems and information systems when a network is used. It observes different events in a network or system to decide whether an intrusion has occurred, and it is used for strategic decision-making, security purposes, and analyzing directions. This paper describes a host-based intrusion detection system architecture for DDoS attacks, which intelligently detects intrusions periodically and dynamically by evaluating the intruder group relative to the present node and its neighbors. We analyze a dependable dataset named CICIDS 2017 that contains benign and DDoS attack network flows, which meets certifiable criteria and is ope…
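The paper's detection architecture is not reproduced here; the hedged sketch below only shows one common way to fit a benign-versus-DDoS classifier on flow features such as those exported with CICIDS 2017. The file name and column names are assumptions, and the random forest is a generic stand-in rather than the authors' method.

```python
# Hedged sketch: binary benign-vs-DDoS classification on CICIDS-2017-style flow
# features. File name and the "Label" column name are assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("cicids2017_ddos_flows.csv")          # hypothetical file name
y = (df["Label"] != "BENIGN").astype(int)               # 1 = DDoS, 0 = benign
X = df.drop(columns=["Label"]).select_dtypes("number")
X = X.replace([np.inf, -np.inf], np.nan).fillna(0)      # clean infinities / gaps

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te)))
```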
Tourism plays an important role in Malaysia's economic development, as it can boost business opportunities in the surrounding economy. Applying data mining to tourism data to predict areas of business opportunity is therefore a good choice. Data mining is the process that takes data as input and produces knowledge as output. The number of people travelling in Asian countries has increased in recent years, and many entrepreneurs have started their own businesses, but problems such as investing in the wrong business fields and poor service quality have affected their income. The objective of this paper is to use data mining technology to meet the business and customer needs of tourism enterprises and find the most effective…
This review explores the Knowledge Discovery in Databases (KDD) approach, which helps the bioinformatics domain progress efficiently, and illustrates its relationship with data mining. It is important to exploit the advantages of Data Mining (DM) strategy management, such as effectively stressing its role in cost control, which is the principle of competitive intelligence, its role in information management, and its ability to discover hidden knowledge. However, there are many challenges, such as inaccurate, hand-written data and the analysis of large amounts of varied information to extract useful knowledge using DM strategies. These strategies have been successfully applied in several applications such as data wa…
This paper presents the grey model GM(1,1), a first-order model in one variable that is the basis of grey system theory. The research addresses the properties of the grey model and a set of methods for estimating the parameters of GM(1,1): the least squares method (LS), the weighted least squares method (WLS), the total least squares method (TLS), and the gradient descent method (DS). These methods were compared on two criteria, mean square error (MSE) and mean absolute percentage error (MAPE). After comparison using simulation, the best method was applied to real data representing the consumption rates of two types of oil, heavy fuel oil (HFO) and diesel fuel (D.O), and several tests were applied to…
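For reference, a minimal GM(1,1) fit with ordinary least-squares parameter estimation looks like the sketch below; the WLS, TLS, and gradient-descent variants would replace only the estimation step. The consumption series used here is illustrative, not the paper's fuel data.

```python
# Minimal GM(1,1) sketch with ordinary least-squares estimation of (a, b).
# The input series is hypothetical, standing in for an oil-consumption rate.
import numpy as np

x0 = np.array([2.67, 3.13, 3.25, 3.36, 3.56, 3.72])   # illustrative original series
n = len(x0)

x1 = np.cumsum(x0)                                     # accumulated (AGO) series
z1 = 0.5 * (x1[1:] + x1[:-1])                          # background values z1(k)

# Whitening equation x0(k) + a*z1(k) = b, solved by least squares.
B = np.column_stack((-z1, np.ones(n - 1)))
Y = x0[1:]
(a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)

k = np.arange(n + 1)                                   # fit plus one forecast step
x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
x0_hat = np.diff(x1_hat, prepend=x1_hat[0])            # inverse AGO restores the series
x0_hat[0] = x0[0]

mape = np.mean(np.abs((x0 - x0_hat[:n]) / x0)) * 100
print("next-step forecast:", x0_hat[-1], " MAPE(%):", mape)
```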
In this paper, we used four classification methods to classify objects and compared among them: k-Nearest Neighbors (KNN), Stochastic Gradient Descent learning (SGD), the Logistic Regression algorithm (LR), and the Multi-Layer Perceptron (MLP). We used the MCOCO dataset for classifying and detecting objects; the dataset images were randomly divided into training and testing sets at a ratio of 7:3. The randomly selected training and testing images were converted from color to gray level, enhanced with the histogram equalization method, and resized to 20 x 20. Principal component analysis (PCA) was used for feature extraction, and finally the four classification metho…
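A hedged sketch of that preprocessing and comparison pipeline is given below, run on placeholder images rather than the MCOCO data; the PCA dimensionality and classifier settings are assumptions, not the paper's parameters.

```python
# Hedged sketch: grey-level conversion, histogram equalization, 20x20 resize,
# PCA features, then the four compared classifiers. Placeholder images only.
import numpy as np
from skimage.color import rgb2gray
from skimage.exposure import equalize_hist
from skimage.transform import resize
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
images = rng.random((200, 64, 64, 3))          # stand-in for dataset image crops
labels = rng.integers(0, 5, 200)

feats = np.array([resize(equalize_hist(rgb2gray(im)), (20, 20)).ravel()
                  for im in images])
feats = PCA(n_components=50).fit_transform(feats)    # assumed number of components

X_tr, X_te, y_tr, y_te = train_test_split(feats, labels, test_size=0.3, random_state=0)
for name, clf in [("KNN", KNeighborsClassifier()),
                  ("SGD", SGDClassifier(random_state=0)),
                  ("LR",  LogisticRegression(max_iter=1000)),
                  ("MLP", MLPClassifier(max_iter=500, random_state=0))]:
    print(name, clf.fit(X_tr, y_tr).score(X_te, y_te))
```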
Data mining is one of the most popular analysis methods in medical research. It involves finding patterns and correlations in previously unknown datasets. Data mining encompasses various areas of biomedical research, including data collection, clinical decision support, illness or safety monitoring, public health, and inquiry research. Health analytics frequently uses computational data-mining methods such as clustering, classification, and regression. Studies of large numbers of diverse, heterogeneous documents, including biological and electronic records, have provided extensive material for medical and health research.
The influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences, which places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that can be used to divide data with unknown group structure into clusters. The k-means and fuzzy c-means (FCM) algorithms are examples of algorithms that can be used for clustering. Clustering is thus a common approach that divides an input space into several homogeneous zones and can be achieved using a variety of algorithms. This study used three models to cluster a brain tumor dataset. The first model uses FCM, whic…
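To make the FCM step concrete, the sketch below implements a minimal fuzzy c-means from scratch on synthetic data; it is not the study's model, and the brain tumor dataset is replaced by toy blobs standing in for expression profiles.

```python
# Minimal fuzzy c-means (FCM) sketch on synthetic data (fuzzifier m = 2).
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, eps=1e-6, seed=0):
    """Return cluster centers and the fuzzy membership matrix U (N x c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                   # memberships sum to 1 per point
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # membership-weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        ratio = dist[:, :, None] / dist[:, None, :]     # d_ij / d_ik for all k
        U_new = 1.0 / (ratio ** (2 / (m - 1))).sum(axis=2)
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return centers, U

# Toy usage: three well-separated blobs standing in for gene-expression profiles.
X = np.concatenate([np.random.default_rng(i).normal(loc=5 * i, size=(50, 4))
                    for i in range(3)])
centers, U = fuzzy_c_means(X, c=3)
print("hard cluster sizes:", np.bincount(U.argmax(axis=1)))
```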
In this study, we compared the LASSO and SCAD methods, two penalized methods for dealing with models in partial quantile regression. The Nadaraya-Watson kernel estimator was used to estimate the nonparametric part, and the rule-of-thumb method was used to estimate the smoothing bandwidth (h). The penalty methods proved efficient in estimating the regression coefficients, but the SCAD method was the best according to the mean squared error (MSE) criterion after the missing data were estimated using the mean imputation method.
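The sketch below illustrates only some of the described ingredients under simplifying assumptions: mean imputation of missing values, a Nadaraya-Watson estimate of the nonparametric part with a rule-of-thumb bandwidth, and a LASSO fit evaluated by MSE. The data are simulated, the model is mean (not quantile) regression, and SCAD is omitted because it has no off-the-shelf scikit-learn implementation.

```python
# Hedged sketch: mean imputation + Nadaraya-Watson nonparametric part (Gaussian
# kernel, rule-of-thumb bandwidth) + LASSO on the partial residuals, scored by MSE.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 300
t = rng.uniform(0, 1, n)                                # nonparametric covariate
X = rng.normal(size=(n, 8))                             # linear covariates
y = 2 * X[:, 0] - 1.5 * X[:, 3] + np.sin(2 * np.pi * t) + rng.normal(0, 0.3, n)

X[rng.random(X.shape) < 0.1] = np.nan                   # 10% values missing at random
X = SimpleImputer(strategy="mean").fit_transform(X)     # mean imputation

# Nadaraya-Watson estimate of the nonparametric part with Silverman's
# rule-of-thumb bandwidth h.
h = 1.06 * t.std() * n ** (-1 / 5)
K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
g_hat = (K @ y) / K.sum(axis=1)

# Penalized fit of the linear part on the partial residuals (LASSO stands in
# for the compared penalties here).
lasso = LassoCV(cv=5).fit(X, y - g_hat)
print("MSE:", mean_squared_error(y - g_hat, lasso.predict(X)))
```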