XGBOOST AND COST-SENSITIVE CART FOR
IMBALANCED MULTICLASS DIABETES
CLASSIFICATION IN IRAQ

Nabila A. Alsharif Alsharif; Inaam Aboud Hussain Hussain; Loaiy F. Naji Naji

Details

Publication Date

Tue Feb 03 2026

Journal Name

Journal Of Mechanics Of Continua And Mathematical Sciences

Volume

21

Issue Number

2

Classification

XGBoost

CART

Class imbalance

Diabetes

Pre diabetic

Diabetes imposes a substantial public health burden. According to the International Diabetes Federation, there were about 3.4 million diabetes-related deaths worldwide in 2024. In Iraq, the Federation reports that one in nine adults was living with diabetes in 2024, with 14,683 adult deaths attributable to diabetes and a total diabetes-related health expenditure of 2,078 million United States dollars. The dataset analyzed in this study contains 1,000 records collected in 2020 from two Iraqi teaching hospitals. It includes multiple clinical and laboratory measurements and three outcome classes: Non-diabetic, Pre-diabetic, and Diabetic. The Pre-diabetic class has low prevalence, and the overall class distribution is imbalanced. The data are challenging because they contain many outliers, non-homogeneous covariance matrices across classes, exact duplicate rows (removed before modelling), and linear correlations among certain variables. The study objective was to train and evaluate models that discriminate among the three classes and yield accurate, well-calibrated predictions for future cases in similar clinical settings. Because the diagnostic properties of the data limited the applicability of classical discriminant functions, two supervised learners were employed: Classification and Regression Trees (CART) and Extreme Gradient Boosting (XGBoost). Preprocessing removed exact duplicate rows and excluded VLDL, which is algebraically derived from triglycerides (in mmol per liter) as VLDL = triglycerides / 2.2 and would therefore introduce redundancy and multicollinearity.
On the held-out test set, XGBoost achieved higher Accuracy (98.18 percent versus 97.58 percent for CART) and higher Balanced Accuracy (93.84 percent versus 88.16 percent for CART). This indicates that XGBoost provided the stronger overall operating point for this three-class task, while CART remains useful when simple and transparent rules are required.
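
To illustrate why the abstract reports Balanced Accuracy alongside Accuracy, the following minimal Python sketch computes both metrics on a small hypothetical label set. The class counts, the predictions, the function names, and the inverse-frequency weighting heuristic are assumptions chosen for illustration; they are not the study's data, results, or cost matrix.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / n for c, n in support.items()]
    return sum(recalls) / len(recalls)


def inverse_frequency_weights(y_true):
    """One common cost-sensitive heuristic (an assumption here, not the
    paper's scheme): weight_c = n_samples / (n_classes * n_c)."""
    counts = Counter(y_true)
    n, k = len(y_true), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Hypothetical, deliberately imbalanced test set (NOT the study's data).
y_true = ["Non-diabetic"] * 10 + ["Pre-diabetic"] * 4 + ["Diabetic"] * 86
# The classifier misses two of the four Pre-diabetic cases.
y_pred = (["Non-diabetic"] * 10
          + ["Pre-diabetic"] * 2 + ["Diabetic"] * 2
          + ["Diabetic"] * 86)

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
weights = inverse_frequency_weights(y_true)

print(f"accuracy          = {acc:.4f}")
print(f"balanced accuracy = {bal:.4f}")
print({c: round(w, 3) for c, w in weights.items()})
```

In this toy case, accuracy stays at 0.98 even though half of the Pre-diabetic cases are missed, while balanced accuracy falls to about 0.83 because each class contributes equally. The same logic explains why a gap between the two metrics is informative when one class is rare.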

Publication Date

Sun Jan 10 2016

Journal Name

British Journal Of Applied Science & Technology

The Effect of Classification Methods on Facial Emotion Recognition Accuracy

Facial emotions

feature selection

data clustering

modified K-Means clustering algorithm

LDA algorithm

Statistical classifier

Neural Network

Support Vector Machine (SVM)

Suhaila N.

Interest in developing accurate automatic facial emotion recognition methods is growing rapidly, and it remains an active research field within computer vision, artificial intelligence, and automation. However, building an automated system that matches the human ability to recognize facial emotion remains a challenge because of the lack of an effective facial feature descriptor and the difficulty of choosing a proper classification method. In this paper, a geometric-based feature vector is proposed. For classification, three different types of methods are tested: statistical, artificial neural network (NN), and Support Vector Machine (SVM). A modified K-Means clustering algorithm

Publication Date

Mon Dec 01 2014

Journal Name

Journal Of Economics And Administrative Sciences

Comparison between some of linear classification models with practical application

Linear discriminant analysis

binary response logistic regression and misclassification probability.

حمزة اسماعيل

Linear discriminant analysis and logistic regression are the most widely used multivariate statistical methods for analyzing data with categorical outcome variables. Both are appropriate for developing linear classification models. Linear discriminant analysis assumes that the explanatory variables follow a multivariate normal distribution, whereas logistic regression makes no assumptions about the distribution of the explanatory data. Hence, logistic regression is assumed to be the more flexible and more robust method when these assumptions are violated.

In this paper we focus on the comparison between three forms for classifying data that belong

Publication Date

Sat Dec 01 2018

Journal Name

Journal Of Accounting And Financial Studies ( Jafs )

Use of Benchmarking in the management of the cost of food and beverages in the hotel sector (Case study in a sample of hotels in Baghdad governorate)

Cost management

Benchmarking

Food cost in hotels

Beverage cost in hotels

د . ندى سلمان

Competition in the hotel sector, globalization, and the development of new information have forced the sector to continuously seek new techniques and arrangements to remain competitive, including benchmarking and its application in the hotel sector. The Ministry of Tourism selected the Rashid International Hotel as a leading hotel, or benchmark, against which to compare other hotels in Iraq, and also selected two hotels in Baghdad for comparison, namely the Ishtar International Hotel and the Baghdad International Hotel. The aim is to correct the course of cost management practice and to diagnose the strengths and weaknesses in the

Publication Date

Sun Apr 04 2010

Journal Name

Journal Of Educational And Psychological Researches

Translation & Adaptation of (Patterns) & (Assembly) Scales of The Flanagan Aptitude Classification Tests (FACT)

Translation & Adaptation

The Flanagan Aptitude Classification Tests (FACT)

Adil A. S. Al-Salihy

Huda Jameel Abdul-Ghani

The Flanagan Aptitude Classification Tests (FACT) assesses aptitudes that are important for successful performance of particular job-related tasks. An individual's aptitude can then be matched to the job tasks. The FACT helps to determine the tasks in which a person has proficiency. Each test measures a specific skill that is important for particular occupations. The FACT battery is designed to provide measures of an individual's aptitude for each of 16 job elements.

The FACT consists of 16 tests used to measure aptitudes that are important for the successful performance of many occupational tasks. The tests provide a broad basis for predicting success in various occupational fields. All are paper and pen

Publication Date

Thu Dec 03 2015

Journal Name

Iraqi Journal Of Science

New multispectral images classification method based on MSR and Skewness implementing on various sensor scenes

Taghreed

Publication Date

Tue Jan 01 2013

Journal Name

Ibn Al-haitham Journal For Pure And Applied Science

Classification and Construction of (k,3)-Arcs on Projective Plane Over Galois Field GF(7)

A.

Fatema Faisal

The purpose of this work is to study the classification and construction of (k,3)-arcs in the projective plane PG(2,7). We found that there are two (5,3)-arcs, four (6,3)-arcs, six (7,3)-arcs, six (8,3)-arcs, seven (9,3)-arcs, six (10,3)-arcs and six (11,3)-arcs. All of these arcs are incomplete. There are six distinct (12,3)-arcs, two of which are complete. There are four distinct (13,3)-arcs, two of which are complete, and one (14,3)-arc, which is incomplete. There exists one complete (15,3)-arc.
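
The definitions behind these counts can be checked computationally. The Python sketch below assumes the standard definitions: a (k,3)-arc is a set of k points with some three collinear but no four collinear, and it is complete if no further point of the plane can be added without creating four collinear points. The function names and the example point set are hypothetical and are not taken from the paper.

```python
from itertools import product

Q = 7  # order of the field GF(7)


def normalize(v):
    """Scale a nonzero vector so its first nonzero coordinate is 1."""
    for x in v:
        if x % Q:
            inv = pow(x, Q - 2, Q)  # inverse via Fermat's little theorem
            return tuple((inv * c) % Q for c in v)
    raise ValueError("the zero vector is not a projective point")


# The 57 points of PG(2,7); by duality, lines use the same coordinates.
POINTS = sorted({normalize(v) for v in product(range(Q), repeat=3) if any(v)})
LINES = POINTS


def on_line(point, line):
    """Point (x,y,z) lies on line [a,b,c] iff ax + by + cz = 0 in GF(7)."""
    return sum(p * l for p, l in zip(point, line)) % Q == 0


def max_collinear(points):
    """Largest number of the given points lying on a single line."""
    return max(sum(on_line(p, line) for p in points) for line in LINES)


def is_k3_arc(points):
    """True if some 3 points are collinear and no 4 are."""
    return max_collinear(points) == 3


def is_complete(points):
    """True if adding any other point of the plane forces 4 collinear points."""
    chosen = set(points)
    return all(max_collinear(chosen | {p}) > 3
               for p in POINTS if p not in chosen)


# Hypothetical example: 3 points on the line z = 0 plus 2 points off it.
example = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
print(len(POINTS), is_k3_arc(example), is_complete(example))
```

A brute-force check like this is only practical because PG(2,7) is small (57 points and 57 lines); classifying arcs up to projective equivalence, as the paper does, requires additional machinery that this sketch does not attempt.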

Publication Date

Fri Jan 31 2025

Journal Name

Aip Conference Proceedings

Classification of oral cavity cancer using linear discriminant analysis (LDA) and principal component analysis (PCA)

Mohammed Fouad

Ahmed F.

Yasser Y.

Publication Date

Sat Jun 01 2019

Journal Name

Periodicals Of Engineering And Natural Sciences (pen)

Lung cancer classification using data mining and supervised learning algorithms on multi-dimensional data set

Saadaldeen

Israa

ammar

Haider

With recent developments in machine learning, data mining, and computer vision, there is great potential for improving the early detection of lung cancer using available scans and data. This paper details the methods and techniques used in our project, whose objective is to develop algorithms that determine whether a patient has, or is likely to develop, lung cancer from image datasets, using data mining and machine learning for classification and examination. We explore approaches to address the problem. Cancer is a leading cause of death globally. Disease diagnosis is a major step in treating patients affected by cancer. The diagnosis process is more difficult comparatively known about t

Publication Date

Sun Jul 01 2018

Journal Name

Journal Of The American Pharmacists Association

Evaluation of community pharmacist–provided telephone interventions to improve adherence to hypertension and diabetes medications

Ali Azeez

Publication Date

Sat Dec 30 2023

Journal Name

Journal Of Economics And Administrative Sciences

Classification of Iraqi Children According to Their Nutritional Status Using Fuzzy Logic

Fuzzy Logic

Fuzzy Classification

Nutritional Status

Mamdani Method

Defuzzification

Hussein

Mohammad

In this paper, we build a fuzzy classification system for classifying the nutritional status of children under 5 years old in Iraq using the Mamdani method, with input variables such as weight and height used to determine the child's nutritional status. Classifying nutritional status is a difficult challenge in the medical field because of uncertainty and ambiguity in the variables and attributes that define the nutritional status categories for children. These categories are relied upon in medical diagnosis to determine the types of malnutrition problems, to identify the groups suffering from malnutrition, and to determine the risks faced by each group of children. Malnutrition in children is one of the most
