XGBOOST AND COST-SENSITIVE CART FOR IMBALANCED MULTICLASS DIABETES CLASSIFICATION IN IRAQ

Diabetes imposes a substantial public health burden. According to the International Diabetes Federation, there were about 3.4 million diabetes-related deaths worldwide in 2024. In Iraq, the Federation reports that one in nine adults was living with diabetes in 2024, with 14,683 adult deaths attributable to diabetes and a total diabetes-related health expenditure of 2,078 million United States dollars. The dataset analyzed in this study contains 1,000 records collected in 2020 from two Iraqi teaching hospitals. It includes multiple clinical and laboratory measurements and three outcome classes (Non-diabetic, Pre-diabetic, and Diabetic), with a low prevalence of the Pre-diabetic class and an imbalanced overall class distribution. The data are challenging because they contain many outliers, non-homogeneous covariance matrices across classes, exact duplicate rows, and linear correlations among certain variables. The study objective was to train and evaluate models that discriminate among the three classes and yield accurate, well-calibrated predictions for future cases in similar clinical settings. Because these properties of the data limited the applicability of classical discriminant functions, two supervised learners were employed: Classification and Regression Trees (CART) and Extreme Gradient Boosting (XGBoost). Preprocessing removed exact duplicate rows and excluded VLDL, which is algebraically derived from triglycerides (in mmol per liter) as VLDL = triglycerides / 2.2 and would therefore introduce redundancy and multicollinearity.
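The preprocessing described above (dropping exact duplicate rows and excluding VLDL because it equals triglycerides divided by 2.2) can be illustrated with a minimal Python sketch. The records, the column names (`TG`, `VLDL`, `HbA1c`, `CLASS`), the class codes, and the helper functions are all hypothetical and chosen only for illustration; they are not the study's data or code.

```python
# Hypothetical toy records; column names and values are assumptions,
# not taken from the study's dataset.
records = [
    {"TG": 2.2, "VLDL": 1.0, "HbA1c": 4.9, "CLASS": "N"},
    {"TG": 2.2, "VLDL": 1.0, "HbA1c": 4.9, "CLASS": "N"},  # exact duplicate
    {"TG": 4.4, "VLDL": 2.0, "HbA1c": 6.1, "CLASS": "P"},
    {"TG": 1.1, "VLDL": 0.5, "HbA1c": 8.3, "CLASS": "Y"},
]


def drop_exact_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def vldl_is_derived(rows, tol=1e-6):
    """Check whether VLDL == TG / 2.2 (mmol/L) holds for every row."""
    return all(abs(r["VLDL"] - r["TG"] / 2.2) <= tol for r in rows)


def drop_column(rows, name):
    """Return copies of the rows without the named column."""
    return [{k: v for k, v in r.items() if k != name} for r in rows]


deduped = drop_exact_duplicates(records)
redundant = vldl_is_derived(deduped)
# Drop VLDL only if it carries no information beyond TG.
cleaned = drop_column(deduped, "VLDL") if redundant else deduped
```

Checking the algebraic relation before dropping the column is a design choice of this sketch: it makes the redundancy argument explicit rather than assuming it.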
On the held-out test set, XGBoost achieved higher Accuracy (98.18 percent versus 97.58 percent for CART) and higher Balanced Accuracy (93.84 percent versus 88.16 percent for CART), indicating that XGBoost provided the stronger overall operating point for this three-class task, while CART remains useful when simple and transparent rules are required.
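The gap between Accuracy and Balanced Accuracy reported above is typical of imbalanced data. The sketch below assumes the common definition of balanced accuracy as the unweighted mean of per-class recall; the abstract does not state which definition the authors used. The labels are hypothetical toy values, not the study's predictions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition)."""
    support = Counter(y_true)  # number of true cases per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced labels: D = Diabetic, N = Non-diabetic,
# P = Pre-diabetic (the rare class, misclassified here as Diabetic).
y_true = ["D"] * 8 + ["N"] * 1 + ["P"] * 1
y_pred = ["D"] * 8 + ["N"] * 1 + ["D"] * 1

acc = accuracy(y_true, y_pred)           # 9 of 10 correct
bal = balanced_accuracy(y_true, y_pred)  # recalls 1, 1, 0 averaged
```

In this toy case a single error on the rare class leaves accuracy at 0.9 but pulls balanced accuracy down to about 0.67, which is why the second metric is more sensitive to performance on a low-prevalence class such as Pre-diabetic.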

Publication Date
Thu Sep 15 2022
Journal Name
Knowledge And Information Systems
Multiresolution hierarchical support vector machine for classification of large datasets

Support vector machine (SVM) is a popular supervised learning algorithm based on margin maximization. It has a high training cost and does not scale well to a large number of data points. We propose a multiresolution algorithm MRH-SVM that trains SVM on a hierarchical data aggregation structure, which also serves as a common data input to other learning algorithms. The proposed algorithm learns SVM models using high-level data aggregates and only visits data aggregates at more detailed levels where support vectors reside. In addition to performance improvements, the algorithm has advantages such as the ability to handle data streams and datasets with imbalanced classes. Experimental results show significant performance improvements in compa

...
Scopus (6)
Crossref (4)
Scopus Clarivate Crossref
Publication Date
Sat Jan 19 2019
Journal Name
Artificial Intelligence Review
Survey on supervised machine learning techniques for automatic text classification

Scopus (343)
Crossref (309)
Scopus Clarivate Crossref
Publication Date
Wed Sep 23 2020
Journal Name
Artificial Intelligence Research
Hybrid approaches to feature subset selection for data classification in high-dimensional feature space

This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of both supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches have been conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods in feature subset selection in high-dimensional space in terms of the number of selected features and time spe

...
Crossref
Publication Date
Tue Jan 12 2016
Journal Name
Wireless Networks
Low communication cost (LCC) scheme for localizing mobile wireless sensor networks

In recent years, the number of applications utilizing mobile wireless sensor networks (WSNs) has increased, with the intent of localization for the purposes of monitoring and obtaining data from hazardous areas. Location of the event is very critical in WSN, as sensing data is almost meaningless without the location information. In this paper, two Monte Carlo based localization schemes termed MCL and MSL* are studied. MCL obtains its location through anchor nodes whereas MSL* uses both anchor nodes and normal nodes. The use of normal nodes would increase accuracy and reduce dependency on anchor nodes, but increases communication costs. For this reason, we introduce a new approach called low communication cost schemes to reduce communication

...
Scopus (34)
Crossref (27)
Scopus Clarivate Crossref
Publication Date
Sat Jun 01 2024
Journal Name
Alexandria Engineering Journal
U-Net for genomic sequencing: A novel approach to DNA sequence classification

The precise classification of DNA sequences is pivotal in genomics, holding significant implications for personalized medicine. The stakes are particularly high when classifying key genetic markers such as BRAC, related to breast cancer susceptibility; BRAF, associated with various malignancies; and KRAS, a recognized oncogene. Conventional machine learning techniques often necessitate intricate feature engineering and may not capture the full spectrum of sequence dependencies. To ameliorate these limitations, this study employs an adapted U-Net architecture, originally designed for biomedical image segmentation, to classify DNA sequences. The attention mechanism was also tested along with the U-Net architecture to precisely classify DNA sequences

...
Scopus (3)
Crossref (3)
Scopus Clarivate Crossref
Publication Date
Sat Jan 01 2022
Journal Name
Archives Of Civil Engineering
Factors affecting time and cost trade-off in multiple construction projects

Scopus (8)
Scopus
Publication Date
Sat Jan 01 2022
Journal Name
IEEE Access
Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

Feature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that helps machine learning classifiers reduce error rates, computation time, and overfitting, and improve classification accuracy. It has demonstrated its efficacy in a myriad of domains, including text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematicall

...
Scopus (71)
Crossref (58)
Scopus Clarivate Crossref
Publication Date
Fri Jul 29 2022
Journal Name
Journal For Vascular Ultrasound
A Comparative Study of the Right and Left Carotid Arteries in Relation to Age for Patients With Diabetes and Hypertension
Introduction:

Age, hypertension, and diabetes can cause significant alterations in arterial structure and function, including changes in lumen diameter (LD), intimal-medial thickness (IMT), flow velocities, and arterial compliance. These are also considered risk markers of atherosclerosis and cerebrovascular disease. A difference between right and left carotid artery blood flow and IMT has been reported by some researchers, and a difference in the incidence of nonlacunar stroke has been reported between the right and left brain hemispheres. The aim of this study was to determine whether there are differences between the right and left common carotid arteries and internal carotid arteries in patient

...
Scopus (7)
Crossref (5)
Scopus Crossref
Publication Date
Mon Jun 01 2026
Journal Name
Iraqi Journal For Computers And Informatics
Explainable Federated Learning for Brain Tumor Classification Using Multi-Source MRI Data

Early diagnosis and clinical decision-making depend on accurate brain tumor classification using magnetic resonance imaging (MRI). However, traditional deep learning methods usually rely on centralized medical data, which raises privacy concerns and limits the use of distributed clinical data. This research proposes a privacy-preserving federated learning framework for MRI image-based binary brain tumor classification using a decentralized ResNet-18 architecture that enables collaborative training without sharing raw patient data. To reflect realistic clinical conditions, the framework integrates heterogeneous multi-source datasets in different image formats (PNG and JPG) and evaluates performance under both IID and non-IID settings

...
Crossref
Publication Date
Thu Sep 01 2016
Journal Name
2016 8th Computer Science And Electronic Engineering (CEEC)
Class-specific pre-trained sparse autoencoders for learning effective features for document classification

Scopus (6)
Crossref (2)
Scopus Crossref