Data generated from modern applications and the internet in healthcare is extensive and rapidly expanding. Therefore, one of the significant success factors for any application is understanding and extracting meaningful information using digital analytics tools. These tools will positively impact the application's performance and handle the challenges that can be faced to create highly consistent, logical, and information-rich summaries. This paper contains three main objectives: First, it provides several analytics methodologies that help to analyze datasets and extract useful information from them as preprocessing steps in any classification model to determine the dataset characteristics. Also, this paper provides a comparative study of several classification algorithms by testing 12 different classifiers using two international datasets to provide an accurate indicator of their efficiency and the future possibility of combining efficient algorithms to achieve better results. Finally, building several CBC datasets for the first time in Iraq helps to detect blood diseases from different hospitals. The outcome of the analysis step is used to help researchers to select the best system structure according to the characteristics of each dataset for more organized and thorough results. Also, according to the test results, four algorithms achieved the best accuracy (Logitboost, Random Forest, XGBoost, Multilayer Perceptron). Then use the Logitboost algorithm that achieved the best accuracy to classify these new datasets. In addition, as future directions, this paper helps to investigate the possibility of combining the algorithms to utilize benefits and overcome their disadvantages.
Support vector machine (SVM) is a popular supervised learning algorithm based on margin maximization. It has a high training cost and does not scale well to a large number of data points. We propose a multiresolution algorithm MRH-SVM that trains SVM on a hierarchical data aggregation structure, which also serves as a common data input to other learning algorithms. The proposed algorithm learns SVM models using high-level data aggregates and only visits data aggregates at more detailed levels where support vectors reside. In addition to performance improvements, the algorithm has advantages such as the ability to handle data streams and datasets with imbalanced classes. Experimental results show significant performance improvements in compa
... Show MoreIn general, path-planning problem is one of most important task in the field of robotics. This paper describes the path-planning problem of mobile robot based on various metaheuristic algorithms. The suitable collision free path of a robot must satisfies certain optimization criteria such as feasibility, minimum path length, safety and smoothness and so on. In this research, various three approaches namely, PSO, Firefly and proposed hybrid FFCPSO are applied in static, known environment to solve the global path-planning problem in three cases. The first case used single mobile robot, the second case used three independent mobile robots and the third case applied three follow up mobile robot. Simulation results, whi
... Show MoreBig data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such a
... Show MoreIn this paper, we used four classification methods to classify objects and compareamong these methods, these are K Nearest Neighbor's (KNN), Stochastic Gradient Descentlearning (SGD), Logistic Regression Algorithm(LR), and Multi-Layer Perceptron (MLP). Weused MCOCO dataset for classification and detection the objects, these dataset image wererandomly divided into training and testing datasets at a ratio of 7:3, respectively. In randomlyselect training and testing dataset images, converted the color images to the gray level, thenenhancement these gray images using the histogram equalization method, resize (20 x 20) fordataset image. Principal component analysis (PCA) was used for feature extraction, andfinally apply four classification metho
... Show MoreIn this paper, a cognitive system based on a nonlinear neural controller and intelligent algorithm that will guide an autonomous mobile robot during continuous path-tracking and navigate over solid obstacles with avoidance was proposed. The goal of the proposed structure is to plan and track the reference path equation for the autonomous mobile robot in the mining environment to avoid the obstacles and reach to the target position by using intelligent optimization algorithms. Particle Swarm Optimization (PSO) and Artificial Bee Colony (ABC) Algorithms are used to finding the solutions of the mobile robot navigation problems in the mine by searching the optimal paths and finding the reference path equation of the optimal
... Show MoreRecurrent strokes can be devastating, often resulting in severe disability or death. However, nearly 90% of the causes of recurrent stroke are modifiable, which means recurrent strokes can be averted by controlling risk factors, which are mainly behavioral and metabolic in nature. Thus, it shows that from the previous works that recurrent stroke prediction model could help in minimizing the possibility of getting recurrent stroke. Previous works have shown promising results in predicting first-time stroke cases with machine learning approaches. However, there are limited works on recurrent stroke prediction using machine learning methods. Hence, this work is proposed to perform an empirical analysis and to investigate machine learning al
... Show MoreFeature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that aids machine learning classifiers in reducing error rates, computation time, overfitting, and improving classification accuracy. It has demonstrated its efficacy in myriads of domains, ranging from its use for text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematicall
... Show More<span lang="EN-US">Diabetes is one of the deadliest diseases in the world that can lead to stroke, blindness, organ failure, and amputation of lower limbs. Researches state that diabetes can be controlled if it is detected at an early stage. Scientists are becoming more interested in classification algorithms in diagnosing diseases. In this study, we have analyzed the performance of five classification algorithms namely naïve Bayes, support vector machine, multi layer perceptron artificial neural network, decision tree, and random forest using diabetes dataset that contains the information of 2000 female patients. Various metrics were applied in evaluating the performance of the classifiers such as precision, area under the c
... Show MoreText categorization refers to the process of grouping text or documents into classes or categories according to their content. Text categorization process consists of three phases which are: preprocessing, feature extraction and classification. In comparison to the English language, just few studies have been done to categorize and classify the Arabic language. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and a comparison for researchers in the last five years based on the dataset, year, algorithms and the accuracy th
... Show More