For many years, reading rate measured as words correct per minute (WCPM) has been investigated by many researchers as an indicator of learners’ oral reading speed, accuracy, and comprehension. This study aims to predict WCPM levels using three machine learning algorithms: Ensemble Classifier (EC), Decision Tree (DT), and K-Nearest Neighbor (KNN). The data were collected in 2021 from 100 second-year Kurdish EFL students in the English language department at the University of Duhok. The ensemble classifier obtained the highest testing accuracy, 94%. It also recorded the highest precision, recall, and F1 scores, each with a value of 0.92. Its Receiver Operating Characteristic (ROC) curve was also better than those of the other classification algorithms. Accordingly, it can be concluded that the ensemble classifier is the most accurate of the three models for predicting reading rate (WCPM).
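The accuracy, precision, recall, and F1 figures reported above can be computed from true and predicted labels roughly as follows. This is a minimal sketch, not the study's code: it assumes macro-averaging across classes (the abstract does not say which averaging was used), and the function name `classification_scores` and the sample labels are hypothetical.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the abstract does not state
    how the per-class scores were combined.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical WCPM levels, for illustration only.
truth = ["low", "low", "high", "high"]
guess = ["low", "high", "high", "high"]
print(classification_scores(truth, guess))
```

The same function can be applied to the predictions of each of the three classifiers in turn to compare them on equal terms.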
Data mining plays an important role in healthcare for discovering hidden relationships in large datasets, especially in the diagnosis of breast cancer, which is one of the most common causes of death in the world. In this paper, two algorithms, decision tree and K-Nearest Neighbour, are applied to diagnose breast cancer grade in order to reduce its risk to patients. In the decision tree with feature selection, the Gini index gives an accuracy of 87.83%, while entropy gives an accuracy of 86.77%. In both cases, Age appeared as the most influential parameter, particularly when Age < 49.5, and Ki67 appeared as the second most influential. Furthermore, K-Nearest Neighbor is based on the minimum err ...
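The two splitting criteria compared above, the Gini index and entropy, can be illustrated with a short sketch. It scores a single threshold split such as Age < 49.5 by the reduction in impurity it achieves. The ages and grade labels below are hypothetical and are not taken from the paper's dataset, and the function names are illustrative.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits."""
    if not labels:
        return 0.0
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(values, labels, threshold, impurity):
    """Impurity reduction from splitting on `value < threshold`."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(labels) - weighted


# Hypothetical patient ages and grades, for illustration only.
ages = [40, 45, 55, 60]
grades = ["grade_1", "grade_1", "grade_2", "grade_2"]
print(split_gain(ages, grades, 49.5, gini))
print(split_gain(ages, grades, 49.5, entropy))
```

A decision tree learner evaluates many candidate thresholds this way and keeps the one with the largest gain, which is how a feature such as Age can end up at the root of the tree.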
This paper proposes a new method for functional non-parametric regression analysis with conditional expectation in the case where the covariates are functional, with Principal Component Analysis used to de-correlate the multivariate response variables. It uses the Nadaraya-Watson estimator in its K-Nearest Neighbour (KNN) form for prediction, with different types of semi-metrics (based on the second derivative and on Functional Principal Component Analysis (FPCA)) for measuring the closeness between curves. The Root Mean Square Error is used to assess this model, which is then compared to the independent response method. The R program is used for analysing the data. Then, when the cov ...
Heart disease is a non-communicable disease and the number one cause of death in Indonesia. According to WHO predictions, heart disease will cause 11 million deaths in 2020. The unhealthy lifestyles and consumption patterns of modern society are causes of this disease, which many people experience. Lack of knowledge about heart conditions and their potential dangers means that heart attacks occur before any preventive measures are taken. This study aims to produce a system for predicting heart disease, which helps to prevent and reduce the number of deaths caused by heart disease. Technology has been widely used in the health sector, and one of the advanced technologies is machine lea ...
Anemia is one of the common blood diseases; it leads to a shortage of red blood cells (RBCs) and a hemoglobin level in the blood that is lower than normal.
In this paper, a new algorithm is presented to recognize anemia in digital images based on moment invariants. The algorithm consists of the following phases: preprocessing, segmentation, feature extraction, and classification (using a Decision Tree). The features extracted for classification are moment invariants and geometric features.
The best classification rates obtained were 84% when using moment invariant features and 74% when using geometric features. Results indicate that the proposed algorithm is very effective in detect ...
Many systems that deal with image processing are being used and developed on a daily basis. These systems require some basic operations, such as detecting regions of interest, matching those regions, and describing their properties. These operations play a significant role in the decision making needed for subsequent operations, depending on the assigned task. To accomplish these tasks, various algorithms have been introduced over the years. One of the most popular is the Scale Invariant Feature Transform (SIFT). The efficiency of this algorithm lies in its performance in detection and property description, and that is due to the fact that ...
E-mail is an efficient and reliable data exchange service. Spam consists of undesired e-mail messages that are sent randomly in bulk, usually for commercial purposes. Obfuscated image spamming is one of the newer tricks for bypassing text-based and Optical Character Recognition (OCR)-based spam filters. Image spam detection based on visual image features has the advantage of efficiency, reducing the computational cost and improving performance. In this paper, an image spam detection scheme is presented. Suitable image processing techniques were used to capture the image features that can differentiate spam images from non-spam ones. Weighted k-nearest neighbor, a simple yet powerful machine learning algorithm, was used as a clas ...
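To make the weighted k-nearest neighbor idea concrete, here is a minimal sketch in which each of the k closest training points votes with a weight of one over its distance to the query. Inverse-distance weighting is only one common scheme and is an assumption here, since the abstract does not state which weighting the authors used. The two-dimensional feature vectors stand in for image features and are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Classify `query` by inverse-distance-weighted vote of its k nearest neighbors."""
    # Pair every training point's Euclidean distance with its label, closest first.
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        # eps avoids division by zero when the query coincides with a training point.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


# Hypothetical 2-D image features, for illustration only.
features = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
labels = ["spam", "ham", "ham"]
print(weighted_knn_predict(features, labels, (0.1, 0.0), k=3))
```

In this example an unweighted majority vote over the three neighbors would favor "ham", while the weighted vote favors the single very close "spam" point, which is the behavior that distinguishes the weighted variant.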
The Internet of Things (IoT) is a network of devices used for interconnection and data transfer. IoT attacks have increased dramatically due to the lack of security mechanisms. These mechanisms can be enhanced through the analysis and classification of such attacks. The multi-class classification of IoT botnet attacks (IBA) applied here uses a high-dimensional data set, which is a challenge for the classification process because it requires substantial computational resources. Dimensionality reduction (DR) discards irrelevant information while retaining the essential information from this high-dimensional data set. The DR technique proposed here is a classifier-based fe ...
In this paper, the botnet detection problem is defined as a feature selection problem, and a genetic algorithm (GA) is used to search for the most significant combination of features from the entire search space of feature subsets. Furthermore, the Decision Tree (DT) classifier is used as the objective function to guide the proposed GA toward the combination of features that can correctly classify activities into normal traffic and botnet attacks. Two datasets, UNSW-NB15 and the Canadian Institute for Cybersecurity Intrusion Detection System 2017 (CICIDS2017), are used for evaluation. The results reveal that the proposed DT-aware GA can effectively find the relevant features from ...
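The general shape of GA-based feature selection can be sketched as follows. This is an illustration under stated assumptions rather than the authors' algorithm: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are assumed, and `toy_fitness` is a hypothetical stand-in for the objective. In the paper's setting, the fitness of a feature mask would instead come from a Decision Tree classifier trained on the selected features.

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness`.

    Requires n_features >= 2 and pop_size >= 3.
    Returns the best mask and the best fitness seen in each generation.
    """
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, n_features - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b
                          for b in child)
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical objective: reward three "relevant" features, penalize mask size.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_per_generation = ga_feature_selection(8, toy_fitness)
print(best_mask, best_per_generation[-1])
```

Because the elite individual is copied into every new generation, the best fitness never decreases from one generation to the next, which makes the search easy to monitor even when the fitness function is an expensive classifier evaluation.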