Support vector machines (SVMs) are supervised learning models that analyze data for classification or regression. For classification, an SVM selects an optimal hyperplane that separates two classes. SVM has very good accuracy and is extremely robust compared with other classification methods such as logistic regression, random forest, k-nearest neighbor, and the naïve model. However, working with large datasets can cause problems such as long training times and inefficient results. In this paper, the SVM is modified by using a stochastic gradient descent process. The modified method, stochastic gradient descent SVM (SGD-SVM), is checked using two simulation datasets. Since the classification of different cancer types is important for cancer diagnosis and drug discovery, SGD-SVM is applied to classify the most common leukemia cancer type dataset. The results obtained using SGD-SVM are much more accurate than the results of many other studies that used the same leukemia datasets.
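The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of training a linear SVM with stochastic gradient descent on the L2-regularized hinge loss. The function names (`sgd_svm_fit`, `sgd_svm_predict`), the hyperparameters (`lam`, `eta0`, `epochs`), and the toy two-cluster data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def sgd_svm_fit(X, y, lam=0.01, eta0=0.01, epochs=20, seed=0):
    """Fit a linear SVM by SGD on the L2-regularized hinge loss.

    Labels y must be in {-1, +1}. All hyperparameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    t = 0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(n):
            t += 1
            eta = eta0 / (1.0 + eta0 * lam * t)  # slowly decaying step size
            margin = y[i] * (X[i] @ w + b)
            # Gradient step for the regularizer (lam / 2) * ||w||^2.
            w *= 1.0 - eta * lam
            # Hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w += eta * y[i] * X[i]
                b += eta * y[i]
    return w, b

def sgd_svm_predict(X, w, b):
    """Return labels in {-1, +1} from the sign of the decision function."""
    return np.where(X @ w + b >= 0.0, 1.0, -1.0)

# Toy data (an assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
               rng.normal(2.0, 1.0, (100, 2))])
y = np.concatenate([-np.ones(100), np.ones(100)])

w, b = sgd_svm_fit(X, y)
acc = float(np.mean(sgd_svm_predict(X, w, b) == y))
print(f"training accuracy: {acc:.3f}")
```

Each update touches a single sample, so the cost per step does not grow with the dataset size. This is the usual motivation for combining SGD with SVMs on large datasets.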
Machine learning methods, one of the most important branches of artificial intelligence, are of great importance in sciences such as engineering and medicine, and have recently become widely used in statistics and its various branches, including survival analysis. Machine learning can be considered a new approach to estimating survival, alongside the parametric, nonparametric, and semi-parametric methods that are widely used in statistical research. In this paper, the estimation of survival based on medical images of patients with breast cancer who receive their treatment in Iraqi hospitals is discussed. Three algorithms for feature extraction were explained: The first principal compone
Variable selection is an essential and necessary task in the statistical modeling field. Several studies have tried to develop and standardize the process of variable selection, but it is difficult to do so. The first question a researcher needs to ask is which variables are the most significant and should be used to describe a given dataset’s response. In this paper, a new method for variable selection using Gibbs sampler techniques has been developed. First, the model is defined, and the posterior distributions for all the parameters are derived. The new variable selection method is tested using four simulation datasets. The new approach is compared with some existing techniques: Ordinary Least Squares (OLS), Least Absolute Shrinkage
Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and identifying the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reducing the dimensionality and complexity of analyzing the network traffic. Moreover, using the most relevant features to build the predictive model reduces the complexity of the developed model, thus reducing the time needed to build the classifier and consequently improving the detection performance. In this study, two different sets of select
The support vector machine, also known as SVM, is a type of supervised learning model that can be used for classification or regression depending on the datasets. SVM is used to classify data points by determining the best hyperplane between two or more groups. Working with enormous datasets, on the other hand, might result in a variety of issues, including poor accuracy and long computation times. SVM was updated in this research by applying several kernel transformations: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized in an algorithm using kernel tricks. The proposed method was examined using three simulation datasets with different sample
Objective(s): To determine the impact of psychological distress in women upon coping with breast cancer.
Methodology: A descriptive design is carried out throughout the present study. A convenience sample of 60 women with breast cancer is recruited from the community. Two instruments, a psychological distress scale and a coping scale, are developed for the study. Internal consistency reliability and content validity are obtained for the study instruments. Data are collected through the application of the study instruments. Data are analyzed using descriptive and inferential statistical data analysis approaches.
Results: The study findings depict that women with breast cancer have experien
The proposal of nonlinear models is one of the most important methods in time series analysis. It has wide potential for predicting various phenomena, including physical, engineering, and economic ones, by studying the characteristics of random disturbances in order to arrive at accurate predictions.
In this study, the autoregressive model with exogenous variable was built using a threshold as the first method, using two proposed approaches to determine the best cutting point through the threshold point indicator: forward predictability (forecasting) and predictability within the time series (prediction). B-J seasonal models are used as a second method based on the principle of the two proposed approaches in dete
When optimizing the performance of neural network-based chatbots, choosing the optimizer is one of the most important aspects. Optimizers primarily control the adjustment of model parameters such as weights and biases to minimize a loss function during training. Adaptive optimizers such as ADAM have become a standard choice and are widely used because the magnitudes of their parameter updates are invariant to gradient scale variations, but they often pose generalization problems. Alternatively, Stochastic Gradient Descent (SGD) with Momentum and ADAMW, an extension of ADAM, offer several advantages. This study aims to compare and examine the effects of these optimizers on the chatbot CST dataset. The effectiveness of each optimizer is evaluat
The availability of different processing levels for satellite images makes it important to measure their suitability for classification tasks. This study investigates the impact of the Landsat data processing level on the accuracy of land cover classification using a support vector machine (SVM) classifier. The classification accuracy values of Landsat 8 (LS8) and Landsat 9 (LS9) data at different processing levels vary notably. For LS9, Collection 2 Level 2 (C2L2) achieved the highest accuracy of 86.55% with the polynomial kernel of the SVM classifier, surpassing the Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) at 85.31% and Collection 2 Level 1 (C2L1) at 84.93%. The LS8 data exhibits similar behavior. Conv
Each project management system aims to complete the project within its identified objectives: budget, time, and quality. Completing the project within the defined deadline requires careful scheduling that is established early. Due to the nature of unique repetitive construction projects, time contingency and project uncertainty are necessary for accurate scheduling. The schedule should be integrated and flexible enough to accommodate changes without adversely affecting the construction project’s total completion time. Repetitive planning and scheduling methods are more effective and essential. However, they need continuous development because of the evolution of execution methods, essent