Support vector machines (SVMs) are supervised learning models that analyze data for classification or regression. For classification, SVM is widely used by selecting an optimal hyperplane that separates two classes. SVM has very good accuracy and extremally robust comparing with some other classification methods such as logistics linear regression, random forest, k-nearest neighbor and naïve model. However, working with large datasets can cause many problems such as time-consuming and inefficient results. In this paper, the SVM has been modified by using a stochastic Gradient descent process. The modified method, stochastic gradient descent SVM (SGD-SVM), checked by using two simulation datasets. Since the classification of different cancer types is important for cancer diagnosis and drug discovery, SGD-SVM is applied for classifying the most common leukemia cancer type dataset. The results that are gotten using SGD-SVM are much accurate than other results of many studies that used the same leukemia datasets.
The Machine learning methods, which are one of the most important branches of promising artificial intelligence, have great importance in all sciences such as engineering, medical, and also recently involved widely in statistical sciences and its various branches, including analysis of survival, as it can be considered a new branch used to estimate the survival and was parallel with parametric, nonparametric and semi-parametric methods that are widely used to estimate survival in statistical research. In this paper, the estimate of survival based on medical images of patients with breast cancer who receive their treatment in Iraqi hospitals was discussed. Three algorithms for feature extraction were explained: The first principal compone
... Show MoreVariable selection is an essential and necessary task in the statistical modeling field. Several studies have triedto develop and standardize the process of variable selection, but it isdifficultto do so. The first question a researcher needs to ask himself/herself what are the most significant variables that should be used to describe a given dataset’s response. In thispaper, a new method for variable selection using Gibbs sampler techniqueshas beendeveloped.First, the model is defined, and the posterior distributions for all the parameters are derived.The new variable selection methodis tested usingfour simulation datasets. The new approachiscompared with some existingtechniques: Ordinary Least Squared (OLS), Least Absolute Shrinkage
... Show MoreSome of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and realizing the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reduces the dimensionality and complexity to analyze the network traffic. Moreover, using the most relevant features to build the predictive model, reduces the complexity of the developed model, thus reducing the building classifier model time and consequently improves the detection performance. In this study, two different sets of select
... Show MoreThe support vector machine, also known as SVM, is a type of supervised learning model that can be used for classification or regression depending on the datasets. SVM is used to classify data points by determining the best hyperplane between two or more groups. Working with enormous datasets, on the other hand, might result in a variety of issues, including inefficient accuracy and time-consuming. SVM was updated in this research by applying some non-linear kernel transformations, which are: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized in an algorithm using kernel tricks. The proposed method was examined using three simulation datasets with different sample
... Show MoreThe proposal of nonlinear models is one of the most important methods in time series analysis, which has a wide potential for predicting various phenomena, including physical, engineering and economic, by studying the characteristics of random disturbances in order to arrive at accurate predictions.
In this, the autoregressive model with exogenous variable was built using a threshold as the first method, using two proposed approaches that were used to determine the best cutting point of [the predictability forward (forecasting) and the predictability in the time series (prediction), through the threshold point indicator]. B-J seasonal models are used as a second method based on the principle of the two proposed approaches in dete
... Show MoreObjective(s): To determine the impact of psychological distress in women upon coping with breast cancer.
Methodology: A descriptive design is carried throughout the present study. Convenient sample of (60) woman with breast cancer is recruited from the community. Two instruments, psychological distress scale and coping scale are developed for the study. Internal consistency reliability and content validity are obtained for the study instruments. Data are collect through the application of the study instruments. Data are analyzed through the use of descriptive statistical data analysis approach and inferential statistical data analysis approach.
Results: The study findings depict that women with breast cancer have experien
... Show MoreWhen optimizing the performance of neural network-based chatbots, determining the optimizer is one of the most important aspects. Optimizers primarily control the adjustment of model parameters such as weight and bias to minimize a loss function during training. Adaptive optimizers such as ADAM have become a standard choice and are widely used for their invariant parameter updates' magnitudes concerning gradient scale variations, but often pose generalization problems. Alternatively, Stochastic Gradient Descent (SGD) with Momentum and the extension of ADAM, the ADAMW, offers several advantages. This study aims to compare and examine the effects of these optimizers on the chatbot CST dataset. The effectiveness of each optimizer is evaluat
... Show MoreEach project management system aims to complete the project within its identified objectives: budget, time, and quality. It is achieving the project within the defined deadline that required careful scheduling, that be attained early. Due to the nature of unique repetitive construction projects, time contingency and project uncertainty are necessary for accurate scheduling. It should be integrated and flexible to accommodate the changes without adversely affecting the construction project’s total completion time. Repetitive planning and scheduling methods are more effective and essential. However, they need continuous development because of the evolution of execution methods, essent
It is considered as one of the statistical methods used to describe and estimate the relationship between randomness (Y) and explanatory variables (X). The second is the homogeneity of the variance, in which the dependent variable is a binary response takes two values (One when a specific event occurred and zero when that event did not happen) such as (injured and uninjured, married and unmarried) and that a large number of explanatory variables led to the emergence of the problem of linear multiplicity that makes the estimates inaccurate, and the method of greatest possibility and the method of declination of the letter was used in estimating A double-response logistic regression model by adopting the Jackna
... Show More 
        