In this paper, a new hybridization of supervised principal component analysis (SPCA) and stochastic gradient descent, called SGD-SPCA, is proposed for real large datasets that have a small number of samples in a high-dimensional space. SGD-SPCA is intended as a tool that can be used to diagnose and treat cancer accurately. When a large dataset requires many parameters, SGD-SPCA is well suited, since it can easily update the parameters when a new observation arrives. Two cancer datasets are used: the first for leukemia and the second for small round blue cell tumors. Simulated datasets are also used to compare principal component analysis (PCA), SPCA, and SGD-SPCA. The results show that SGD-SPCA is more efficient than the other existing methods.
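The abstract's core idea, updating a principal direction one observation at a time with a stochastic-gradient step, can be illustrated with Oja's rule. This is a minimal sketch of incremental PCA in general, not the authors' SGD-SPCA; the learning rate and data are hypothetical.

```python
import numpy as np

def oja_update(w, x, lr=0.01):
    """One stochastic-gradient (Oja) update of a principal direction w
    from a single new observation x, then renormalization."""
    y = w @ x
    w = w + lr * y * (x - y * w)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
# Synthetic data whose dominant principal axis is close to [1, 0]
X = rng.normal(size=(2000, 2)) * np.array([3.0, 0.5])

w = np.array([1.0, 1.0]) / np.sqrt(2)
for x in X:          # each new observation updates w cheaply
    w = oja_update(w, x)

print(abs(w[0]))     # should be near 1: aligned with the dominant axis
```

The appeal for large datasets is that each update touches only one sample, so the direction can be refined whenever a new observation shows up, without refitting from scratch.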
In the present paper, an Arabic character recognition edge-detection method based on contours and connected components is proposed. In the first stage, a contour-extraction feature is introduced to tackle the Arabic character edge-detection problem, where the aim is to extract the edge information present in the Arabic characters, since it is crucial for understanding the character content. In the second stage, connected components are applied to the same characters to find the edges. The proposed approach exploits a number of connected components, which move over the character according to the character's intensity values, to establish a matrix that represents the edge information at each pixel location.
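The two ingredients named in the abstract, connected-component labeling and a per-pixel edge matrix, can be sketched as follows. This is a generic illustration on a toy binary image, not the paper's pipeline; the image and the boundary-via-erosion edge definition are assumptions.

```python
import numpy as np
from scipy import ndimage

# Hypothetical binarized "character": a 4x4 block with a 2x2 hole
img = np.zeros((8, 8), dtype=int)
img[2:6, 2:6] = 1
img[3:5, 3:5] = 0

# Stage 2 idea: label connected components of the foreground
labels, n_components = ndimage.label(img)

# Edge matrix: foreground pixels adjacent to at least one background pixel
eroded = ndimage.binary_erosion(img.astype(bool))
edges = img.astype(bool) & ~eroded   # True exactly on the contour pixels

print(n_components, int(edges.sum()))
```

Here the 1-pixel-wide ring is a single component and every one of its 12 pixels lies on the contour, so `edges` plays the role of the per-pixel edge-information matrix.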
The article attempts to provide the theoretical fundamentals for the semiotic component of lacunae in a language. Definitions of the following notions are presented: 1) lacunarity as a complex phenomenon that operates in the modes of language, speech, and speech behaviour; 2) lacuna as a gap, an empty space in which something is lacking, which may be characteristic of a written work. The author considers the main classes of lacunae, among which are: 1) generic lacunae, which reflect the absence of a common name for a class of objects; 2) species lacunae, which reflect the absence of specific names, i.e. names of individual types of objects or phenomena; 3) intralingual lacunae, which are found within the paradigms of one language; 4) interlanguage
Support Vector Machines (SVMs) are supervised learning models used to examine datasets in order to classify or predict dependent variables. An SVM is typically used for classification by determining the best hyperplane between two classes. However, working with huge datasets can lead to a number of problems, including time-consuming and inefficient solutions. This research updates the SVM by employing a stochastic gradient descent method. The new approach, the extended stochastic gradient descent SVM (ESGD-SVM), was tested on two simulation datasets. The proposed method was compared with other classification approaches such as logistic regression, the naive model, K-nearest neighbors, and random forest. The results show that the ESGD-SVM has a
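The general recipe behind an SGD-trained SVM, a hinge loss minimized by stochastic gradient descent, is available off the shelf in scikit-learn. This is a sketch of that standard combination on simulated data, not the ESGD-SVM itself; the dataset parameters are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Simulated two-class data, roughly in the spirit of a simulation study
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# loss="hinge" + SGD = a linear SVM fitted one mini-step at a time,
# which scales to huge datasets where batch solvers become slow
clf = SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(round(acc, 2))
```

The same estimator supports `partial_fit`, so the model can keep learning as new observations stream in.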
The coronavirus is a family of viruses that cause different dangerous diseases that can lead to death. Two types of this virus were previously known: SARS-CoV, which causes a severe respiratory syndrome, and MERS-CoV, which causes a respiratory syndrome in the Middle East. The latest coronavirus, which originated in the Chinese city of Wuhan, is known as the cause of the COVID-19 pandemic. It is a new kind of coronavirus that can harm people and was first discovered in Dec. 2019. According to the statistics of the World Health Organization (WHO), the number of people infected with this serious disease has reached more than seven million worldwide. In Iraq, the number of people infected has reached more than tw
Data generated from modern applications and the internet in healthcare is extensive and rapidly expanding. Therefore, one of the significant success factors for any application is understanding and extracting meaningful information using digital analytics tools. These tools positively impact the application's performance and handle the challenges faced in creating highly consistent, logical, and information-rich summaries. This paper has three main objectives. First, it provides several analytics methodologies that help analyze datasets and extract useful information from them, as preprocessing steps in any classification model, to determine the dataset characteristics. It also provides a comparative st
Traffic classification refers to the task of categorizing traffic flows into application-aware classes such as chat, streaming, VoIP, etc. Most network traffic identification systems are based on features; these features may be static signatures, port numbers, statistical characteristics, and so on. Although current methods of data-flow classification are effective, they still lack new, inventive approaches to meet the needs of vital points such as real-time traffic classification, low power consumption, Central Processing Unit (CPU) utilization, etc. Our novel Fast Deep Packet Header Inspection (FDPHI) traffic classification proposal employs a one-dimensional convolutional neural network (1D-CNN) to automatically learn more representational c
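The building block a 1D-CNN applies to raw packet-header bytes is a one-dimensional convolution followed by a nonlinearity. This is a from-scratch sketch of that single operation, not the FDPHI architecture; the header bytes and kernel are hypothetical.

```python
import numpy as np

def conv1d(x, kernel, stride=1):
    """Valid-mode 1-D convolution (cross-correlation): slides the kernel
    over the sequence and takes a dot product at each position."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel
                     for i in range(0, len(x) - k + 1, stride)])

# Hypothetical packet-header bytes, scaled to [0, 1] as network input
header = np.frombuffer(
    bytes([0x45, 0x00, 0x00, 0x3C, 0x1C, 0x46, 0x40, 0x00]),
    dtype=np.uint8) / 255.0

change_kernel = np.array([1.0, -1.0])            # responds to byte-value changes
features = np.maximum(conv1d(header, change_kernel), 0.0)  # ReLU activation
print(features.shape)
```

In a trained network, many such kernels are learned from data, so the representational features come from the raw header itself rather than hand-crafted signatures or port numbers.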
Water pollution has created a critical threat to the environment. A lot of research has recently been done on using surface-enhanced Raman spectroscopy (SERS) to detect multiple pollutants in water. This study aims to use Ag nanoflower colloids as a liquid SERS enhancer. Trisodium phosphate (Na3PO4) was investigated as a pollutant using liquid SERS based on colloidal Ag nanoflowers. A chemical method was used to synthesize the nanoflowers from silver ions. Atomic force microscopy (AFM), scanning electron microscopy (SEM), and X-ray diffraction (XRD) were employed to characterize the silver nanoflowers. The SERS action of these nanoflowers in detecting Na3PO4 was reported and analyzed.
This research deals with a shrinkage method for principal components similar to the one used in multiple regression, the Least Absolute Shrinkage and Selection Operator (LASSO). The goal here is to form uncorrelated linear combinations from only a subset of explanatory variables that may have a multicollinearity problem, instead of taking the whole number, say (K), of them. This shrinkage forces some coefficients to equal zero, after restricting them with a "tuning parameter", say (t), which balances the amount of bias and variance on the one hand and does not exceed the acceptable percentage of explained variance of these components on the other. This is shown by the MSE criterion in the regression case and the percent explained v
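The mechanism described, lasso-type shrinkage that drives some principal-component coefficients exactly to zero under a tuning parameter t, can be sketched with the soft-thresholding operator. This is a generic illustration of the idea, not the paper's method; the data and the value of t are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding, the closed-form lasso shrinkage: pulls every
    coefficient toward zero by t and sets small ones exactly to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X -= X.mean(axis=0)                      # center before PCA

# Leading principal-component loading vector via SVD (unit norm)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
loading = Vt[0]

# Larger t -> more coefficients forced to zero (sparser component),
# at the cost of bias and some lost explained variance
sparse_loading = soft_threshold(loading, t=0.2)
print(np.linalg.norm(sparse_loading) < np.linalg.norm(loading))
```

Choosing t is the bias-variance trade-off the abstract refers to: t = 0 recovers ordinary PCA loadings, while large t zeroes out most variables' contributions.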