In this paper, a new hybridization of supervised principal component analysis (SPCA) and stochastic gradient descent techniques is proposed, and called as SGD-SPCA, for real large datasets that have a small number of samples in high dimensional space. SGD-SPCA is proposed to become an important tool that can be used to diagnose and treat cancer accurately. When we have large datasets that require many parameters, SGD-SPCA is an excellent method, and it can easily update the parameters when a new observation shows up. Two cancer datasets are used, the first is for Leukemia and the second is for small round blue cell tumors. Also, simulation datasets are used to compare principal component analysis (PCA), SPCA, and SGD-SPCA. The results sh
... Show MoreThe objective of the study is to demonstrate the predictive ability is better between the logistic regression model and Linear Discriminant function using the original data first and then the Home vehicles to reduce the dimensions of the variables for data and socio-economic survey of the family to the province of Baghdad in 2012 and included a sample of 615 observation with 13 variable, 12 of them is an explanatory variable and the depended variable is number of workers and the unemployed.
Was conducted to compare the two methods above and it became clear by comparing the logistic regression model best of a Linear Discriminant function written
... Show MoreThe idea of carrying out research on incomplete data came from the circumstances of our dear country and the horrors of war, which resulted in the missing of many important data and in all aspects of economic, natural, health, scientific life, etc.,. The reasons for the missing are different, including what is outside the will of the concerned or be the will of the concerned, which is planned for that because of the cost or risk or because of the lack of possibilities for inspection. The missing data in this study were processed using Principal Component Analysis and self-organizing map methods using simulation. The variables of child health and variables affecting children's health were taken into account: breastfeed
... Show MoreCharacteristic evolving is most serious move that deal with image discrimination. It makes the content of images as ideal as possible. Gaussian blur filter used to eliminate noise and add purity to images. Principal component analysis algorithm is a straightforward and active method to evolve feature vector and to minimize the dimensionality of data set, this paper proposed using the Gaussian blur filter to eliminate noise of images and improve the PCA for feature extraction. The traditional PCA result as total average of recall and precision are (93% ,97%) and for the improved PCA average recall and precision are (98% ,100%), this show that the improved PCA is more effective in recall and precision.
Sewage water is a mixture of water and solids added to water for various uses, so it needs to be treated to meet local or global standards for environmentally friendly waste production. The present study aimed to analyze the new Maaymyrh sewage treatment plant's quality parameters statistically at Hilla city. The plant is designed to serve 500,000 populations, and it is operating on a biological treatment method (Activated Sludge Process) with an average wastewater inflow of 107,000m3/day. Wastewater data were collected daily by the Mayoralty of Hilla from November 2019 to June 2020 from the influent and effluent in the (STP) new in Maaymyrh for five water quality standards, such as (BOD5), (COD), (TSS), (TP)
... Show MoreThis study was conducted to determining the variable effects on water quality of Greater Zab River in Erbil province, Iraq, using multivariate statistical analysis. Seventeen variables were monitored in four sampling sites during one year (from May 2012 to April 2013). The dataset were treated using principal component analysis (PCA)/ factor analysis (FA), cluster analysis (CA) to the most important factors affecting water quality, sources of pollution and suitability of water for drinking consumption and irrigation. Six factors were identified as responsible for the data structure explaining 73.5% of the total variance in the dataset and are conditionally named, hydrochemical from weathering, mineral salts and domestic wastes. CA showed
... Show MoreMany fuzzy clustering are based on within-cluster scatter with a compactness measure , but in this paper explaining new fuzzy clustering method which depend on within-cluster scatter with a compactness measure and between-cluster scatter with a separation measure called the fuzzy compactness and separation (FCS). The fuzzy linear discriminant analysis (FLDA) based on within-cluster scatter matrix and between-cluster scatter matrix . Then two fuzzy scattering matrices in the objective function assure the compactness between data elements and cluster centers .To test the optimal number of clusters using validation clustering method is discuss .After that an illustrate example are applied.
This paper tackles with principal component analysis method (PCA ) to dimensionality reduction in the case of linear combinations to digital image processing and analysis. The PCA is statistical technique that shrinkages a multivariate data set consisting of inter-correlated variables into a data set consisting of variables that are uncorrelated linear combination, while ensuring the least possible loss of useful information. This method was applied to a group of satellite images of a certain area in the province of Basra, which represents the mouth of the Tigris and Euphrates rivers in the Shatt al-Arab in the province of Basra.
... Show MoreThis paper proposed a new method to study functional non-parametric regression data analysis with conditional expectation in the case that the covariates are functional and the Principal Component Analysis was utilized to de-correlate the multivariate response variables. It utilized the formula of the Nadaraya Watson estimator (K-Nearest Neighbour (KNN)) for prediction with different types of the semi-metrics, (which are based on Second Derivative and Functional Principal Component Analysis (FPCA)) for measureing the closeness between curves. Root Mean Square Errors is used for the implementation of this model which is then compared to the independent response method. R program is used for analysing data. Then, when the cov
... Show More