The idea of carrying out research on incomplete data came from the circumstances of our dear country and the horrors of war, which resulted in the missing of many important data and in all aspects of economic, natural, health, scientific life, etc.,. The reasons for the missing are different, including what is outside the will of the concerned or be the will of the concerned, which is planned for that because of the cost or risk or because of the lack of possibilities for inspection. The missing data in this study were processed using Principal Component Analysis and self-organizing map methods using simulation. The variables of child health and variables affecting children's health were taken into account: breastfeeding and maternal health. The maternal health variable contained missing value and was processed in Matlab2015a using Methods Principal Component Analysis and probabilistic Principal Component Analysis of where the missing values were processed and then the methods were compared using the root of the mean error squares. The best method to processed the missing values Was the PCA method.
This paper tackles with principal component analysis method (PCA ) to dimensionality reduction in the case of linear combinations to digital image processing and analysis. The PCA is statistical technique that shrinkages a multivariate data set consisting of inter-correlated variables into a data set consisting of variables that are uncorrelated linear combination, while ensuring the least possible loss of useful information. This method was applied to a group of satellite images of a certain area in the province of Basra, which represents the mouth of the Tigris and Euphrates rivers in the Shatt al-Arab in the province of Basra.
... Show MoreThe analysis of the classic principal components are sensitive to the outliers where they are calculated from the characteristic values and characteristic vectors of correlation matrix or variance Non-Robust, which yields an incorrect results in the case of these data contains the outliers values. In order to treat this problem, we resort to use the robust methods where there are many robust methods Will be touched to some of them.
The robust measurement estimators include the measurement of direct robust estimators for characteristic values by using characteristic vectors without relying on robust estimators for the variance and covariance matrices. Also the analysis of the princ
... Show MoreCharacteristic evolving is most serious move that deal with image discrimination. It makes the content of images as ideal as possible. Gaussian blur filter used to eliminate noise and add purity to images. Principal component analysis algorithm is a straightforward and active method to evolve feature vector and to minimize the dimensionality of data set, this paper proposed using the Gaussian blur filter to eliminate noise of images and improve the PCA for feature extraction. The traditional PCA result as total average of recall and precision are (93% ,97%) and for the improved PCA average recall and precision are (98% ,100%), this show that the improved PCA is more effective in recall and precision.
In this paper, a new hybridization of supervised principal component analysis (SPCA) and stochastic gradient descent techniques is proposed, and called as SGD-SPCA, for real large datasets that have a small number of samples in high dimensional space. SGD-SPCA is proposed to become an important tool that can be used to diagnose and treat cancer accurately. When we have large datasets that require many parameters, SGD-SPCA is an excellent method, and it can easily update the parameters when a new observation shows up. Two cancer datasets are used, the first is for Leukemia and the second is for small round blue cell tumors. Also, simulation datasets are used to compare principal component analysis (PCA), SPCA, and SGD-SPCA. The results sh
... Show MoreThe objective of the study is to demonstrate the predictive ability is better between the logistic regression model and Linear Discriminant function using the original data first and then the Home vehicles to reduce the dimensions of the variables for data and socio-economic survey of the family to the province of Baghdad in 2012 and included a sample of 615 observation with 13 variable, 12 of them is an explanatory variable and the depended variable is number of workers and the unemployed.
Was conducted to compare the two methods above and it became clear by comparing the logistic regression model best of a Linear Discriminant function written
... Show MoreSewage water is a mixture of water and solids added to water for various uses, so it needs to be treated to meet local or global standards for environmentally friendly waste production. The present study aimed to analyze the new Maaymyrh sewage treatment plant's quality parameters statistically at Hilla city. The plant is designed to serve 500,000 populations, and it is operating on a biological treatment method (Activated Sludge Process) with an average wastewater inflow of 107,000m3/day. Wastewater data were collected daily by the Mayoralty of Hilla from November 2019 to June 2020 from the influent and effluent in the (STP) new in Maaymyrh for five water quality standards, such as (BOD5), (COD), (TSS), (TP)
... Show MoreThis paper proposed a new method to study functional non-parametric regression data analysis with conditional expectation in the case that the covariates are functional and the Principal Component Analysis was utilized to de-correlate the multivariate response variables. It utilized the formula of the Nadaraya Watson estimator (K-Nearest Neighbour (KNN)) for prediction with different types of the semi-metrics, (which are based on Second Derivative and Functional Principal Component Analysis (FPCA)) for measureing the closeness between curves. Root Mean Square Errors is used for the implementation of this model which is then compared to the independent response method. R program is used for analysing data. Then, when the cov
... Show MoreThis study was conducted to determining the variable effects on water quality of Greater Zab River in Erbil province, Iraq, using multivariate statistical analysis. Seventeen variables were monitored in four sampling sites during one year (from May 2012 to April 2013). The dataset were treated using principal component analysis (PCA)/ factor analysis (FA), cluster analysis (CA) to the most important factors affecting water quality, sources of pollution and suitability of water for drinking consumption and irrigation. Six factors were identified as responsible for the data structure explaining 73.5% of the total variance in the dataset and are conditionally named, hydrochemical from weathering, mineral salts and domestic wastes. CA showed
... Show MoreAbstract
The methods of the Principal Components and Partial Least Squares can be regard very important methods in the regression analysis, whe
... Show More