This paper studies dimension-reduction methods that overcome the curse of dimensionality when classical methods fail to produce good parameter estimates, so the problem must be addressed directly. Two approaches were used to handle high-dimensional data. The first is sliced inverse regression (SIR), which is considered a non-classical method, together with the proposed WSIR method. The second is principal component analysis (PCA), the general-purpose method for dimension reduction. Both SIR and PCA work by forming reduced linear combinations from a subset of the original explanatory variables, which may suffer from heterogeneity and from multicollinearity among most of the explanatory variables. These new combinations, the linear components produced by the two methods, reduce the largest possible number of explanatory variables to one or more new dimensions, called the effective dimension. The root mean squared error (RMSE) criterion is used to compare the two approaches and identify the better method. A simulation study was conducted to compare the methods, and its results showed that the proposed weight standard SIR (WSIR) method performs best.
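As a rough illustration of the sliced inverse regression idea named in the abstract above, the following Python sketch implements the standard SIR estimator. It is not the proposed WSIR variant, whose weighting scheme is not described in this listing. The function name `sir_directions`, the default slice count, and the NumPy-based layout are assumptions made for illustration only.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Estimate effective dimension-reduction directions with standard SIR.

    Hypothetical helper: X is an (n, p) array of explanatory variables,
    y is an (n,) response. Returns a (p, n_dirs) array of unit-length columns.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize X so that its sample covariance becomes the identity.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by the order of the response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_dirs]      # keep the largest n_dirs
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

The columns of the returned matrix define the linear combinations `X @ B`, which play the role of the reduced "effective" dimensions; a PCA comparison would instead take the leading eigenvectors of the covariance of `X` without using `y`.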
Linear discriminant analysis and logistic regression are the most widely used multivariate statistical methods for analyzing data with categorical outcome variables. Both are appropriate for developing linear classification models. Linear discriminant analysis assumes that the explanatory variables follow a multivariate normal distribution, while logistic regression makes no assumptions about the distribution of the explanatory data. Hence, logistic regression is assumed to be the more flexible and more robust method when these assumptions are violated.
In this paper, we focus on comparing three forms for classifying data that belongs
Samples of gasoline engine oil (SAE 5W20) that had been exposed to various oxidation times were inspected with a UV-Visible (UV-Vis) spectrophotometer to select the best wavelengths and wavelength ranges for distinguishing oxidation times. Engine oil samples were subjected to different thermal oxidation periods of 0, 24, 48, 72, 96, 120, and 144 hours, resulting in a range of total base number (TBN) levels. Each wavelength (190.5 – 849.5 nm) and selected wavelength ranges were evaluated to determine the wavelength or wavelength ranges that could best distinguish among all oxidation times. The best wavelengths and wavelength ranges were analyzed with linear regression to determine the best wavelength or range to predict oxidation t
The support vector machine (SVM) is a type of supervised learning model that can be used for classification or regression, depending on the dataset. SVM classifies data points by determining the best hyperplane between two or more groups. Working with enormous datasets, on the other hand, can cause a variety of issues, including poor accuracy and long computation times. In this research, SVM was updated by applying several kernel transformations: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized in an algorithm using kernel tricks. The proposed method was examined using three simulation datasets with different sample
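To make the kernel families named in this abstract concrete, here is a minimal Python sketch of the four kernel functions in their common textbook forms. It is not the authors' algorithm. Reading the "multi-layer" kernel as the sigmoid (tanh) kernel is an assumption, as are the function names and the default values of `degree`, `gamma`, and `coef0`.

```python
import numpy as np


def linear_kernel(x, z):
    """k(x, z) = <x, z>."""
    return float(np.dot(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree."""
    return float((gamma * np.dot(x, z) + coef0) ** degree)


def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def sigmoid_kernel(x, z, gamma=0.01, coef0=0.0):
    """k(x, z) = tanh(gamma * <x, z> + coef0); assumed 'multi-layer' form."""
    return float(np.tanh(gamma * np.dot(x, z) + coef0))
```

Each function replaces the plain inner product in the SVM decision rule, which is what allows a linear separator in the implicit feature space to act as a non-linear boundary in the original inputs.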
Regression testing is expensive and therefore requires optimization. Typically, optimizing test cases means selecting a reduced set or subset of test cases, or prioritizing the test cases so that potential faults are detected at an earlier phase. Many earlier studies proposed heuristic-dependent mechanisms to attain optimality while reducing or prioritizing test cases. Nevertheless, those studies lacked systematic procedures for managing tied test cases. Moreover, evolutionary algorithms such as the genetic algorithm often help reduce test cases, together with a concurrent decrease in computational runtime. However, when the fault detection capacity must be examined along with other parameters, the method falls sh
A mixture model is used to model data that come from more than one component. In recent years, it has become an effective tool for drawing inferences about the complex data encountered in real life. Moreover, it can serve as a powerful confirmatory tool for classifying observations based on the similarities among them. In this paper, several mixture regression-based methods were applied under the assumption that the data come from a finite number of components. The methods were compared according to their results in estimating component parameters. Observation membership was also inferred and assessed for these methods. The results showed that the flexible mixture model outperformed the others
This research aims to provide insight into the Spatial Autoregressive Quantile Regression (SARQR) model, which is more general than the Spatial Autoregressive (SAR) model and the Quantile Regression (QR) model because it integrates aspects of both. Because Bayesian approaches may produce reliable parameter estimates and overcome the problems that standard estimation techniques face, they were used to estimate the parameters of the SARQR model. Bayesian inference was carried out using Markov Chain Monte Carlo (MCMC) techniques. Several criteria were used for comparison, such as root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R^2). The model was applied to a dataset of poverty rates acro
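The three comparison criteria named in this abstract can be computed as in the following Python sketch, which uses their usual definitions. The function names are hypothetical, and MAPE is reported in percent and assumes that no observed value is zero.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Lower RMSE and MAPE and higher R^2 indicate a better fit, so the three criteria can disagree only when the errors are distributed very differently across the range of the response.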
Abstract:
This research aims to compare the Bayesian method and full maximum likelihood for estimating the hierarchical Poisson regression model.
The comparison was carried out by simulation using different sample sizes (n = 30, 60, 120) and different numbers of replications (r = 1000, 5000), with the mean squared error adopted to compare the estimation methods and choose the best one for estimating the model. It was concluded that the hierarchical Poisson regression model estimated by full maximum likelihood with sample size (n = 30) best represents the maternal mortality data, based on the value of the param
Sewer systems convey sewage and/or storm water to sewage treatment plants for disposal through a network of buried sewer pipes, gutters, manholes, and pits. Unfortunately, sewer pipes deteriorate with time, leading either to pipe collapse with traffic disruption or to clogging that causes flooding and environmental pollution. Thus, the management and maintenance of buried pipes are important tasks that require information about changes in current and future sewer pipe conditions. In this research, the study was carried out in Baghdad, Iraq, and two deterioration models, multinomial logistic regression and a neural network deterioration model (NNDM), were used to predict future sewer conditions. The results of the
Coagulation is the most important process in drinking water treatment. Alum coagulant increases aluminum residuals, which many studies have linked to Alzheimer's disease. Therefore, it is very important to use it at the optimal dose. In this paper, four sets of experiments were carried out to determine the relationship between raw water characteristics (turbidity, pH, alkalinity, and temperature) and the optimum dose of alum [Al2(SO4)3·14H2O], in order to form a mathematical equation that could replace the need for jar test experiments. The experiments were performed under different conditions and different seasonal circumstances. The optimal dose in every set was determined and used to build a gene expression programming (GEP) model. The models were co