In recent years, researchers have shown growing interest in determining the optimum sample size needed to obtain accurate, high-precision parameter estimates, so that a large number of diagnostic tests can be evaluated at the same time. In this research, two methods were used to determine the optimum sample size for estimating the parameters of high-dimensional data: the Bennett inequality method and the regression method. For the sample size given by each method, the nonlinear logistic regression model was estimated on high-dimensional data with an artificial neural network (ANN), which gives high-precision estimates suited to the data type and the type of medical study. The probabilities obtained from the ANN were used to calculate the net reclassification index (NRI). A program was written for this purpose in the R statistical programming language, and the mean maximum absolute error (MME) of the NRI was used to compare the sample-size methods across different numbers of default parameters for a specified error margin (ε). The most important conclusion was that the Bennett inequality method is the best for determining the optimum sample size, given the number of default parameters and the error margin value.
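The abstract does not state which form of the NRI was used, so the following is only a minimal sketch of the common category-free (continuous) NRI, assuming two sets of predicted probabilities for the same subjects. The function name `net_reclassification_index` and its arguments are hypothetical and do not come from the paper or its R program.

```python
def net_reclassification_index(p_old, p_new, events):
    """Category-free NRI (an assumed definition, not taken from the paper).

    p_old, p_new: predicted event probabilities from a baseline and a new model.
    events: truthy for subjects who had the event, falsy otherwise.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for old, new, is_event in zip(p_old, p_new, events):
        if is_event:
            n_event += 1
            up_event += new > old      # risk correctly moved up
            down_event += new < old    # risk wrongly moved down
        else:
            n_nonevent += 1
            up_nonevent += new > old   # risk wrongly moved up
            down_nonevent += new < old  # risk correctly moved down
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    # Net improvement among events plus net improvement among non-events.
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


if __name__ == "__main__":
    # Toy probabilities for four subjects, invented purely for illustration.
    baseline = [0.2, 0.6, 0.4, 0.7]
    updated = [0.5, 0.8, 0.3, 0.7]
    outcome = [1, 1, 0, 0]
    print(net_reclassification_index(baseline, updated, outcome))
```

Under this definition the value ranges from -2 to 2, with positive values indicating that the new model moves predicted risk in the right direction more often than the wrong one.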
Abstract
Missing data is a major obstacle for researchers analyzing data, since the problem recurs in all fields of study, including social, medical, astronomical and clinical experiments.
The presence of missing data may negatively influence the analysis and lead to misleading conclusions, because of the large bias that the problem causes. Wavelet methods, despite their efficiency, are also affected by missing data, in addition to the impact of the loss of estimation accuracy...
Abstract
The Rayleigh distribution is one of the important distributions used to analyze lifetime data, with applications in reliability studies and physical interpretations. This paper introduces four methods for estimating the scale parameter and the reliability function of the generalized Rayleigh distribution: maximum likelihood, Bayes, modified Bayes, and minimax estimators under a squared error loss function. The comparison is done through a simulation procedure, t...
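To illustrate the maximum likelihood step only, here is a minimal sketch for the standard one-parameter Rayleigh distribution with density f(x) = (x/σ²)·exp(−x²/(2σ²)). The paper works with a generalized Rayleigh distribution whose parametrization is not given in the abstract, so this simpler form is an assumption, and the function names are hypothetical.

```python
import math


def rayleigh_scale_mle(sample):
    """MLE of sigma for the standard Rayleigh: sqrt(sum(x^2) / (2n))."""
    if not sample:
        raise ValueError("sample must not be empty")
    if any(x < 0 for x in sample):
        raise ValueError("Rayleigh observations must be non-negative")
    return math.sqrt(sum(x * x for x in sample) / (2 * len(sample)))


def rayleigh_reliability(t, sigma):
    """Reliability (survival) function R(t) = exp(-t^2 / (2 sigma^2))."""
    return math.exp(-(t * t) / (2 * sigma * sigma))


if __name__ == "__main__":
    # Lifetimes invented purely for illustration.
    lifetimes = [0.8, 1.3, 2.1, 0.5, 1.7]
    sigma_hat = rayleigh_scale_mle(lifetimes)
    # Plug-in estimate of the probability of surviving beyond t = 1.
    print(sigma_hat, rayleigh_reliability(1.0, sigma_hat))
```

Plugging the estimated scale into the reliability function gives the maximum likelihood estimate of reliability by the invariance property; the Bayes and minimax estimators named in the abstract would need a prior and loss function that the truncated text does not specify.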
Dimension reduction and variable selection are very important topics in multivariate statistical analysis. When two or more predictor variables are related through perfect or imperfect linear relationships, the problem of multicollinearity arises; it violates one of the basic assumptions of the ordinary least squares method and leads to incorrect estimates.
Several methods have been proposed to address this problem, including partial least squares (PLS), which is used to reduce the dimension of the regression analysis. It uses linear transformations that convert a set of highly correlated variables into a set of new independent variables and unr...
In recent years, researchers have paid increasing attention to semi-parametric regression models, because they integrate parametric and non-parametric regression models into a single model that can deal with the curse of dimensionality, which arises in non-parametric models as the number of explanatory variables in the analysis increases and the accuracy of the estimation decreases. This type of model also has the advantage of flexibility in application compared with parametric models, which must satisfy certain conditions such as knowledge of the distribution of errors, or the parametric models may...
A mixture model is used to model data that come from more than one component. In recent years, it has become an effective tool for drawing inferences about the complex data that we might come across in real life. Moreover, it can serve as a powerful confirmatory tool for classifying observations based on similarities among them. In this paper, several mixture regression-based methods were applied under the assumption that the data come from a finite number of components. These methods were compared according to their results in estimating component parameters. Observation membership was also inferred and assessed for these methods. The results showed that the flexible mixture model outperformed the others...
Survival and reliability analysis is currently considered one of the important topics and methods of vital statistics because of its importance in demographic, medical, industrial and engineering fields. This research focused on generating random sample data from the generalized gamma (GG) probability distribution using the inverse transformation method (ITM). The cumulative distribution function of this distribution involves the incomplete gamma integral, which makes classical estimation more difficult, so numerical approximation is needed before the survival function can be estimated. The survival function was estimated by Monte Carlo simulation. The entropy method was used for the...
Linear regression is one of the most important statistical tools for describing the relationship between a response variable and one or more independent variables, and it is widely used in various fields of science. Heteroscedasticity is one of the problems of linear regression, and it leads to inaccurate conclusions. It may be accompanied by extreme outliers in the independent variables, known as high leverage points (HLPs); the presence of HLPs in the data set results in unrealistic estimates and misleading inferences. In this paper, we review some of the robust...
Abstract
The binary logistic regression model is used in data classification, and it is the strongest and most flexible tool for studying cases with a binary response variable when compared with linear regression. In this research, some classical methods were used to estimate the parameters of the binary logistic regression model, including the maximum likelihood method, the minimum chi-square method, weighted least squares, and Bayes estimation, in order to choose the best estimation method using default parameter values under two different models of general linear regression models, and different s...
In this paper, Monte Carlo simulation was used to compare the robust circular S estimator with the circular least squares method, both when the data contain no outliers and when outliers are present. Two contamination scenarios were considered: the first is contamination with high inflection points, which represents contamination in the circular independent variable, and the second is contamination in the vertical direction, which represents contamination in the circular dependent variable. Three comparison criteria were used: the median standard error (Median SE), the median of the mean squares of error (Median MSE), and the median of the mean cosines of the circular residuals (Median A(k)). It was concluded that the method of least squares is better than the...
The aim of this study is to investigate the mucoadhesive properties of Azadirachta indica (A. indica) fruit mucilage by incorporating it into mucoadhesive microspheres with Acyclovir (AVR) as a model drug. The study examined the impact of the mucilage proportion on particle size and swelling index. Nine batches of AVR mucoadhesive microspheres were made with varying proportions of Polyacrylic acid 934P and A. indica fruit mucilage (AIFM). A central composite design was applied using Design Expert software to examine the impact of the independent variables (A. indica mucilage and Polyacrylic acid 934P levels) on particle size and swelling index as responses. As part of congeniality studies, the batches w...