Variable selection in Poisson regression with high dimensional data has been widely used in recent years. we proposed in this paper using a penalty function that depends on a function named a penalty. An Atan estimator was compared with Lasso and adaptive lasso. A simulation and application show that an Atan estimator has the advantage in the estimation of coefficient and variables selection.
In this paper, a new method of selection variables is presented to select some essential variables from large datasets. The new model is a modified version of the Elastic Net model. The modified Elastic Net variable selection model has been summarized in an algorithm. It is applied for Leukemia dataset that has 3051 variables (genes) and 72 samples. In reality, working with this kind of dataset is not accessible due to its large size. The modified model is compared to some standard variable selection methods. Perfect classification is achieved by applying the modified Elastic Net model because it has the best performance. All the calculations that have been done for this paper are in
In this research, we use fuzzy nonparametric methods based on some smoothing techniques, were applied to real data on the Iraqi stock market especially the data about Baghdad company for soft drinks for the year (2016) for the period (1/1/2016-31/12/2016) .A sample of (148) observations was obtained in order to construct a model of the relationship between the stock prices (Low, high, modal) and the traded value by comparing the results of the criterion (G.O.F.) for three techniques , we note that the lowest value for this criterion was for the K-Nearest Neighbor at Gaussian function .
In this paper, compared eight methods for generating the initial value and the impact of these methods to estimate the parameter of a autoregressive model, as was the use of three of the most popular methods to estimate the model and the most commonly used by researchers MLL method, Barg method and the least squares method and that using the method of simulation model first order autoregressive through the design of a number of simulation experiments and the different sizes of the samples.
In this paper, we will provide a proposed method to estimate missing values for the Explanatory variables for Non-Parametric Multiple Regression Model and compare it with the Imputation Arithmetic mean Method, The basis of the idea of this method was based on how to employ the causal relationship between the variables in finding an efficient estimate of the missing value, we rely on the use of the Kernel estimate by Nadaraya – Watson Estimator , and on Least Squared Cross Validation (LSCV) to estimate the Bandwidth, and we use the simulation study to compare between the two methods.
This research a study model of linear regression problem of autocorrelation of random error is spread when a normal distribution as used in linear regression analysis for relationship between variables and through this relationship can predict the value of a variable with the values of other variables, and was comparing methods (method of least squares, method of the average un-weighted, Thiel method and Laplace method) using the mean square error (MSE) boxes and simulation and the study included fore sizes of samples (15, 30, 60, 100). The results showed that the least-squares method is best, applying the fore methods of buckwheat production data and the cultivated area of the provinces of Iraq for years (2010), (2011), (2012),
... Show MoreChemical pollution is a very important issue that people suffer from and it often affects the nature of health of society and the future of the health of future generations. Consequently, it must be considered in order to discover suitable models and find descriptions to predict the performance of it in the forthcoming years. Chemical pollution data in Iraq take a great scope and manifold sources and kinds, which brands it as Big Data that need to be studied using novel statistical methods. The research object on using Proposed Nonparametric Procedure NP Method to develop an (OCMT) test procedure to estimate parameters of linear regression model with large size of data (Big Data) which comprises many indicators associated with chemi
... Show MoreThe last few years witnessed great and increasing use in the field of medical image analysis. These tools helped the Radiologists and Doctors to consult while making a particular diagnosis. In this study, we used the relationship between statistical measurements, computer vision, and medical images, along with a logistic regression model to extract breast cancer imaging features. These features were used to tell the difference between the shape of a mass (Fibroid vs. Fatty) by looking at the regions of interest (ROI) of the mass. The final fit of the logistic regression model showed that the most important variables that clearly affect breast cancer shape images are Skewness, Kurtosis, Center of mass, and Angle, with an AUCROC of
... Show More