This paper presents a new variable selection method for selecting essential variables from large datasets. The new model is a modified version of the Elastic Net model, and it is summarized in an algorithm. It is applied to the Leukemia dataset, which has 3051 variables (genes) and 72 samples. In practice, working with this kind of dataset is difficult because of its large size. The modified model is compared with several standard variable selection methods. The modified Elastic Net model achieves perfect classification, the best performance among the methods compared. All the calculations that have been done for this paper are in
Chemical pollution is a serious problem that affects public health and the health of future generations. Consequently, it must be studied in order to find suitable models and descriptions for predicting its behavior in the coming years. Chemical pollution data in Iraq are extensive and come from many sources and types, which makes them Big Data that need to be studied using novel statistical methods. The research focuses on using the proposed nonparametric procedure (NP method) to develop an OCMT test procedure for estimating the parameters of a linear regression model with large data (Big Data), which comprises many indicators associated with chemi
The paper generates a geological model of a giant Middle East oil reservoir; the model is constructed from the field data of 161 wells. The main aim of the paper was to assess the value of the reservoir and to investigate the feasibility of reservoir modeling prior to the final investment decision on further development of this oilfield. Well logs, deviation surveys, 2D/3D interpreted seismic structural maps, facies, and core tests were used to construct the geological model, based on comprehensive interpretation and correlation processes using the PETREL platform. The geological model mainly aims to estimate the stock-tank oil initially in place in the reservoir. In addition, three scenarios were applie
A seemingly unrelated regression (SUR) model is a special case of multivariate models, in which the error terms of the equations are contemporaneously correlated. The generalized least squares (GLS) estimator is efficient because it takes into account the covariance structure of the errors, but it is also very sensitive to outliers. A robust SUR estimator can deal with outliers. We propose two robust methods for calculating the estimator: S-estimation and FastSUR. We find that they significantly improve the quality of SUR model estimates. In addition, the results show that the FastSUR method is superior to the S method in dealing with outliers in the data set, as it has lower (MSE and RMSE) and higher (R-Squared and R-Square Adjus
The estimation of the regular regression model requires several assumptions to be satisfied, such as linearity. One problem occurs when the regression curve is partitioned into two (or more) parts that are then joined at threshold point(s). This situation is regarded as a violation of the linearity of regression. Therefore, the multiphase regression model has received increasing attention as an alternative approach that describes changes in the behavior of the phenomenon through threshold point estimation. The maximum likelihood estimator (MLE) has been used for both model and threshold point estimation. However, the MLE is not robust to violations such as the presence of outliers or a heavy-tailed error distribution. The main goal of t
Multiple linear regression is concerned with studying and analyzing the relationship between a dependent variable and a set of explanatory variables. From this relationship, the values of variables are predicted. In this paper, the multiple linear regression model with three covariates was studied in the presence of autocorrelated errors when the random error follows an exponential distribution. Three methods were compared (generalized least squares, M robust, and Laplace robust method). We employed simulation studies and calculated the mean squared error as the statistical criterion, with sample sizes (15, 30, 60, 100). Further, we applied the best method to real experimental data representing the varieties of
To decrease the dependency of high-octane gasoline production on catalytic processes in petroleum refineries and to increase the gasoline pool, the effect of adding a suggested composite octane number blending enhancer to motor gasoline was investigated. The enhancer is composed of a mixture of oxygenated materials (ethanol and ether) and aromatic materials (toluene and xylene), and the investigation used a design of experiments made with the Minitab 15 statistical software. The original gasoline, before addition of the octane number blending enhancer, has a research octane number (RON) of 79. The design of experiments, which studies the optimum volumetric percentages of the four variables (ethanol, toluene, ether, and xylene), leads
In this research, x-ray diffraction results were used to determine the uniform stress deformation and microstructure parameters of CuO nanoparticles, namely the lattice strain and the crystallite size, and to compare the results obtained for the same powder by two models, Halder-Wagner and Size-Strain Plot (SSP). The Halder-Wagner model gave a crystallite size of 19.81 nm and a lattice strain of 0.004065, while the SSP method gave a crystallite size of 17.20 nm and a lattice strain of 0.000305. The sa
The current paper proposes a new estimator for the parameters of the linear regression model under Big Data circumstances. The diversity of Big Data variables brings many challenges that are of interest to researchers trying to find novel methods for estimating the parameters of the linear regression model. The data were collected by the Central Statistical Organization, Iraq, and child labor in Iraq was chosen as the subject of the data. Child labor is a vital phenomenon from which both society and education suffer, and it affects the future of the next generation. Two methods have been selected to estimate the parameter
This paper studies the effect of cutting parameters (spindle speed and feed rate) on delamination during the drilling of glass-polyester composites. Drilling was performed on a CNC machine with a 10 mm diameter high-speed steel (HSS) drill bit. The Taguchi technique with an L16 orthogonal array was used to analyze the parameters affecting the delamination factor. The optimal experiment was no. 13, with a spindle speed of 1273 rpm and a feed of 0.05 mm/rev, giving the minimum delamination factor of 1.28.