Data mining is a data analysis process that uses software to find patterns or rules in large amounts of data, which are expected to provide knowledge to support decisions. However, missing values often lead to a loss of information. The purpose of this study is to improve the performance of data classification with missing values, precisely and accurately. Testing was carried out on the Car Evaluation dataset from the UCI Machine Learning Repository, using the RStudio and RapidMiner tools. The study analyses the tested parameters to measure algorithm performance under four test variations: C5.0, C4.5, and k-NN at a 0% missing rate; C5.0, C4.5, and k-NN at 5–50% missing rates; the C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI classifiers at 5–50% missing rates; and the C5.0 + CMI, C4.5 + CMI, and k-NN + CMI classifiers at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested combinations of imputation and classification algorithms. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers fast processing times compared with the other methods.
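A minimal sketch of the k-NN imputation (k-NNI) step described above. It assumes purely numeric features for simplicity (the Car Evaluation attributes are in fact categorical, where the mode of the neighbours would replace the mean); the data and choice of k are hypothetical.

```python
def knn_impute(rows, k=3):
    """Fill None entries in each row with the mean of that feature over
    the k nearest complete rows (Euclidean distance computed only on
    the features the incomplete row has observed)."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        obs = [i for i, v in enumerate(row) if v is not None]

        def dist(other):
            # distance restricted to the observed features of this row
            return sum((row[i] - other[i]) ** 2 for i in obs) ** 0.5

        neighbours = sorted(complete, key=dist)[:k]
        new = list(row)
        for i, v in enumerate(row):
            if v is None:
                new[i] = sum(n[i] for n in neighbours) / len(neighbours)
        filled.append(new)
    return filled

data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [5.0, 6.0, 7.0],
    [1.0, None, 3.0],   # row with a missing value to impute
]
# the two nearest complete rows are the first two, so the gap becomes
# the mean of 2.0 and 2.1
print(knn_impute(data, k=2)[-1])
```

The imputed rows would then be passed unchanged to the downstream classifier (C5.0, C4.5, or k-NN in the study).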
The Weibull distribution is one of the three Generalized Extreme Value (GEV) distribution types (the Type-III extreme value distribution), and it plays a crucial role in modeling extreme events in fields such as hydrology, finance, and the environmental sciences. Bayesian methods play a decisive role in estimating the parameters of the GEV distribution because they can incorporate prior knowledge and handle small sample sizes effectively. In this research, we compare several shrinkage Bayesian estimation methods based on the squared-error and linear exponential (LINEX) loss functions, using Monte Carlo simulation. The performance of these methods is assessed in terms of their accuracy and computational efficiency in estimati…
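The kind of comparison described above can be sketched as a small Monte Carlo experiment. Everything below is an illustrative assumption rather than the paper's actual setup: the scale of a Weibull distribution (shape known) is estimated by its MLE and by a simple shrinkage version, and their risks are compared under squared-error and LINEX loss.

```python
import math
import random

def se_loss(est, true):
    # squared-error loss
    return (est - true) ** 2

def linex_loss(est, true, a=0.5):
    # linear exponential (LINEX) loss: asymmetric in the sign of the error
    d = est - true
    return math.exp(a * d) - a * d - 1.0

def mle_scale(sample, shape):
    # MLE of the Weibull scale when the shape is known:
    # lambda_hat = (mean(x_i ** shape)) ** (1 / shape)
    return (sum(x ** shape for x in sample) / len(sample)) ** (1 / shape)

def shrink(est, prior_guess, weight=0.3):
    # simple shrinkage of the estimate toward a prior guess
    return weight * prior_guess + (1 - weight) * est

def mc_risk(estimator, loss, true_scale=2.0, shape=1.5, n=15, reps=2000, seed=1):
    """Monte Carlo estimate of the risk E[loss] of an estimator."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.weibullvariate(true_scale, shape) for _ in range(n)]
        total += loss(estimator(sample, shape), true_scale)
    return total / reps

est_plain = lambda s, k: mle_scale(s, k)
est_shrunk = lambda s, k: shrink(mle_scale(s, k), prior_guess=2.0)
print(mc_risk(est_plain, se_loss), mc_risk(est_shrunk, se_loss))
```

With a well-chosen prior guess, the shrinkage estimator attains lower simulated risk, which is the pattern such comparisons are designed to quantify.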
OpenStreetMap (OSM) is the most common example of an online volunteered mapping application. Most such platforms consist of open-source spatial data collected by non-expert volunteers using different data collection methods. The OSM project aims to provide a free digital map of the whole world. The heterogeneity of data collection methods makes the accuracy of OSM project databases unreliable, and they must be treated with caution in any engineering application. This study assesses the horizontal positional accuracy of three spatial data sources: the OSM road network database, a high-resolution Satellite Image (SI), and a high-resolution Aerial Photo (AP) of Baghdad city, with respect to an analogue formal road network dataset obtain…
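Horizontal positional accuracy in assessments of this kind is commonly summarised by the root-mean-square error of checkpoint displacements between the tested dataset and the reference dataset. A minimal sketch with hypothetical coordinates (the study's actual checkpoints and coordinate system are not given here):

```python
import math

def horizontal_rmse(tested, reference):
    """Horizontal RMSE between tested points and their reference
    counterparts: RMSE_r = sqrt(RMSE_x**2 + RMSE_y**2), where each
    point is an (easting, northing) pair."""
    n = len(tested)
    se_x = sum((tx - rx) ** 2 for (tx, _), (rx, _) in zip(tested, reference)) / n
    se_y = sum((ty - ry) ** 2 for (_, ty), (_, ry) in zip(tested, reference)) / n
    return math.sqrt(se_x + se_y)

# hypothetical OSM checkpoints vs. the formal reference dataset (metres)
osm = [(100.0, 200.0), (103.0, 204.0)]
ref = [(100.0, 203.0), (107.0, 204.0)]
print(horizontal_rmse(osm, ref))   # sqrt(12.5), about 3.54 m
```

The same computation, applied per data source (OSM, SI, AP), yields directly comparable accuracy figures.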
Well integrity is a vital feature that should be upheld throughout the lifespan of the well, and one of its constituents, the casing, must be capable of enduring all interior and exterior loads. Through its two basic essentials, casing design and casing set-depth selection, the casing is fundamental to a sound wellbore and plays an important role in well integrity. Casing set depths are determined from the fracture pressure and pore pressure in the well and can usually be obtained from well-specific information. Based on the analyses using the improved techniques in this study, the following proposition can be projected: the selection of the casing class and materials must be done correctly and accurately in accordance with…
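The set-depth selection from pore and fracture pressures mentioned above is classically done bottom-up: the casing seat is placed at the shallowest depth whose fracture gradient still tolerates the mud weight needed at total depth. The gradients, depth grid, and safety margin below are hypothetical illustration values, not data from the study.

```python
# Hypothetical pore- and fracture-pressure gradients in equivalent
# mud weight (ppg) at each depth (ft); real values come from well logs.
pore = {2000: 8.8, 4000: 9.2, 6000: 10.5, 8000: 12.0, 10000: 14.5}
frac = {2000: 12.0, 4000: 13.5, 6000: 15.0, 8000: 16.5, 10000: 18.0}

def casing_seat(pore, frac, td, margin=0.5):
    """Bottom-up selection sketch: return the shallowest depth whose
    fracture gradient (less a safety margin) still exceeds the mud
    weight required to balance pore pressure at total depth `td`."""
    required_mw = pore[td] + margin      # mud weight with a kick margin
    for d in sorted(pore):
        if d >= td:
            break
        if frac[d] - margin >= required_mw:
            return d                     # shallowest safe casing seat
    return None                          # no intermediate seat needed/possible

print(casing_seat(pore, frac, td=10000))   # -> 8000
```

Repeating the search upward from each seat produces the full casing programme for the well.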
Background: During acrylic resin processing, the mold must be separated from the surface of the gypsum to prevent liquid resin from penetrating the gypsum and water from the gypsum from seeping into the acrylic resin. For many years, tin foil was the most accepted separating medium, but because it is difficult to apply, tin-foil substitutes are used instead. In this study, olive oil is used as an alternative separating medium for the first time, so the aim of the study was to evaluate its effect as a separating medium on some physical properties (surface roughness, water sorption, and solubility) of acrylic resin denture bases, compared with those processed using tin foil and a tin-foil substitute (cold mold seal) separat…
The study seeks to determine the levels of the credit structure (the independent variable) based on its components (loans, disseminated credit, and other facilities), yielding eight patterns of bank credit structure, in order to assess the relationship between changes in the level of each pattern (increase or decrease) and its reflection in maximizing the value of the bank (the dependent variable, measured with the approximate equation of simple Tobin's Q). The aim is to identify the pattern that achieves the highest bank value and to exploit it in management, planning, and control by knowing the strengths and weaknesses of the historical distribution of the facilities. The sample of the…
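A widely used approximate form of simple Tobin's Q (the Chung and Pruitt approximation) is the market value of equity plus the book value of debt, divided by the book value of total assets. Whether the paper uses exactly this form is not stated in the abstract, so the sketch below, with hypothetical bank figures, is an assumption:

```python
def approx_tobins_q(market_equity, book_debt, book_assets):
    """Approximate simple Tobin's Q:
    (market value of equity + book value of debt) / book value of assets.
    A value above 1 suggests the market values the bank above the
    replacement cost of its assets."""
    return (market_equity + book_debt) / book_assets

# hypothetical bank figures, all in the same currency units
print(approx_tobins_q(market_equity=450.0, book_debt=300.0,
                      book_assets=600.0))   # -> 1.25
```

Computing Q for each period lets changes in each credit-structure pattern be related to changes in bank value, as the study sets out to do.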
The issue of penalized regression models has received considerable attention in variable selection, where it plays an essential role in dealing with high-dimensional data. The arctangent (Atan) penalty has recently been used as an efficient method for both estimation and variable selection. However, the Atan penalty is very sensitive to outliers in the response variable and to heavy-tailed error distributions, whereas least absolute deviation (LAD) is a good way to obtain robustness in regression estimation. The specific objective of this research is to propose a robust Atan estimator that combines these two ideas. Simulation experiments and real-data applications show that the proposed LAD-Atan estimator…
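The LAD-Atan idea combines a least-absolute-deviation fit term with the Atan penalty. The penalty parametrisation and the fitting method below (a naive one-dimensional grid search on a toy single-predictor model) are illustrative assumptions, not the paper's algorithm:

```python
import math

def atan_penalty(beta, lam=0.5, gamma=0.05):
    # Atan penalty (one common parametrisation; exact constants vary
    # across the literature): nearly L0-like for small gamma
    return lam * (gamma + 2 / math.pi) * math.atan(abs(beta) / gamma)

def lad_atan_objective(beta, xs, ys, lam=0.5, gamma=0.05):
    # robust LAD fit term plus the Atan penalty
    fit = sum(abs(y - beta * x) for x, y in zip(xs, ys))
    return fit + atan_penalty(beta, lam, gamma)

# toy single-predictor data; the last response is a gross outlier,
# the situation where LAD protects the estimate
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 40.0]

# naive grid search over beta in [0, 10] (illustration only; real
# implementations use specialised optimisation)
grid = [i / 100 for i in range(0, 1001)]
beta_hat = min(grid, key=lambda b: lad_atan_objective(b, xs, ys))
print(beta_hat)   # close to the slope of the clean points, near 2
```

A squared-error fit term on the same data would be dragged far from 2 by the outlier, which is exactly the sensitivity the LAD term removes.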
Compressional-wave (Vp) data are useful for reservoir exploration, drilling operations, stimulation, hydraulic-fracturing design, and development planning for a specific reservoir. Because of the different nature and behavior of the influencing parameters, Vp modeling involves considerable nonlinearity. In this study, a statistical relationship between compressional-wave velocity and petrophysical parameters was developed from wireline log data for the Jeribe Formation in the Fauqi oil field, southeast Iraq, using single and multiple linear regressions. The model concentrates on predicting compressional-wave velocity from petrophysical parameters and any pair of shear-wave velocity, porosity, density, a…
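The single-regression case can be sketched as an ordinary least-squares fit of Vp against one petrophysical predictor, here porosity; the log readings are hypothetical, and the multiple-regression case extends the same normal-equation idea to several predictors.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x (single predictor):
    b = S_xy / S_xx, a = mean(y) - b * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# hypothetical wireline log readings: porosity (fraction) vs Vp (km/s)
phi = [0.05, 0.10, 0.15, 0.20, 0.25]
vp  = [4.8, 4.4, 4.1, 3.7, 3.3]
a, b = fit_line(phi, vp)
print(round(a, 3), round(b, 3))   # negative slope: Vp falls as porosity rises
```

The negative slope reproduces the expected physical trend; the study's actual coefficients come from the Jeribe Formation log data.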