Classification of imbalanced data is an important issue. Many algorithms have been developed for classification, such as Back Propagation (BP) neural networks, decision trees, and Bayesian networks, and have been used repeatedly in many fields. These algorithms suffer from the problem of imbalanced data, in which some classes contain many more instances than others. Imbalanced data result in poor performance and bias toward one class at the expense of the others. In this paper, we propose three techniques based on over-sampling (O.S.) for processing an imbalanced dataset, redistributing it, and converting it into a balanced dataset. These techniques are Improved Synthetic Minority Over-Sampling Technique (Improved SMOTE), Borderline-SMOTE + Imbalanced Ratio (IR), and Adaptive Synthetic Sampling (ADASYN) + IR. Each technique generates synthetic samples for the minority class to achieve balance between the minority and majority classes and then calculates the IR between the minority and majority classes. Experimental results show that the Improved SMOTE algorithm outperforms the Borderline-SMOTE + IR and ADASYN + IR algorithms because it achieves a higher degree of balance between the minority and majority classes.
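The over-sampling idea in the abstract above can be illustrated with a minimal sketch of basic SMOTE followed by an imbalance ratio check. This is not the authors' Improved SMOTE, whose details the abstract does not give. The function names, the toy data, the choice of k, and the definition of IR as majority count divided by minority count are all assumptions made for illustration.

```python
import math
import random


def imbalance_ratio(n_majority, n_minority):
    """Assumed definition: IR = majority count / minority count (1.0 is balanced)."""
    return n_majority / n_minority


def smote(minority, n_new, k=3, seed=0):
    """Basic SMOTE: create n_new synthetic points, each interpolated between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Hypothetical toy data: 20 majority points and 4 minority points in 2-D.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(0.0, 10.0), (1.0, 11.0), (2.0, 10.5), (1.5, 12.0)]

print("IR before:", imbalance_ratio(len(majority), len(minority)))
balanced_minority = minority + smote(minority, len(majority) - len(minority))
print("IR after:", imbalance_ratio(len(majority), len(balanced_minority)))
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. Borderline-SMOTE and ADASYN differ mainly in which minority samples they choose as the base points.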
The aim of this paper is to construct the (k,r)-caps in the projective 3-space PG(3,p) over Galois field GF(4). We found that the maximum complete (k,2)-cap, which is called an ovaloid, exists in PG(3,4) when k = 13. Moreover the maximum (k,3)-caps, (k,4)-caps and (k,5)-caps.
A mathematical model was proposed to study the microkinetics of the esterification reaction of oleic acid with ethanol over a prepared HY zeolite catalyst. The catalyst was prepared from an Iraqi kaolin source, and its properties were characterized by different techniques. The esterification was carried out at different temperatures (40 to 70 °C) with a 6:1 molar ratio of ethanol to oleic acid and 5 % catalyst loading.
The microkinetics study was carried out over two time periods, each examined individually, to calculate the reaction rate constant and activation energy. The impact of mass transfer resistance on the reactant was also investigated; two different studies were carried out for this purpose.
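One common way to obtain an activation energy from rate constants at two temperatures is the two-point Arrhenius relation, sketched below. The abstract does not state how the authors computed it, so this is only an illustration. The function names and the numerical values of the activation energy and pre-exponential factor are hypothetical; only the 40 and 70 °C temperature limits come from the text.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def activation_energy(k1, T1, k2, T2):
    """Two-point Arrhenius estimate: Ea = R * ln(k2/k1) / (1/T1 - 1/T2).
    Temperatures are in kelvin; the result is in J/mol."""
    return R * math.log(k2 / k1) / (1.0 / T1 - 1.0 / T2)


def pre_exponential(k, T, Ea):
    """Recover A from k = A * exp(-Ea / (R * T))."""
    return k * math.exp(Ea / (R * T))


# Hypothetical values, used only to generate self-consistent rate constants.
Ea_true = 50_000.0  # J/mol (assumed, not from the paper)
A_true = 1.0e6      # same units as k (assumed, not from the paper)
T1, T2 = 40 + 273.15, 70 + 273.15  # temperature limits from the text, in K

k1 = A_true * math.exp(-Ea_true / (R * T1))
k2 = A_true * math.exp(-Ea_true / (R * T2))

Ea_est = activation_energy(k1, T1, k2, T2)
A_est = pre_exponential(k1, T1, Ea_est)
print(f"Ea = {Ea_est / 1000:.2f} kJ/mol, A = {A_est:.3e}")
```

With more than two temperatures, a linear fit of ln k against 1/T is the usual alternative, and the slope of that line equals -Ea/R.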
To describe changes in attitudes and expectations of labor over the previous six decades, comparing the Iraqi generation who labored at home without medical assistance with their descendants.
We used semi‐structured telephone interviews with 22 women across three generations of one extended family living and giving birth in Iraq between the 1950s and the 2010s. Qualitative data were analyzed thematically using open, axial, and selective coding.
Each generation experienced a paradigm shift in childbirth, from exclus
Increasing demand for environmentally friendly products is becoming a driving force for designing highly active catalysts. Thus, surfaces that efficiently catalyse nitrogen reduction reactions are highly sought after for moderating air-pollutant emissions. This contribution aims to computationally investigate the hydrodenitrogenation (HDN) networks of pyridine over the γ-Mo2N(111) surface using a density functional theory (DFT) approach. Various adsorption configurations have been considered for the molecularly adsorbed pyridine. Findings indicate that pyridine can be adsorbed via side-on and end-on modes in six geometries, of which one adsorption site is found to have the lowest adsorption energy (–45.3 kcal/mol). Over a nitr
The logistic regression model is regarded as one of the important regression models and has been among the most interesting subjects in recent studies because it has taken on a more advanced role in statistical analysis.
The ordinary estimation methods fail when dealing with data that contain outliers, which have an undesirable effect on the results.
In order to obtain a mixed model with high significance and accurate prediction, it is necessary to search for a method that selects the most important variables to be included in the model, especially when the data under study suffer from multicollinearity as well as high dimensionality. The research aims to compare, using simulation, some methods for choosing the explanatory variables and estimating the parameters of the regression model, namely Bayesian Ridge Regression (unbiased) and the adaptive Lasso regression model. MSE was used to compare the methods.
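For orientation, the standard textbook forms of the two penalized estimators and of a simulation MSE criterion can be written as below. These are not necessarily the exact formulations used by the authors; in particular, the "unbiased" Bayesian ridge variant named in the abstract is not specified there. The symbols λ (tuning parameter), γ > 0 (weight exponent), β̃ (an initial estimate such as least squares or ridge), and R (number of simulation replications) are assumptions introduced here.

```latex
\begin{align}
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2
  + \lambda \sum_{j=1}^{p} \beta_j^2, \\
\hat{\beta}^{\text{alasso}} &= \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2
  + \lambda \sum_{j=1}^{p} w_j \lvert \beta_j \rvert,
  \qquad w_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}, \\
\operatorname{MSE}(\hat{\beta}) &= \frac{1}{R} \sum_{r=1}^{R}
  \bigl\lVert \hat{\beta}^{(r)} - \beta \bigr\rVert_2^2 .
\end{align}
```

In the Bayesian reading, the ridge estimate coincides with the posterior mean of β under a zero-mean Gaussian prior with variance τ², with λ = σ²/τ². The adaptive weights w_j penalize coefficients with small initial estimates more heavily, which is what lets the adaptive Lasso perform variable selection.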
Abstract
The problem of missing data is a major obstacle for researchers in data analysis across different fields, since it recurs in all fields of study, including social, medical, astronomical and clinical experiments.
The presence of such a problem in the data under study may negatively influence the analysis and lead to misleading conclusions, since these conclusions carry a large bias caused by the problem. Despite their efficiency, wavelet methods are also affected by missing data, in addition to the impact of the problem of loss of estimation accuracy
The problem of estimation has attracted great interest in engineering and statistical applications and in various applied and human sciences, and the methods it provides have helped to identify many random processes accurately.
In this paper, two methods (the Moment Method and the Maximum Likelihood Method) were used to estimate the reliability function, the risk function, and the distribution parameters. An experimental study was conducted using simulation to compare the methods and show which of them is more effective in practical application. This is based on observations generated from the Rayleigh logarithmic distribution (RL) with sample sizes
Many of the dynamic processes in different sciences are described by models of differential equations. These models explain the change in the behavior of the studied process over time by linking the behavior of the process with its derivatives. They often contain constant and time-varying parameters that vary according to the nature of the process under study. We estimate the constant and time-varying parameters sequentially in several stages. In the first stage, the state variables and their derivatives are estimated by the method of penalized splines (P-splines). In the second stage, we use pseudo least squares to estimate the constant parameters. For the third stage, the rem
This research reviews the least absolute deviations method, based on linear programming, for estimating the parameters of the simple linear regression model, and gives an overview of this model. We model the proposed absolute deviations method using a measure of dispersion and construct a simple linear regression model based on the proposed measure. The objective of the work is to obtain estimates that are not affected by outliers, using a numerical method with the fewest possible iterations.
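A minimal sketch of least absolute deviations (LAD) fitting for simple linear regression is given below, with a least squares fit for comparison. It does not call a linear programming solver and is not the authors' proposed method. Instead, it relies on the fact that the LAD linear program has an optimal vertex at which the fitted line passes through at least two data points, and it simply enumerates those candidate lines. The function names and the toy data (five points on y = 1 + 2x plus one outlier) are hypothetical.

```python
from itertools import combinations


def lad_fit(xs, ys):
    """LAD fit by enumerating lines through pairs of data points.
    Returns (intercept, slope, total absolute deviation)."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2], best[0]


def ols_fit(xs, ys):
    """Ordinary least squares fit, for comparison. Returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: y = 1 + 2x with the last observation replaced by an outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 40.0]

lad_intercept, lad_slope, lad_loss = lad_fit(xs, ys)
ols_intercept, ols_slope = ols_fit(xs, ys)
print("LAD:", lad_intercept, lad_slope)
print("OLS:", ols_intercept, ols_slope)
```

The enumeration costs O(n³) and is practical only for small samples; a linear programming formulation reaches the same optimum far more efficiently. On this data the LAD line ignores the outlier, while the least squares slope is pulled strongly toward it.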