Frequently recorded weather data are essential for forecasting, numerical model development, and research, but interruptions in data recording may occur for various reasons. This study therefore aims to find a way to treat such missing data and to assess the accuracy of the treated values by comparing them with the original data values. The mean method was used to treat missing daily and monthly temperature data. The results show that, when the monthly temperature data for the stations (Baghdad, Hilla, Basra, Nasiriya, and Samawa) in Iraq were treated over the whole period (1980-2020), the percentage of matching between the original and the treated values did not exceed 80%. The period was therefore divided into four sub-periods. It was noted that most of the congruence values increased, re...
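The mean treatment and the matching percentage described in the abstract above can be illustrated with a minimal sketch. The abstract does not state how a "match" is defined, so the tolerance-based definition below is an assumption, as are the function names and the temperature values, which are hypothetical and not taken from the study.

```python
from statistics import mean

def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def match_percentage(original, imputed, missing_idx, tolerance):
    """Share (in %) of imputed values within `tolerance` of the true values.

    Assumption: a value "matches" when the absolute error is at most
    `tolerance`; the study may define matching differently.
    """
    hits = sum(1 for i in missing_idx
               if abs(original[i] - imputed[i]) <= tolerance)
    return 100.0 * hits / len(missing_idx)

# Hypothetical monthly mean temperatures in deg C (not data from the study).
original = [10.0, 12.0, 17.0, 23.0, 29.0, 33.0,
            35.0, 34.0, 31.0, 25.0, 17.0, 12.0]
missing_idx = [3, 9]  # months deliberately hidden to test the method
masked = [None if i in missing_idx else v for i, v in enumerate(original)]

imputed = mean_impute(masked)
score = match_percentage(original, imputed, missing_idx, tolerance=2.0)
print(f"fill value: {imputed[3]:.1f}, match: {score:.0f}%")
```

Hiding known values and re-estimating them is one simple way to compare treated values against originals; applying the same routine separately to each sub-period would mirror the division into four periods mentioned in the abstract.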
The current study aims to compare the estimates of the Rasch model's parameters for missing and completed data under various ways of processing the missing data. To achieve this aim, the researcher followed these steps: preparing the Philip Carter test of spatial ability, which consists of 20 items, and administering it to a group of 250 sixth scientific stage students in the directorates of Baghdad Education at Al-Rusafa (1st, 2nd and 3rd) for the academic year 2018-2019. Then, the researcher relied on a single-parameter model to analyze the data. The researcher used the Bilog-MG3 software to check the hypotheses and the data and to match them with the model. In addition...
Most medical datasets suffer from missing data, due to the expense of some tests or human error while recording the results. This issue affects the performance of machine learning models because the values of some features will be missing. Therefore, there is a need for specific methods for imputing these missing data. In this research, the salp swarm algorithm (SSA) is used for generating and imputing the missing values in the Pima Indian diabetes disease (PIDD) dataset; the proposed algorithm is called ISSA. The obtained results showed that the classification performance of three different classifiers, which are support vector machine (SVM), K-nearest neighbour (KNN), and Naïve B...
Heart sound is recorded as an electrical signal, and several factors during the recording process add unwanted information to the signal. Recently, many studies have addressed noise removal and signal recovery problems. The first step in signal processing is noise removal, and many filters have been proposed and used for this problem. Here, a Hankel matrix is constructed from a given signal, and the signal is cleaned by suppressing the unwanted information in the Hankel matrix. The first step is detecting unwanted information by defining a binary operator. This operator is defined with respect to some threshold. The unwanted information is replaced by zero, and the wanted information is kept in the estimated matrix. The resulting matrix...
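The Hankel-matrix idea in the abstract above can be sketched as follows. The abstract does not say what the binary operator acts on, so this sketch assumes it thresholds the magnitude of each matrix entry; the signal is recovered by averaging anti-diagonals, which is also an assumption. All names and sample values are hypothetical.

```python
def hankel(signal, rows):
    """Build a Hankel matrix with H[i][j] = signal[i + j]."""
    cols = len(signal) - rows + 1
    return [[signal[i + j] for j in range(cols)] for i in range(rows)]

def binary_operator(matrix, threshold):
    """1 where |entry| >= threshold (kept), 0 otherwise (unwanted).

    Assumption: the operator is applied entrywise to magnitudes.
    """
    return [[1 if abs(x) >= threshold else 0 for x in row] for row in matrix]

def apply_operator(matrix, mask):
    """Replace unwanted entries by zero and keep the rest."""
    return [[x * m for x, m in zip(row, mask_row)]
            for row, mask_row in zip(matrix, mask)]

def anti_diagonal_average(matrix):
    """Map a matrix back to a 1-D signal by averaging each anti-diagonal."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical samples: two large peaks plus small disturbances.
signal = [0.0, 0.1, 1.0, -0.05, -1.2, 0.02]
H = hankel(signal, rows=3)
mask = binary_operator(H, threshold=0.5)
estimated = apply_operator(H, mask)
recovered = anti_diagonal_average(estimated)
print(recovered)
```

With an entrywise operator the result is equivalent to thresholding the samples directly; the matrix form becomes more useful when the operator is applied to a decomposition of the Hankel matrix, which the truncated abstract neither confirms nor rules out.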
Missing data is one of the problems that may occur in regression models. This problem is usually handled by the deletion mechanism available in statistical software. This method weakens statistical inference because deletion reduces the sample size. In this paper, the Expectation Maximization algorithm (EM), the Multicycle-Expectation-Conditional Maximization algorithm (MC-ECM), Expectation-Conditional Maximization Either (ECME), and Recurrent Neural Networks (RNN) are used to estimate multiple regression models when explanatory variables have some missing values. Experimental datasets were generated using the Visual Basic programming language, with missing values of explanatory variables according to a missing-at-random mechanism with a general pattern and s...
... Show MoreIn this paper, we will provide a proposed method to estimate missing values for the Explanatory variables for Non-Parametric Multiple Regression Model and compare it with the Imputation Arithmetic mean Method, The basis of the idea of this method was based on how to employ the causal relationship between the variables in finding an efficient estimate of the missing value, we rely on the use of the Kernel estimate by Nadaraya – Watson Estimator , and on Least Squared Cross Validation (LSCV) to estimate the Bandwidth, and we use the simulation study to compare between the two methods.
This paper presents a hybrid approach for solving the null values problem; it hybridizes rough set theory with an intelligent swarm algorithm. The proposed approach is a supervised learning model. A large set of complete data, called learning data, is used to find the decision rule sets that are then used to solve the incomplete data problem. The intelligent swarm algorithm is used for feature selection: the bees algorithm serves as a heuristic search algorithm, combined with rough set theory as the evaluation function. Another feature selection algorithm, ID3, is also presented; it works as a statistical algorithm instead of an intelligent algorithm. A comparison between these two approaches is made in terms of their performance for null values estima...
The research aims to estimate missing values using Coons' analysis of covariance method for the response (dependent) variable, which represents the main characteristic studied, in a type of multi-factor experimental design called the split-block design (SBED), so as to increase the accuracy of the analysis results and of the statistical tests based on this type of design. The theoretical part covers the split-block design, its statistical analysis, and the analysis of variance for the (SBED) experiment, together with the use of Coons' covariance analysis according to two methods to estimate the missing value. In the practical part, a field experiment on a wheat crop was carried out in...
In this paper, an ARIMA model was used to estimate the missing data (air temperature, relative humidity, wind speed) for mean monthly variables in different time series at three stations (Sinjar, Baghdad, Al-Hai), which represent different parts of Iraq from north to south, respectively.
In this study, we compared the LASSO and SCAD methods, two penalty methods for partial quantile regression models. The Nadaraya-Watson kernel was used to estimate the non-parametric part; in addition, the rule-of-thumb method was used to estimate the smoothing bandwidth (h). The penalty methods proved to be efficient in estimating the regression coefficients, but SCAD was the best according to the mean squared error (MSE) criterion after estimating the missing data using the mean imputation method.
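The difference between the two penalties compared in the abstract above can be seen by evaluating them directly. The sketch below uses the standard piecewise form of the SCAD penalty with the commonly used default a = 3.7; the abstract does not state which tuning values the study used, so `lam` and `a` here are illustrative assumptions.

```python
def lasso_penalty(beta, lam):
    """LASSO (L1) penalty: grows linearly with |beta| without bound."""
    return lam * abs(beta)

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: L1 near zero, quadratic blend, then constant.

    Assumption: a = 3.7 is a conventional default, not a value
    reported in the abstract.
    """
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0

lam = 1.0
for beta in (0.5, 2.0, 10.0):
    print(beta, lasso_penalty(beta, lam), round(scad_penalty(beta, lam), 4))
```

For small coefficients both penalties agree, while for large coefficients the SCAD penalty levels off, so large effects are shrunk less than under LASSO. That property is one commonly cited reason SCAD can give a lower MSE, although the abstract itself reports only the outcome.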