This paper presents a hybrid approach to the null values problem that combines rough set theory with an intelligent swarm algorithm. The proposed approach is a supervised learning model. A large set of complete data, called the learning data, is used to find the decision rule sets, which are then used to solve the incomplete data problem. The intelligent swarm algorithm is used for feature selection: the bees algorithm serves as the heuristic search algorithm, with rough set theory as the evaluation function. Another feature selection algorithm, ID3, is also presented; it works as a statistical algorithm rather than an intelligent one. The two approaches are compared on their performance in null value estimation when working with rough set theory. The results obtained from most code sets show that the bees algorithm is better than ID3 at decreasing the number of extracted rules without affecting accuracy, and at increasing the accuracy ratio of null value estimation, especially as the number of null values increases.
This paper introduces and studies new approximation operators based on a finite family of d.g.'s, which are the core concept of this paper. In addition, we study generalizations of some of Pawlak's concepts and generalize the definition of the accuracy measure of approximations by using a finite family of d.g.'s.
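For orientation, the classical Pawlak notions that this abstract generalizes can be sketched in a few lines. This is a minimal illustration of the standard lower approximation, upper approximation, and accuracy measure over a single equivalence relation, not the paper's d.g.-based operators; the function names, the toy universe, and the `key` function are assumptions made for the example.

```python
from collections import defaultdict


def equivalence_classes(universe, key):
    """Group objects that are indiscernible under `key` (hypothetical helper)."""
    classes = defaultdict(set)
    for obj in universe:
        classes[key(obj)].add(obj)
    return list(classes.values())


def approximations(universe, target, key):
    """Return the classical Pawlak (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(universe, key):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


def accuracy(lower, upper):
    """Pawlak accuracy measure |lower| / |upper|; taken as 1.0 for an empty upper set."""
    return len(lower) / len(upper) if upper else 1.0


# Toy universe (assumed data): blocks are {1,2}, {3,4}, {5,6}, {7,8}.
universe = set(range(1, 9))
key = lambda x: (x - 1) // 2
target = {1, 2, 3, 7}

lower, upper = approximations(universe, target, key)
acc = accuracy(lower, upper)
```

Here only the block {1, 2} is fully contained in the target, while three blocks overlap it, so the accuracy is 2/6. An accuracy of 1 would mean the target set is exactly definable from the blocks.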
This research aims to estimate stock returns using the Rough Set Theory approach, to test its effectiveness and accuracy in predicting stock returns and its potential in the field of financial markets, and to rationalize investor decisions. The research sample totals 10 companies traded on the Iraq Stock Exchange. The results showed that Rough Set Theory performs remarkably well in data reduction, contributing to the rationalization of investment decisions. The most prominent conclusions are that rough set theory is capable of dealing with financial data and can be applied to forecasting stock returns. The research provides those interested in investing stocks in financial
Breast cancer is one of the most common causes of death among women worldwide. Limited awareness of the seriousness of this disease, a shortage of specialists in hospitals, and long waits for a diagnosis may increase the probability that the number of cases grows. Consequently, various machine learning techniques have been formulated to decrease the time taken to reach a diagnostic decision for breast cancer, which might lower the mortality rate. The proposed system consists of two phases. Firstly, data mining pre-processing steps (data cleaning, selection) are applied to the breast cancer dataset taken from the University of California, Irvine machine learning repository; in this stage we
The main focus of this article is to introduce the notion of rough pentapartitioned neutrosophic set and rough pentapartitioned neutrosophic topology by using rough pentapartitioned neutrosophic lower approximation, rough pentapartitioned neutrosophic upper approximation, and rough pentapartitioned neutrosophic boundary region. Then, we provide some basic properties, namely operations on rough pentapartitioned neutrosophic set and rough pentapartitioned neutrosophic topology. By defining rough pentapartitioned neutrosophic set and topology, we formulate some results in the form of theorems, propositions, etc. Further, we give some examples to justify the definitions introduced in this article.
Most medical datasets suffer from missing data, owing to the expense of some tests or to human error in recording them. This issue affects the performance of machine learning models because the values of some features will be missing. Therefore, dedicated methods are needed for imputing these missing data. In this research, the salp swarm algorithm (SSA) is used for generating and imputing the missing values in the Pima Indian diabetes disease (PIDD) dataset; the proposed algorithm is called ISSA. The obtained results showed that the classification performance of three different classifiers, namely support vector machine (SVM), K-nearest neighbour (KNN), and Naïve B
The emphasis of Master Production Scheduling (MPS), or tactical planning, is on the temporal and spatial disaggregation of the cumulative planning targets and forecasts, along with the provision and forecast of the required resources. This procedure eventually becomes considerably difficult and slow as the number of resources, products and periods considered increases. A number of studies have been carried out to understand these impediments and formulate algorithms to optimise the production planning problem, or more specifically the master production scheduling (MPS) problem. These algorithms include an Evolutionary Algorithm called Genetic Algorithm, a Swarm Intelligence methodology called Gravitational Search Algorithm (GSA), Bat Algorithm (BAT), T
Metaheuristics in the swarm intelligence (SI) class have proven to be efficient and have become popular methods for solving different optimization problems. Based on their usage of memory, metaheuristics can be classified into algorithms with memory and algorithms without memory (memory-less). The absence of memory in some metaheuristics leads to the loss of the information gained in previous iterations. Such metaheuristics tend to drift away from promising areas of the solution search space, which leads to non-optimal solutions. This paper aims to review memory usage and its effect on the performance of the main SI-based metaheuristics. Investigation has been performed on SI metaheuristics, memory usage and memory-less metaheuristics, memory char
Frequent data in weather records are essential for forecasting, numerical model development, and research, but interruptions in data recording may occur for various reasons. This study therefore aims to find a way to treat these missing data and to assess the accuracy of the treatment by comparing the treated values with the original data values. The mean method was used to treat daily and monthly missing temperature data. The results show that, when treating the monthly temperature data for the stations (Baghdad, Hilla, Basra, Nasiriya, and Samawa) in Iraq over the whole period (1980-2020), the percentage of matching between the original and the treated values did not exceed 80%. So, the period was divided into four periods. It was noted that most of the congruence values increased, re
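The mean method and the comparison against original values can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define how the matching percentage is computed, so the tolerance-based definition below is hypothetical, as are the function names and the synthetic temperature values.

```python
def mean_impute(values):
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def match_percentage(original, treated, masked_idx, tolerance):
    """Share (in %) of masked positions whose treated value lies within
    `tolerance` of the original value. Assumed definition, not the study's."""
    hits = sum(1 for i in masked_idx if abs(original[i] - treated[i]) <= tolerance)
    return 100.0 * hits / len(masked_idx)


# Synthetic monthly temperatures (assumed data, degrees Celsius).
original = [10.0, 12.0, 14.0, 16.0, 18.0]
masked_idx = [1, 4]  # positions deliberately hidden to test the method

with_gaps = [None if i in masked_idx else v for i, v in enumerate(original)]
treated = mean_impute(with_gaps)
match = match_percentage(original, treated, masked_idx, tolerance=2.0)
```

With these values the observed mean is 40/3, so the gap at index 1 is filled within the 2.0 degree tolerance and the gap at index 4 is not. Computing the mean over a shorter sub-period changes which values are averaged, which is one way splitting the record could affect the match.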
The idea of carrying out research on incomplete data came from the circumstances of our dear country and the horrors of war, which resulted in the loss of many important data in all aspects of economic, natural, health, and scientific life, etc. The reasons for missing data vary: some are outside the control of those concerned, and others are deliberate, planned because of cost or risk or because of a lack of means for inspection. The missing data in this study were processed using Principal Component Analysis and self-organizing map methods using simulation. The variables of child health and variables affecting children's health were taken into account: breastfeed
In this paper, an ARIMA model was used to estimate missing data (air temperature, relative humidity, wind speed) for mean monthly variables in different time series at three stations (Sinjar, Baghdad, AL.Hai), which represent different parts of Iraq from north to south, respectively.