Data mining is a data analysis process that uses software to find patterns or rules in large amounts of data, with the aim of providing knowledge to support decisions. However, missing values in the data often lead to a loss of information. The purpose of this study is to improve the performance of classification on data with missing values. Testing was carried out on the Car Evaluation dataset from the UCI Machine Learning Repository, using RStudio and RapidMiner to run the algorithms. The study analyzes the tested parameters to measure algorithm performance across four test variations: (1) C5.0, C4.5, and k-NN at a 0% missing rate; (2) C5.0, C4.5, and k-NN at 5–50% missing rates; (3) C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI at 5–50% missing rates; and (4) C5.0 + CMI, C4.5 + CMI, and k-NN + CMI at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested imputation and classification algorithms. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers fast processing times compared with the other methods.
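As a rough illustration of the k-nearest-neighbour imputation (k-NNI) step mentioned above, the following is a minimal sketch for categorical data such as the Car Evaluation attributes. It is not the study's implementation (which used RStudio and RapidMiner). The names `MISSING`, `hamming`, and `knn_impute`, the mismatch-count distance, and the majority vote among neighbours are all assumptions made for this example.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def hamming(a, b):
    """Count mismatches over attributes observed in both rows.

    Assumption of this sketch: attributes missing in either row are skipped,
    so rows with many gaps can look artificially close.
    """
    return sum(
        1
        for x, y in zip(a, b)
        if x is not MISSING and y is not MISSING and x != y
    )


def knn_impute(rows, k=3):
    """Return a copy of rows with each gap filled by the most common value
    of that attribute among the k nearest rows that have it observed."""
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Candidate donors: other rows where attribute j is observed.
            donors = [
                r for n, r in enumerate(rows)
                if n != i and r[j] is not MISSING
            ]
            if not donors:
                continue  # nothing to impute from; leave the gap
            # Stable sort keeps the original row order among ties.
            donors.sort(key=lambda r: hamming(row, r))
            votes = Counter(r[j] for r in donors[:k])
            imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


if __name__ == "__main__":
    # Toy rows with made-up categorical values, not taken from the dataset.
    sample = [
        ("low", "high", "big"),
        ("low", "high", "big"),
        ("low", "high", MISSING),
        ("vhigh", "low", "small"),
        ("vhigh", "low", "small"),
    ]
    for filled in knn_impute(sample, k=2):
        print(filled)
```

The imputed table can then be passed to any classifier; the choice of `k` and of the distance function would need to be tuned for real data.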
The Zubair Formation is one of the richest petroleum systems in Southern Iraq. This formation is composed mainly of sandstones interbedded with shale sequences, with minor streaks of limestone and siltstone. Borehole collapse is one of the most critical challenges that continually arise in drilling and production operations. Problems associated with borehole collapse, such as tight hole while tripping, stuck pipe and logging tools, hole enlargement, poor log quality, and poor primary cement jobs, cause the majority of the nonproductive time (NPT) in the Zubair reservoir developments. Several studies have published models predicting the onset of borehole collapse and the amount of enlargement of the wellbore cross-section. However, assump
Soil compaction is one of the most harmful factors affecting soil structure, limiting plant growth and agricultural productivity. It is crucial to assess the degree of soil penetration resistance in order to find solutions to the harmful consequences of compaction. Obtaining an appropriate value by soil cone penetration requires time-consuming and labor-intensive measurements. Currently, satellite technologies, electronic measurement control systems, and computer software help to measure soil penetration resistance quickly and easily within the precision agriculture approach. The quantitative relationships between soil properties and the factors affecting their diversity contribute to digital soil mapping. Digital soil maps use
One of the main difficulties facing archiving systems for certified documents is checking the stamps, which may contain a complex background and be surrounded by unwanted data. Therefore, the main objective of this paper is to isolate the background and remove noise that may surround the stamp. Our proposed method comprises four phases. First, we apply the k-means algorithm to cluster the stamp image into a number of clusters and merge them using the ISODATA algorithm. Second, we compute the mean and standard deviation for each remaining cluster to isolate the background cluster from the stamp cluster. Third, a region growing algorithm is applied to segment the image and then choosing the connected regi
Uncompressed digital images require a very large storage capacity and, as a consequence, a large communication bandwidth for transmission over a network. Image compression techniques not only minimize image storage space but also preserve image quality. This paper presents an image compression technique that uses a distinct image coding scheme based on the wavelet transform, combined with effective compression algorithms for further compression. EZW and SPIHT are significant algorithms available for lossy image compression. EZW coding is a simple and efficient algorithm. SPIHT is a powerful technique that is used for image
OpenStreetMap (OSM), recognised for its current and readily accessible spatial database, frequently serves regions lacking precise data at the necessary granularity. Global collaboration among OSM contributors presents challenges to data quality and uniformity, exacerbated by the sheer volume of input and indistinct data annotation protocols. This study presents a methodological improvement in the spatial accuracy of OSM datasets centred over Baghdad, Iraq, utilising data derived from OSM services and satellite imagery. An analytical focus was placed on two geometric correction methods: a two-dimensional polynomial affine transformation and a two-dimensional polynomial conformal transformation. The former involves twelve coefficients for ad
The emphasis of Master Production Scheduling (MPS), or tactical planning, is on the temporal and spatial disaggregation of the cumulative planning targets and forecasts, along with the provision and forecast of the required resources. This procedure becomes considerably more difficult and slow as the number of resources, products and periods considered increases. A number of studies have been carried out to understand these impediments and formulate algorithms to optimise the production planning problem, or more specifically the MPS problem. These algorithms include an Evolutionary Algorithm called Genetic Algorithm, a Swarm Intelligence methodology called Gravitational Search Algorithm (GSA), Bat Algorithm (BAT), T
Image segmentation is a basic image processing technique that is primarily used for finding the segments that form the entire image. These segments can then be utilized in discriminative feature extraction, image retrieval, and pattern recognition. Clustering and region growing techniques are the commonly used image segmentation methods. K-Means is a heavily used clustering technique due to its simplicity and low computational cost. However, K-Means results depend on the initial centres’ values, which are selected randomly, and this leads to inconsistency in the image segmentation results. In addition, the quality of the isolated regions depends on the homogeneity of the resulting segments. In this paper, an improved K-Means