Data mining is a data analysis process that uses software to find patterns or rules in large amounts of data, with the aim of providing knowledge to support decisions. However, missing values often lead to a loss of information. The purpose of this study is to improve the performance of classification on data with missing values. Testing was carried out on the Car Evaluation dataset from the UCI Machine Learning Repository, using RStudio and RapidMiner. The study analyzes the tested parameters to measure algorithm performance under four test variations: C5.0, C4.5, and k-NN at a 0% missing rate; C5.0, C4.5, and k-NN at 5–50% missing rates; C5.0 + k-NNI, C4.5 + k-NNI, and k-NN + k-NNI at 5–50% missing rates; and C5.0 + CMI, C4.5 + CMI, and k-NN + CMI at 5–50% missing rates. The results show that C5.0 with k-NNI produces better classification accuracy than the other tested combinations of imputation and classification algorithms. For example, with 35% of the dataset missing, this method obtains 93.40% validation accuracy and 92% test accuracy. C5.0 with k-NNI also offers fast processing times compared with the other methods.
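To make the imputation step concrete, the following is a minimal sketch of nearest-neighbour imputation for categorical data, reading k-NNI as k-nearest-neighbour imputation. It is not the study's RStudio or RapidMiner workflow: the function names, the `None` marker for a missing value, the Hamming distance, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing categorical value


def hamming(a, b, skip):
    """Count mismatching attributes between rows a and b, ignoring column `skip`.

    A comparison that involves a missing value is counted as a mismatch.
    """
    return sum(
        1
        for i, (x, y) in enumerate(zip(a, b))
        if i != skip and (x is MISSING or y is MISSING or x != y)
    )


def knn_impute(rows, k=3):
    """Return a copy of `rows` in which each missing cell is filled with the
    most common value among the k nearest rows that have that cell observed."""
    filled = [list(r) for r in rows]
    for r_idx, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is not MISSING:
                continue
            # Candidate donors: every other row with this column observed.
            donors = [
                d for i, d in enumerate(rows)
                if i != r_idx and d[col] is not MISSING
            ]
            if not donors:
                continue  # nothing to impute from; leave the gap as it is
            donors.sort(key=lambda d: hamming(row, d, col))
            votes = Counter(d[col] for d in donors[:k])
            filled[r_idx][col] = votes.most_common(1)[0][0]
    return filled


# Toy rows (hypothetical values, not taken from the Car Evaluation dataset).
toy_rows = [
    ("low", "low", "big", "high"),
    ("low", "low", "big", "high"),
    ("low", "low", "big", MISSING),
    ("vhigh", "vhigh", "small", "low"),
    ("vhigh", "vhigh", "small", "low"),
]
imputed = knn_impute(toy_rows, k=3)
```

The imputed rows could then be passed to any classifier; the choice of distance and of k is a design decision that the abstract does not specify.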
Background: Many structural or functional abnormalities can impair the production of thyroid hormones and cause hypothyroidism. Objectives: To identify the main etiological causes of hypothyroidism among patients visiting the Specialized Center for Diabetes and Endocrinology. Methods: This study was conducted in the Specialized Center for Diabetes and Endocrinology on 217 patients with proven hypothyroidism, from 2006 to 2008. Every patient underwent thyroid function tests, ultrasound examination, thyroid autoantibody testing, fine-needle aspiration, skull radiology, and isotope scanning, as well as assessment of adrenal and gonadal function. Results: Of these 217 patients, 120 had thyroiditis; 33 had undergone thyroidectomy. 39 pat
A series of batch demulsification runs was carried out to evaluate the final emulsified water content of emulsion samples after exposure to microwave irradiation. An experimental study was conducted to evaluate the effects of a set of operating variables on demulsification performance. Several microwave irradiation demulsification runs were carried out at different irradiation powers (700, 800, and 900 W), using water-in-oil emulsion samples with different water contents (20-80%, 30-70%, and 50-50%) and salt contents (10000, 20000, and 30000 ppm). The best separation efficiency was obtained at 900 W, 50% water content, and 160 s of irradiation time. Experimental results showed that microwave radiation method can
In the present work, different remote sensing techniques have been used to analyze remote sensing data spectrally using ENVI software. Most of the algorithms used in spectral processing can be grouped into target detection, change detection, and classification. In this paper, several target detection methods have been studied, such as the matched filter and constrained energy minimization.
Water body maps were obtained, and the results showed changes in the study area over the period 1995-2000. The results obtained by applying constrained energy minimization were also more accurate than those of the other method when compared with the real situation.
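As an illustration of the constrained energy minimization (CEM) detector named above, the sketch below uses the standard textbook form of the filter, w = R⁻¹d / (dᵀR⁻¹d), where R is the sample correlation matrix of the pixel spectra and d is the target signature. This is not ENVI's implementation; the function names, the two-band toy pixels, and the target signature are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def cem_filter(pixels, target):
    """Return CEM weights w that minimise average output energy subject to w.d = 1."""
    n_bands = len(target)
    n = len(pixels)
    # Sample correlation matrix R = (1/N) * sum of x x^T over all pixels.
    R = [
        [sum(p[i] * p[j] for p in pixels) / n for j in range(n_bands)]
        for i in range(n_bands)
    ]
    r_inv_d = solve(R, list(target))
    scale = sum(t * v for t, v in zip(target, r_inv_d))
    return [v / scale for v in r_inv_d]


def cem_scores(pixels, target):
    """Apply the CEM filter to every pixel; scores near 1 suggest the target."""
    w = cem_filter(pixels, target)
    return [sum(wi * xi for wi, xi in zip(w, p)) for p in pixels]


# Toy two-band scene (hypothetical reflectances): four background pixels, one target.
toy_pixels = [[0.50, 0.20], [0.60, 0.25], [0.55, 0.22], [0.45, 0.18], [0.10, 0.80]]
toy_target = [0.10, 0.80]
scores = cem_scores(toy_pixels, toy_target)
```

Thresholding the scores yields a detection map; the threshold itself is a separate choice that the abstract does not describe.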
Data compression offers an attractive approach to reducing communication costs by using available bandwidth effectively. It therefore makes sense to pursue research on algorithms that use the available network most effectively. It is also important to consider security, since data being transmitted is vulnerable to attacks. The basic aim of this work is to develop a module that combines compression and encryption on the same set of data so that the two operations are performed simultaneously. This is achieved by embedding encryption into compression algorithms, since both cryptographic ciphers and entropy coders bear a certain resemblance in the sense of secrecy. First in the secure compression module, the given text is p
The current research aims to reveal the level of satisfaction of the mentors with the evaluation of their performance according to gender (male, female) and to formulate a predictive equation for the level of performance (dependent variable) from the level of satisfaction with the evaluation (independent variable). The scale consists of 16 items, each with three response alternatives measuring the level of satisfaction (weak, medium, and high, scored 1, 2, and 3), giving a hypothetical average of 32. The sample consisted of 100 educational counselors (45 males and 55 females). The results of the research concluded that the level of satisfaction with performance is below the mean when compared with the hypothetical average of the scale of s
Data hiding is the process of encoding extra information in an image by making small modifications to its pixels. To be practical, the hidden data must be perceptually invisible yet robust to common signal processing operations. This paper introduces a scheme for hiding a signature image that could be as much as 25% of the host image data and hence could be used both in digital watermarking and in image/data hiding. The proposed algorithm uses an orthogonal discrete wavelet transform with two zero moments and improved time localization, called the discrete slantlet transform, for both the host and signature images. A scaling factor in the frequency domain controls the quality of the watermarked images. Experimental results of signature image
The paired-sample t-test for the difference between two means in paired data is not robust against violation of the normality assumption. In this paper, some alternative robust tests are suggested, using the bootstrap method as well as a combination of the bootstrap method with the W.M test. Monte Carlo simulation experiments were employed to study the performance of the test statistics of each of these three tests in terms of type I error rates and power. The three tests were applied to different sample sizes generated from three distributions: the bivariate normal distribution, the bivariate contaminated normal distribution, and the bivariate exponential distribution.
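For readers unfamiliar with the bootstrap approach, the sketch below shows one common generic form of a bootstrap test for paired data: the paired differences are centred to impose the null hypothesis, resampled with replacement, and the observed t statistic is compared with the resampled ones. It does not reproduce the paper's specific tests or its combination with the W.M test; the function names, the number of resamples, and the seed are assumptions made for illustration.

```python
import math
import random
import statistics


def t_stat(diffs):
    """One-sample t statistic of the paired differences against a mean of zero."""
    mean_d = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        # Degenerate resample: all values identical.
        return 0.0 if mean_d == 0 else math.copysign(math.inf, mean_d)
    return mean_d / (sd / math.sqrt(len(diffs)))


def bootstrap_paired_test(x, y, n_boot=2000, seed=0):
    """Return (observed t, two-sided bootstrap p-value) for paired samples x, y."""
    diffs = [a - b for a, b in zip(x, y)]
    observed = t_stat(diffs)
    # Centre the differences so that the resampling population satisfies H0.
    mean_d = statistics.mean(diffs)
    centred = [d - mean_d for d in diffs]
    rng = random.Random(seed)  # fixed seed for reproducibility (an assumption)
    extreme = 0
    for _ in range(n_boot):
        sample = rng.choices(centred, k=len(centred))
        if abs(t_stat(sample)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_boot + 1)


# Toy paired measurements (hypothetical values) with a clear shift of about 2.
before = [12.1, 11.9, 12.3, 11.8, 12.0, 12.2, 11.7, 12.4, 12.05, 11.95]
after = [10.0] * 10
t_obs, p_value = bootstrap_paired_test(before, after, n_boot=500)
```

Because the reference distribution is built from the data rather than from the t distribution, the test does not rely on normality of the differences, which is the motivation stated in the abstract.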
With the development of communication technologies for mobile devices and electronic communications, and the move toward e-government, e-commerce, and e-banking, it has become necessary to protect these activities from intrusion or misuse, so it is important to design powerful and efficient systems for this purpose. In this paper, several varieties of the negative selection algorithm have been used for misuse-type network intrusion detection: the real-valued negative selection algorithm, the negative selection algorithm with fixed-radius detectors, and the negative selection algorithm with variable-sized detectors, where the algorithm generates a set of detectors to distinguish the self samples. Practica
This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of both supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches have been conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods in feature subset selection in high-dimensional space in terms of the number of selected features and time spe
In this research, several estimators of the hazard function are introduced. They are obtained with a nonparametric method, namely the kernel function, for censored data with varying bandwidth and boundary kernels. Two types of bandwidth are used: local and global. Moreover, four types of boundary kernel are used, namely Rectangle, Epanechnikov, Biquadratic, and Triquadratic, and the proposed function was employed with all kernel functions. Two different simulation techniques are also used in two experiments to compare these estimators. In most cases, the results showed that the local bandwidth is the best for all the types of the kernel boundary func