Classification of imbalanced data is an important problem. Many classification algorithms have been developed, such as Back Propagation (BP) neural networks, decision trees, and Bayesian networks, and they have been used widely in many fields. These algorithms suffer from the problem of imbalanced data, in which some classes contain far more instances than others. Imbalanced data lead to poor performance and bias toward one class at the expense of the others. In this paper, we propose three techniques based on Over-Sampling (O.S.) for processing an imbalanced dataset, redistributing it, and converting it into a balanced dataset. These techniques are the Improved Synthetic Minority Over-Sampling Technique (Improved SMOTE), Borderline-SMOTE + Imbalanced Ratio (IR), and Adaptive Synthetic Sampling (ADASYN) + IR. Each technique generates synthetic samples for the minority class to achieve balance between the minority and majority classes and then calculates the IR between the two. Experimental results show that the Improved SMOTE algorithm outperforms the Borderline-SMOTE + IR and ADASYN + IR algorithms because it achieves a high degree of balance between the minority and majority classes.
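The over-sampling idea described in this abstract can be illustrated with a minimal sketch. It shows only the basic SMOTE interpolation step, not the Improved SMOTE, Borderline-SMOTE, or ADASYN variants the paper proposes. The function names, the toy data, and the definition of IR as the majority count divided by the minority count are assumptions made for illustration.

```python
import random
from collections import Counter

def imbalance_ratio(labels):
    """Assumed definition: size of the largest class / size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical toy data: 20 majority points, 4 minority points.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(0.5, 0.5), (1.0, 1.5), (1.5, 1.0), (2.0, 2.0)]

labels_before = [0] * len(majority) + [1] * len(minority)
ir_before = imbalance_ratio(labels_before)

new_points = smote_like_oversample(minority, len(majority) - len(minority))
labels_after = labels_before + [1] * len(new_points)
ir_after = imbalance_ratio(labels_after)

print(ir_before, ir_after)  # prints: 5.0 1.0
```

Because each synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies.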
In this paper, estimation of the reliability of a multi-component system in the stress-strength model R(s,k) is considered when the stress and strength are independent random variables that follow the Exponentiated Weibull Distribution (EWD) with known first shape parameter θ and unknown second shape parameter α, using different estimation methods. The proposed estimators were compared through Monte Carlo simulation based on the mean squared error (MSE) criterion.
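As a rough illustration of the quantity being estimated, the sketch below simulates R(s,k), the probability that at least s of k strength components exceed a common stress. It does not implement any of the paper's estimators or its MSE comparison. It assumes the EWD distribution function F(x) = (1 - exp(-x^θ))^α with a shared θ, and the function names and parameter values are hypothetical. Under that assumed form, the single-component case reduces to P(strength > stress) = α_strength / (α_strength + α_stress), which gives a convenient check.

```python
import math
import random

def sample_ewd(theta, alpha, rng):
    """Inverse-CDF draw from the assumed form F(x) = (1 - exp(-x**theta))**alpha."""
    u = rng.random()
    # Clamp just below 1 so the logarithm stays finite in edge cases.
    t = min(u ** (1.0 / alpha), 1.0 - 2.0 ** -53)
    return (-math.log(1.0 - t)) ** (1.0 / theta)

def reliability_mc(s, k, theta, alpha_strength, alpha_stress, trials=20000, seed=1):
    """Monte Carlo estimate of R(s,k): at least s of k strengths exceed the stress."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        stress = sample_ewd(theta, alpha_stress, rng)
        survivors = sum(
            sample_ewd(theta, alpha_strength, rng) > stress for _ in range(k)
        )
        hits += survivors >= s
    return hits / trials

# Hypothetical parameter values chosen only for the demonstration.
r11 = reliability_mc(1, 1, theta=2.0, alpha_strength=3.0, alpha_stress=1.0)
r23 = reliability_mc(2, 3, theta=2.0, alpha_strength=3.0, alpha_stress=1.0)
print(round(r11, 3), round(r23, 3))
```

With these values the single-component estimate should land near 3 / (3 + 1) = 0.75, up to simulation noise.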
Mixture problems suffer from high correlation and linear multicollinearity among the explanatory variables, which arise from the unit-sum constraint on the components and the interactions between them in the model. This increases the linkage between the explanatory variables, as illustrated by the variance inflation factor (VIF). L-pseudo components are used to reduce the dependence between the components of the mixture.
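A small sketch can show how the unit-sum constraint inflates the VIF. It assumes a three-component mixture in which the third component is treated as the remainder, so the model has an intercept and only x1 and x2 as predictors. In that two-predictor case the VIF equals 1 / (1 - r^2), where r is the Pearson correlation between the two. The data and function names are hypothetical and are not taken from the paper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

# Hypothetical mixture proportions; x3 is whatever remains so each row sums to 1.
x1 = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
x2 = [0.80, 0.76, 0.69, 0.66, 0.59, 0.56]
x3 = [1.0 - a - b for a, b in zip(x1, x2)]

r12 = pearson_r(x1, x2)
vif = vif_two_predictors(x1, x2)
print(round(r12, 3), round(vif, 1))
```

Because x3 varies little here, x1 and x2 must move in nearly opposite directions, which produces a strongly negative correlation and a VIF far above the common rule-of-thumb threshold of 10.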
To estimate the parameters of the mixture model, we used methods that accept some bias in order to reduce variance, such as Ridge Regression and the Least Absolute Shrinkage and Selection Operator (LASSO) method a
EDIRKTO, an implicit Runge-Kutta type method with diagonally embedded pairs, is a novel approach presented in this paper for solving fourth-order ordinary differential equations of the form . There are two EDIRKTO pairs, each with three stages: EDIRKTO4(3) and EDIRKTO5(4). The derivation of the method indicates that the higher-order pair is more accurate, while the lower-order pair provides superior error estimates. Using these pairs as a basis, we developed variable-step codes and applied them to a series of fourth-order ODE problems. The numerical results demonstrate how much more effective the approach is in reducing the number of function evaluations needed to solve fourth-order ODE problems.
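The general mechanism behind an embedded pair with variable step size can be sketched with a much simpler scheme. The code below is not EDIRKTO: it uses the explicit Heun(2)/Euler(1) pair on a first-order ODE, purely to show how the difference between the two orders serves as an error estimate that drives step acceptance and step-size changes. The function name, tolerance, and safety factors are assumptions.

```python
import math

def adaptive_heun_euler(f, t0, y0, t_end, tol=1e-6, h=0.1):
    """Integrate y' = f(t, y) with the embedded Heun(2)/Euler(1) pair.
    Returns the final value and the number of function evaluations."""
    t, y, evals = t0, y0, 0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)              # do not step past the end point
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        evals += 2
        y_low = y + h * k1                 # first-order (Euler) solution
        y_high = y + 0.5 * h * (k1 + k2)   # second-order (Heun) solution
        err = abs(y_high - y_low)          # local error estimate from the pair
        if err <= tol:
            t, y = t + h, y_high           # accept the step
        # Step-size update for a pair whose lower order is 1, with safety limits.
        if err > 0.0:
            h *= min(2.0, max(0.2, 0.9 * math.sqrt(tol / err)))
        else:
            h *= 2.0
    return y, evals

# Test problem with a known solution: y' = -y, y(0) = 1, so y(1) = exp(-1).
y_end, n_evals = adaptive_heun_euler(lambda t, y: -y, 0.0, 1.0, 1.0)
print(abs(y_end - math.exp(-1.0)), n_evals)
```

Counting function evaluations, as done here, is the same cost measure the abstract uses to compare methods.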
Agent-based modeling is currently used extensively to analyze complex systems. This growth has been supported by its ability to convey distinct levels of interaction in a complex, detailed environment. Meanwhile, agent-based models tend to become progressively more complex, so powerful modeling and simulation techniques are needed to address this rise in complexity. In recent years, a number of platforms for developing agent-based models have been developed. In most agent-based models, however, only a discrete representation of the environment and a single level of interaction are provided; two or three levels are rarely considered. The key issue is that modellers' work in these areas is not assisted by simulation plat
This paper presents a method to classify colored textural images of skin tissues. Since medical images are highly heterogeneous, developing a reliable skin-cancer detection process is difficult, and a monofractal dimension is not sufficient to classify images of this nature. Multifractal-based feature vectors are suggested here as an alternative and more effective tool. At the same time, multiple color channels are used to obtain more descriptive features. Two multifractal-based sets of features are suggested here. The first set measures the local roughness property, while the second set measures the local contrast property. A combination of all the extracted features from the three color models gives the highest classification accuracy, with 99.4
The basic solution for overcoming the difficulties caused by the huge size of digital images is to employ image compression techniques that reduce image size for efficient storage and fast transmission. In this paper, a new pixel-based scheme is proposed for grayscale image compression. It implicitly uses a hybrid of a spatial modelling technique based on minimum residual and the transform technique of the Discrete Wavelet Transform (DWT), and it also mixes lossless and lossy techniques to ensure high performance in terms of compression ratio and quality. The proposed technique has been applied to a set of standard test images, and the results obtained are significantly encouraging compared with Joint P
Stemming algorithms for the Arabic language are no less important than those for other languages in the Information Retrieval (IR) field. Many algorithms for finding the Arabic root are available, and they fall mainly under two approaches: the light (stem)-based approach and the root-based approach. The latter approach is somewhat better than the former. A new root-based stemmer is proposed, and its performance is compared with the Khoja stemmer, which is the most efficient root-based stemmer. The accuracy ratio of the proposed stemmer is 99.7, a difference of 1.9 from the Khoja stemmer.
Advances in digital technology and the World Wide Web have led to an increase in digital documents used for various purposes, such as publishing and digital libraries. This phenomenon raises awareness of the need for effective techniques that can help during the search and retrieval of text. One of the most needed tasks is clustering, which automatically categorizes documents into meaningful groups. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends heavily on the selection of the text representation method. Traditional methods of text representation model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This method ignores the relationship an
Fuzzy Based Clustering for Grayscale Image Steganalysis