Feature selection (FS) is a set of processes for deciding which relevant features (attributes) to include in a predictive model and which irrelevant ones to exclude. It is a crucial task that helps machine learning classifiers reduce error rates, computation time, and overfitting while improving classification accuracy. It has proven effective in many domains, including text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematicall...
This paper proposes two hybrid feature subset selection approaches that combine (by union or intersection) supervised and unsupervised filter approaches before applying a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability at low time cost. Experiments with the proposed hybrid approaches were conducted on seven high-dimensional datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods for feature subset selection in high-dimensional space in terms of the number of selected features and time spe...
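The filter-combination step described in the abstract above can be illustrated with a minimal sketch. The abstract does not name the filters used, so the two scores here are assumptions chosen for illustration: absolute Pearson correlation with the label as the supervised filter and per-feature variance as the unsupervised filter. The function names and the parameter k are likewise hypothetical, and the wrapper stage that would search the returned candidate set is not shown.

```python
from statistics import mean, pvariance


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


def hybrid_filter(X, y, k, mode="union"):
    """Combine a supervised and an unsupervised filter (both assumed here).

    X is a list of rows, y a list of numeric labels. The result is the
    candidate feature set that a wrapper would search afterwards.
    """
    columns = list(zip(*X))
    # Assumed supervised filter: correlation of each feature with the label.
    supervised = top_k([pearson_abs(col, y) for col in columns], k)
    # Assumed unsupervised filter: variance of each feature.
    unsupervised = top_k([pvariance(col) for col in columns], k)
    if mode == "union":
        return supervised | unsupervised
    if mode == "intersection":
        return supervised & unsupervised
    raise ValueError("mode must be 'union' or 'intersection'")
```

The union keeps any feature that either filter ranks highly, which gives the wrapper a larger search space; the intersection keeps only features both filters agree on, which gives a smaller one.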
Text-based image clustering (TBIC) is an insufficient approach for clustering related web images. It is a challenging task to extract the visual features of images with the support of textual information in a database. In content-based image clustering (CBIC), image data are clustered on the basis of specific features such as texture, color, boundaries, and shape. In this paper, an effective CBIC technique is presented, which uses texture and statistical features of the images. The statistical features, or color moments (mean, skewness, standard deviation, kurtosis, and variance), are extracted from the images. These features are collected in a one-dimensional array, and then a genetic algorithm (GA) is applied for image clustering.
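The color-moment extraction named in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: pixels are given as (R, G, B) tuples, moments are computed per channel with population formulas, and the function names are hypothetical. The texture features and the GA clustering stage are not shown.

```python
def color_moments(values):
    """Return [mean, std, variance, skewness, kurtosis] for one channel.

    Population (divide-by-n) formulas are assumed; skewness and kurtosis
    are set to 0.0 for a constant channel to avoid division by zero.
    """
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    std = var ** 0.5
    if std == 0:
        skew = kurt = 0.0
    else:
        skew = sum((v - mu) ** 3 for v in values) / (n * std ** 3)
        kurt = sum((v - mu) ** 4 for v in values) / (n * std ** 4)
    return [mu, std, var, skew, kurt]


def image_feature_vector(pixels):
    """Concatenate the five moments of each channel into one flat list."""
    features = []
    for channel in zip(*pixels):  # iterate over R, G, B value sequences
        features.extend(color_moments(channel))
    return features
```

For an RGB image this yields a 15-element one-dimensional array per image, which is the kind of fixed-length vector a clustering algorithm can consume.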
High Power Amplifiers (HPAs), which are used in wireless communication, are distinctly characterized by nonlinear properties. An HPA can be made to behave linearly by backing it off into its linear region, at the cost of power efficiency. Meanwhile, the Orthogonal Frequency Division Multiplexing (OFDM) signal is very rough. A large back-off into the linear operating region is therefore required, which leads to a significant loss in power efficiency. Back-off is thus not a satisfactory solution. Digital predistorters based on the Simplicial Canonical Piecewise-Linear (SCPWL) model are widely employed to compensate for the nonlinear distortion introduced by the HPA component in OFDM systems. In this paper, the genetic al...
Variable selection is an essential task in the field of statistical modeling. Several studies have tried to develop and standardize the process of variable selection, but it is difficult to do so. The first question a researcher needs to ask is: what are the most significant variables that should be used to describe a given dataset’s response? In this paper, a new method for variable selection using Gibbs sampler techniques has been developed. First, the model is defined, and the posterior distributions for all the parameters are derived. The new variable selection method is tested using four simulation datasets. The new approach is compared with some existing techniques: Ordinary Least Squares (OLS), Least Absolute Shrinkage...
This paper presents a hybrid genetic algorithm (hGA) for optimizing the maximum likelihood function ln(L(phi(1), theta(1))) of the mixed model ARMA(1,1). The presented hGA couples two processes: the canonical genetic algorithm (cGA), composed of three main steps (selection, local recombination, and mutation), and a local search algorithm, represented by the steepest descent algorithm (sDA), which is defined by three basic parameters: frequency, probability, and number of local search iterations. The experimental design is based on simulating the cGA, hGA, and sDA algorithms with different values of the model parameters and sample size (n). The study contains a comparison of these algorithms based on their MSE values. One can conc...
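For orientation, the objective in the abstract above can be written out in one common form. The abstract does not state which likelihood is used, so the following is an assumption: the conditional Gaussian log-likelihood of ARMA(1,1), with the plus-sign convention for the moving-average term, conditioning on the first observation and setting the initial innovation to zero. The noise variance \sigma^2 is a symbol not given in the abstract.

```latex
% ARMA(1,1) model (assumed sign convention), with Gaussian innovations
x_t = \phi_1 x_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1},
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2)

% Innovations computed recursively, assuming \varepsilon_1 = 0
\varepsilon_t = x_t - \phi_1 x_{t-1} - \theta_1 \varepsilon_{t-1},
\qquad t = 2, \dots, n

% Conditional log-likelihood to be maximized over (\phi_1, \theta_1)
\ln L(\phi_1, \theta_1)
  = -\frac{n-1}{2} \ln\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2} \sum_{t=2}^{n} \varepsilon_t^{2}
```

Under this form, maximizing the log-likelihood for fixed \sigma^2 is equivalent to minimizing the sum of squared innovations, which is the kind of objective a genetic search and a steepest descent step can both operate on.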