Preferred Language
Articles
/
XheqWJMBVTCNdQwC4tGZ
Text classification based on optimization feature selection methods: a review and future directions

A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant task in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover the recently explored approaches to TC problem-solving utilizing FS, such as optimization techniques. This study comprehensively analyzes different FS approaches based on optimization algorithms for TC. We begin by introducing the primary phases involved in implementing TC. Subsequently, we explore a wide range of FS approaches for categorizing text documents and attempt to organize the existing works into four fundamental approaches: filter, wrapper, hybrid, and embedded. Furthermore, we review four optimization algorithms utilized in solving text FS problems: swarm intelligence-based, evolutionary-based, physics-based, and human behavior-related algorithms. We discuss the advantages and disadvantages of state-of-the-art studies that employ optimization algorithms for text FS methods. Additionally, we consider several aspects of each proposed method and thoroughly discuss the challenges associated with datasets, FS approaches, optimization algorithms, machine learning classifiers, and evaluation criteria employed to assess new and existing techniques. Finally, by identifying research gaps and proposing future directions, our review provides valuable guidance to researchers in developing and situating further studies within the current body of literature.

Scopus Crossref
View Publication Preview PDF
Quick Preview PDF
Publication Date
Sat Jan 01 2022
Journal Name
Ieee Access
Wrapper and Hybrid Feature Selection Methods Using Metaheuristic Algorithms for English Text Classification: A Systematic Review

Feature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that aids machine learning classifiers in reducing error rates, computation time, overfitting, and improving classification accuracy. It has demonstrated its efficacy in myriads of domains, ranging from its use for text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematicall

... Show More
Scopus (27)
Crossref (21)
Scopus Clarivate Crossref
View Publication Preview PDF
Publication Date
Wed Dec 30 2020
Journal Name
Iraqi Journal Of Science
The Evaluation of Accuracy Performance in an Enhanced Embedded Feature Selection for Unstructured Text Classification

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant feature from the sparse feature space. Thus, this paper proposed an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high dimensional text classificationhis technique has the ability to measure the feature’s importance in a high-dimensional text document. In addition, it aims to increase the efficiency of the feature selection. Hence, obtaining a promising text classification accuracy. TF-IDF act as a filter approach which measures features importance of the te

... Show More
Scopus (8)
Crossref (7)
Scopus Crossref
View Publication Preview PDF
Publication Date
Tue Nov 26 2024
Journal Name
International Journal Of Data And Network Science
Multi-objective of wind-driven optimization as feature selection and clustering to enhance text clustering

Text Clustering consists of grouping objects of similar categories. The initial centroids influence operation of the system with the potential to become trapped in local optima. The second issue pertains to the impact of a huge number of features on the determination of optimal initial centroids. The problem of dimensionality may be reduced by feature selection. Therefore, Wind Driven Optimization (WDO) was employed as Feature Selection to reduce the unimportant words from the text. In addition, the current study has integrated a novel clustering optimization technique called the WDO (Wasp Swarm Optimization) to effectively determine the most suitable initial centroids. The result showed the new meta-heuristic which is WDO was employed as t

... Show More
Crossref (1)
Scopus Crossref
View Publication Preview PDF
Publication Date
Mon Oct 30 2023
Journal Name
Iraqi Journal Of Science
Review on Hybrid Swarm Algorithms for Feature Selection

    Feature selection represents one of the critical processes in machine learning (ML). The fundamental aim of the problem of feature selection is to maintain performance accuracy while reducing the dimension of feature selection. Different approaches were created for classifying the datasets. In a range of optimization problems, swarming techniques produced better outcomes. At the same time, hybrid algorithms have gotten a lot of attention recently when it comes to solving optimization problems. As a result, this study provides a thorough assessment of the literature on feature selection problems using hybrid swarm algorithms that have been developed over time (2018-2021). Lastly, when compared with current feature selection procedu

... Show More
Scopus Crossref
View Publication Preview PDF
Publication Date
Sun Jan 20 2019
Journal Name
Ibn Al-haitham Journal For Pure And Applied Sciences
Text Classification Based on Weighted Extreme Learning Machine

The huge amount of documents in the internet led to the rapid need of text classification (TC). TC is used to organize these text documents. In this research paper, a new model is based on Extreme Machine learning (EML) is used. The proposed model consists of many phases including: preprocessing, feature extraction, Multiple Linear Regression (MLR) and ELM. The basic idea of the proposed model is built upon the calculation of feature weights by using MLR. These feature weights with the extracted features introduced as an input to the ELM that produced weighted Extreme Learning Machine (WELM). The results showed   a great competence of the proposed WELM compared to the ELM. 

Crossref (3)
Crossref
View Publication Preview PDF
Publication Date
Thu Nov 30 2023
Journal Name
Iraqi Journal Of Science
Effect of Genetic Algorithm as a Feature Selection for Image Classification

     Analysis of image content is important in the classification of images, identification, retrieval, and recognition processes. The medical image datasets for content-based medical image retrieval (  are large datasets that are limited by high computational costs and poor performance. The aim of the proposed method is to enhance this image retrieval and classification by using a genetic algorithm (GA) to choose the reduced features and dimensionality. This process was created in three stages. In the first stage, two algorithms are applied to extract the important features; the first algorithm is the Contrast Enhancement method and the second is a Discrete Cosine Transform algorithm. In the next stage, we used datasets of the medi

... Show More
Scopus (3)
Crossref (1)
Scopus Crossref
View Publication Preview PDF
Publication Date
Sun Feb 27 2022
Journal Name
Iraqi Journal Of Science
A New Method in Feature Selection based on Deep Reinforcement Learning in Domain Adaptation

    In data mining and machine learning methods, it is traditionally assumed that training data, test data, and the data that will be processed in the future, should have the same feature space distribution. This is a condition that will not happen in the real world. In order to overcome this challenge, domain adaptation-based methods are used. One of the existing challenges in domain adaptation-based methods is to select the most efficient features so that they can also show the most efficiency in the destination database. In this paper, a new feature selection method based on deep reinforcement learning is proposed. In the proposed method, in order to select the best and most appropriate features, the essential policies

... Show More
Scopus (1)
Scopus Crossref
View Publication Preview PDF
Publication Date
Tue Aug 31 2021
Journal Name
Iraqi Journal Of Science
A Survey on Feature Selection Techniques using Evolutionary Algorithms

     Feature selection, a method of dimensionality reduction, is nothing but collecting a range of appropriate feature subsets from the total number of features. In this paper, a point by point explanation review about the feature selection in this segment preferred affairs and its appraisal techniques are discussed. I will initiate my conversation with a straightforward approach so that we consider taking care of features and preferred issues depending upon meta-heuristic strategy. These techniques help in obtaining the best highlight subsets. Thereafter, this paper discusses some system models that drive naturally from the environment are discussed and calculations are performed so that we can take care of the prefe

... Show More
Scopus (5)
Crossref (3)
Scopus Crossref
View Publication Preview PDF
Publication Date
Tue Jan 31 2023
Journal Name
International Journal Of Nonlinear Analysis And Applications
Survey on intrusion detection system based on analysis concept drift: Status and future directions

Nowadays, internet security is a critical concern; the One of the most difficult study issues in network security is "intrusion detection". Fight against external threats. Intrusion detection is a novel method of securing computers and data networks that are already in use. To boost the efficacy of intrusion detection systems, machine learning and deep learning are widely deployed. While work on intrusion detection systems is already underway, based on data mining and machine learning is effective, it requires to detect intrusions by training static batch classifiers regardless considering the time-varying features of a regular data stream. Real-world problems, on the other hand, rarely fit into models that have such constraints. Furthermor

... Show More
View Publication
Publication Date
Sun Jan 16 2022
Journal Name
Iraqi Journal Of Science
A Multi-Objective Evolutionary Algorithm based Feature Selection for Intrusion Detection

Nowad ays, with the development of internet communication that provides many facilities to the user leads in turn to growing unauthorized access. As a result, intrusion detection system (IDS) becomes necessary to provide a high level of security for huge amount of information transferred in the network to protect them from threats. One of the main challenges for IDS is the high dimensionality of the feature space and how the relevant features to distinguish the normal network traffic from attack network are selected. In this paper, multi-objective evolutionary algorithm with decomposition (MOEA/D) and MOEA/D with the injection of a proposed local search operator are adopted to solve the Multi-objective optimization (MOO) followed by Naï

... Show More
View Publication Preview PDF