Text classification based on optimization feature selection methods: a review and future directions

A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant challenge in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing, but accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. These reviews, however, do not adequately cover recently explored approaches to TC that use FS, such as optimization techniques. This study comprehensively analyzes different FS approaches based on optimization algorithms for TC. We begin by introducing the primary phases involved in implementing TC. Subsequently, we explore a wide range of FS approaches for categorizing text documents and attempt to organize the existing works into four fundamental approaches: filter, wrapper, hybrid, and embedded. Furthermore, we review four categories of optimization algorithms utilized in solving text FS problems: swarm intelligence-based, evolutionary-based, physics-based, and human behavior-related algorithms. We discuss the advantages and disadvantages of state-of-the-art studies that employ optimization algorithms for text FS methods. Additionally, we consider several aspects of each proposed method and thoroughly discuss the challenges associated with datasets, FS approaches, optimization algorithms, machine learning classifiers, and evaluation criteria employed to assess new and existing techniques. Finally, by identifying research gaps and proposing future directions, our review provides valuable guidance to researchers in developing and situating further studies within the current body of literature.

Publication Date
Wed Sep 23 2020
Journal Name
Artificial Intelligence Research
Hybrid approaches to feature subset selection for data classification in high-dimensional feature space

This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of both supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches have been conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods in feature subset selection in high-dimensional space in terms of the number of selected features and time spe

...
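The filter-combination-then-wrapper idea in this abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the variance filter, the class-mean-gap filter, the leave-one-out 1-NN wrapper criterion, the toy data, and every function name are assumptions chosen for the example.

```python
from statistics import mean, pvariance

def variance_scores(X):
    """Unsupervised filter (assumed): variance of each feature column."""
    return [pvariance(col) for col in zip(*X)]

def class_separation_scores(X, y):
    """Supervised filter (assumed): gap between the two class means per feature."""
    labels = sorted(set(y))
    scores = []
    for col in zip(*X):
        a = [v for v, lab in zip(col, y) if lab == labels[0]]
        b = [v for v, lab in zip(col, y) if lab == labels[1]]
        scores.append(abs(mean(a) - mean(b)))
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])

def loo_1nn_accuracy(X, y, features):
    """Wrapper criterion (assumed): leave-one-out accuracy of a 1-NN classifier."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def forward_wrapper(X, y, candidates):
    """Greedy forward selection restricted to the filter-approved candidates."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in sorted(set(candidates) - set(selected)):
            acc = loo_1nn_accuracy(X, y, selected + [f])
            if acc > best:  # keep the feature giving the largest gain this round
                best, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best

# Toy data: feature 0 is informative, 1 is noisy, 2 is constant, 3 is weakly informative.
X = [
    [1.0, 5.0, 0.1, 3.0],
    [1.2, 1.0, 0.1, 3.1],
    [0.9, 9.0, 0.1, 2.9],
    [3.0, 4.0, 0.1, 3.6],
    [3.1, 8.0, 0.1, 3.7],
    [2.9, 2.0, 0.1, 3.5],
]
y = [0, 0, 0, 1, 1, 1]

unsup = top_k(variance_scores(X), 2)
sup = top_k(class_separation_scores(X, y), 2)
union, intersection = unsup | sup, unsup & sup
selected, acc = forward_wrapper(X, y, union)
print(union, intersection, selected, acc)
```

Taking the union keeps any feature that either filter rates highly and leaves more work for the wrapper, while the intersection hands the wrapper a smaller and stricter candidate set.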
Publication Date
Mon Jul 01 2024
Journal Name
Journal Of Engineering
Efficient Intrusion Detection Through the Fusion of AI Algorithms and Feature Selection Methods

With the proliferation of both Internet access and data traffic, recent breaches have brought into sharp focus the need for Network Intrusion Detection Systems (NIDS) to protect networks from more complex cyberattacks. To differentiate between normal network processes and possible attacks, Intrusion Detection Systems (IDS) often employ pattern recognition and data mining techniques. Network and host system intrusions, assaults, and policy violations can be automatically detected and classified by an IDS. Using Python’s scikit-learn, this study shows that Machine Learning (ML) techniques like Decision Tree (DT), Naïve Bayes (NB), and K-Nearest Neighbor (KNN) can enhance the effectiveness of an Intrusi

...
Publication Date
Sun Jan 30 2022
Journal Name
Iraqi Journal Of Science
A Survey on Arabic Text Classification Using Deep and Machine Learning Algorithms

Text categorization refers to the process of grouping text or documents into classes or categories according to their content. The text categorization process consists of three phases: preprocessing, feature extraction, and classification. In comparison to the English language, only a few studies have been done to categorize and classify the Arabic language. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of research from the last five years based on the dataset, year, algorithms and the accu

...
Scopus (8)
Crossref (4)
Publication Date
Sun Feb 25 2024
Journal Name
Baghdad Science Journal
Exploring Important Factors in Predicting Heart Disease Based on Ensemble- Extra Feature Selection Approach

Heart disease is a significant and impactful health condition that ranks as the leading cause of death in many countries. In order to aid physicians in diagnosing cardiovascular diseases, clinical datasets are available for reference. However, with the rise of big data and medical datasets, it has become increasingly challenging for medical practitioners to accurately predict heart disease due to the abundance of unrelated and redundant features that increase computational complexity and reduce accuracy. As such, this study aims to identify the most discriminative features within high-dimensional datasets while minimizing complexity and improving accuracy through an Extra Tree-based feature selection technique. The study assesses the efficac

...
Scopus (1)
Publication Date
Sun Apr 30 2023
Journal Name
Iraqi Journal Of Science
A Genetic Based Optimization Model for Extractive Multi-Document Text Summarization

Extractive multi-document text summarization, which aims to remove redundant information from a document collection while preserving its salient sentences, has recently attracted considerable interest in automatic models. This paper proposes an extractive multi-document text summarization model based on a genetic algorithm (GA). First, the problem is modeled as a discrete optimization problem and a specific fitness function is designed to effectively cope with the proposed model. Then, a binary-encoded representation together with a heuristic mutation operator and a local repair operator are proposed to characterize the adopted GA. Experiments are conducted on ten topics from the Document Understanding Conference DUC2002 datas

...
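A binary-encoded GA for sentence selection, as outlined in this abstract, might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's model: the fitness function (number of distinct words covered), the plain bit-flip mutation standing in for the heuristic mutation, the random-drop repair, the word budget `MAX_WORDS`, and the toy sentences are all hypothetical.

```python
import random

# Toy document collection; one bit per sentence in the chromosome.
SENTENCES = [
    "the cat sat on the mat",
    "a cat sat on a mat",
    "dogs chase cats in the park",
    "the park is open every day",
    "the mat is red",
]
MAX_WORDS = 12  # hypothetical summary length budget

def length(bits):
    """Total word count of the selected sentences."""
    return sum(len(s.split()) for s, b in zip(SENTENCES, bits) if b)

def fitness(bits):
    """Assumed fitness: distinct words covered, which penalizes redundancy."""
    words = set()
    for s, b in zip(SENTENCES, bits):
        if b:
            words.update(s.split())
    return len(words)

def repair(bits, rng):
    """Assumed repair: drop random selected sentences until the budget holds."""
    bits = list(bits)
    while length(bits) > MAX_WORDS:
        on = [i for i, b in enumerate(bits) if b]
        bits[rng.choice(on)] = 0
    return bits

def mutate(bits, rng, rate=0.2):
    """Bit-flip mutation (a stand-in for the paper's heuristic mutation)."""
    return [1 - b if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """One-point crossover."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def run_ga(generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    n = len(SENTENCES)
    pop = [repair([rng.randint(0, 1) for _ in range(n)], rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, keeps the elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(repair(mutate(crossover(a, b, rng), rng), rng))
        pop = parents + children
    return max(pop, key=fitness)

best = run_ga()
summary = [s for s, b in zip(SENTENCES, best) if b]
print(best, summary)
```

Running the repair step after every crossover and mutation keeps each chromosome feasible, so the fitness function never has to score an over-length summary.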
Publication Date
Wed Apr 20 2022
Journal Name
Periodicals Of Engineering And Natural Sciences (pen)
Text image secret sharing with hiding based on color feature

Scopus (1)
Crossref (1)
Publication Date
Thu Aug 30 2018
Journal Name
Iraqi Journal Of Science
Image Feature Extraction and Selection

Features describe the contents of an image and may be corners, blobs, or edges. The Scale-Invariant Feature Transform (SIFT) is a patented extraction and description algorithm that is widely used in computer vision and is divided into four main stages. This paper introduces image feature extraction using SIFT and chooses the most descriptive features among them by blurring the image with a Gaussian function, applying the Otsu segmentation algorithm to the image, and then applying the SIFT feature extraction algorithm to the segmented portions. Alternatively, the SIFT feature extraction algorithm is preceded by gray image normalization and binary thresholding as another preprocessing step. SIFT is a strong algorithm and gives more accura

...
Publication Date
Mon Jan 30 2023
Journal Name
Iraqi Journal Of Science
Scene Text Recognition: A Review

The problem of recognizing text in images captured in the wild has gained significant attention from the computer vision community in recent years. In contrast to the recognition of printed documents, scene text recognition is a difficult problem. Many studies focus on the problem of recognizing text extracted from natural scene images. Significant attempts have been made to address this problem in the recent past. However, many of these attempts rely on the availability of strong context, which naturally limits the dictionary. This paper presents a review of recent papers related to scene text

...
Publication Date
Tue Dec 05 2023
Journal Name
Baghdad Science Journal
AlexNet-Based Feature Extraction for Cassava Classification: A Machine Learning Approach

Cassava, a significant crop in Africa, Asia, and South America, is a staple food for millions. However, classifying cassava species using conventional color, texture, and shape features is inefficient, as cassava leaves exhibit similarities across different types, including toxic and non-toxic varieties. This research aims to overcome the limitations of traditional classification methods by employing deep learning techniques with pre-trained AlexNet as the feature extractor to accurately classify four types of cassava: Gajah, Manggu, Kapok, and Beracun. The dataset was collected from local farms in Lamongan, Indonesia, with the help of agricultural research experts. It consists of 1,400 images, and each type of cassava has

...
Scopus (1)
Crossref (1)
Publication Date
Tue Jun 01 2021
Journal Name
Swarm And Evolutionary Computation
A review of heuristics and metaheuristics for community detection in complex networks: Current usage, emerging development and future directions

Scopus (46)
Crossref (35)