A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant task in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover the recently explored approaches to TC problem-solving utilizing FS, such as optimization techniques. This study comprehensively analyzes different FS approaches based on optimization algorithms for TC. We begin by introducing the primary phases involved in implementing TC. Subsequently, we explore a wide range of FS approaches for categorizing text documents and attempt to organize the existing works into four fundamental approaches: filter, wrapper, hybrid, and embedded. Furthermore, we review four optimization algorithms utilized in solving text FS problems: swarm intelligence-based, evolutionary-based, physics-based, and human behavior-related algorithms. We discuss the advantages and disadvantages of state-of-the-art studies that employ optimization algorithms for text FS methods. Additionally, we consider several aspects of each proposed method and thoroughly discuss the challenges associated with datasets, FS approaches, optimization algorithms, machine learning classifiers, and evaluation criteria employed to assess new and existing techniques. Finally, by identifying research gaps and proposing future directions, our review provides valuable guidance to researchers in developing and situating further studies within the current body of literature.
An automatic text summarization system mimics how humans summarize by picking the most significant sentences in a source text. However, the complexities of the Arabic language have become challenging to obtain information quickly and effectively. The main disadvantage of the traditional approaches is that they are strictly constrained (especially for the Arabic language) by the accuracy of sentence feature functions, weighting schemes, and similarity calculations. On the other hand, the meta-heuristic search approaches have a feature tha
... Show MoreSecure information transmission over the internet is becoming an important requirement in data communication. These days, authenticity, secrecy, and confidentiality are the most important concerns in securing data communication. For that reason, information hiding methods are used, such as Cryptography, Steganography and Watermarking methods, to secure data transmission, where cryptography method is used to encrypt the information in an unreadable form. At the same time, steganography covers the information within images, audio or video. Finally, watermarking is used to protect information from intruders. This paper proposed a new cryptography method by using thre
... Show MoreThe drill bit is the most essential tool in drilling operation and optimum bit selection is one of the main challenges in planning and designing new wells. Conventional bit selections are mostly based on the historical performance of similar bits from offset wells. In addition, it is done by different techniques based on offset well logs. However, these methods are time consuming and they are not dependent on actual drilling parameters. The main objective of this study is to optimize bit selection in order to achieve maximum rate of penetration (ROP). In this work, a model that predicts the ROP was developed using artificial neural networks (ANNs) based on 19 input parameters. For the
COVID 19 has spread rapidly around the world due to the lack of a suitable vaccine; therefore the early prediction of those infected with this virus is extremely important attempting to control it by quarantining the infected people and giving them possible medical attention to limit its spread. This work suggests a model for predicting the COVID 19 virus using feature selection techniques. The proposed model consists of three stages which include the preprocessing stage, the features selection stage, and the classification stage. This work uses a data set consists of 8571 records, with forty features for patients from different countries. Two feature selection techniques are used in