Feature selection (FS) is a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that helps machine learning classifiers reduce error rates, computation time, and overfitting while improving classification accuracy. It has demonstrated its efficacy in a myriad of domains, including text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, this paper presents a systematic overview of available studies on the different metaheuristic algorithms used for FS to improve TC. It contributes to the body of existing knowledge by answering four research questions (RQs): 1) What are the different approaches of FS that apply metaheuristic algorithms to improve TC? 2) Does applying metaheuristic algorithms for TC lead to better accuracy than the typical FS methods? 3) How effective are the modified, hybridized metaheuristic algorithms for text FS problems? and 4) What are the gaps in the current studies and their future directions? These RQs led to a study of recent works on metaheuristic-based FS methods, their contributions, and limitations. A final list of thirty-seven (37) related articles was extracted and investigated against our RQs to generate new knowledge in the domain of study. Most of the reviewed papers addressed TC with metaheuristic algorithms based on the wrapper and hybrid FS approaches. Future research should focus on hybrid-based FS approaches, as they intuitively handle complex optimization problems and potentially provide new research opportunities in this rapidly developing field.
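To make the wrapper approach mentioned in this abstract concrete, the following is a minimal sketch, not a method taken from any of the reviewed studies. It uses a small genetic algorithm (one metaheuristic among many) to search over binary feature masks and scores each mask by the leave-one-out accuracy of a nearest-centroid classifier. The toy term-count matrix, the choice of classifier, the size penalty, and all function names are assumptions made purely for illustration.

```python
import random

def nearest_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for k in range(len(X)):
            if k == i:
                continue
            s = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                s[c] += X[k][j]
            counts[y[k]] = counts.get(y[k], 0) + 1
        # Assign the held-out sample to the closest centroid.
        best_label, best_dist = None, float("inf")
        for label, s in sums.items():
            dist = sum((X[i][j] - s[c] / counts[label]) ** 2 for c, j in enumerate(cols))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Wrapper fitness: classifier accuracy minus a small cost per selected feature."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=12, generations=20, mutation_rate=0.1, seed=0):
    """Search binary feature masks with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    n = len(X[0])
    # Seed the population with the all-features mask so the result is never worse than it.
    population = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(X, y, m), reverse=True)
        next_pop = ranked[:2]  # elitism: carry the two best masks over unchanged
        while len(next_pop) < pop_size:
            a, b = rng.sample(ranked[:pop_size // 2], 2)  # parents from the better half
            cut = rng.randrange(1, n)                     # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]  # bit-flip mutation
            next_pop.append(child)
        population = next_pop
    return max(population, key=lambda m: fitness(X, y, m))

# Hypothetical term counts: columns 0 and 1 separate the classes, the rest are noise.
X = [
    [3, 0, 1, 0, 2, 1], [4, 1, 0, 1, 1, 0], [3, 0, 1, 1, 0, 1], [5, 1, 0, 0, 1, 1],
    [0, 4, 1, 0, 1, 0], [1, 3, 0, 1, 2, 1], [0, 5, 1, 1, 0, 0], [1, 4, 0, 0, 1, 1],
]
y = ["sport"] * 4 + ["politics"] * 4

best_mask = ga_select(X, y)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
```

Because the classifier itself scores every candidate subset, this is a wrapper method; a filter method would instead rank features with a classifier-independent statistic, and a hybrid would combine the two.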
Coronavirus disease (COVID-19) is an acute disease affecting the respiratory system that first appeared in Wuhan, China. In February 2020 the disease began to spread swiftly across the entire planet, causing significant health, social, and economic problems. Time series analysis is an important statistical method used to study and analyze a particular phenomenon, identify its pattern and factors, and use them to predict future values. The main focus of the research is to shed light on SARIMA, NARNN, and hybrid models, on the expectation that the series comprises both linear and non-linear components, and that the ARIMA model can deal with the linear component while the NARNN model can deal with the non-linear component. The models
Several stress-strain models were used to predict the strengths of steel fiber reinforced concrete, which are distinctive of the material. However, insufficient research has been done on the influence of hybrid fiber combinations (comprising two or more distinct fibers) on the characteristics of concrete. For this reason, the researchers conducted an experimental program to determine the stress-strain relationship of 30 concrete samples reinforced with two distinct fibers (a hybrid of polyvinyl alcohol and steel fibers), with compressive strengths ranging from 40 to 120 MPa. A total of 80% of the experimental results were used to develop a new empirical stress-strain model, which was accomplished through the application of the parti
The consumption of dried bananas has increased because they contain essential nutrients. To preserve bananas for a longer period, a drying process is carried out, which turns them into a light snack that does not spoil quickly. Machine learning algorithms, in turn, can be used to predict the sweetness of dried bananas. The article aimed to study the effect of different drying times (6, 8, and 10 hours) in an air dryer on some physical and chemical characteristics of bananas, including CIE-L*a*b, water content, carbohydrates, and sweetness. It also aimed to predict the sweetness of dried bananas from the CIE-L*a*b ratios using the machine learning algorithms RF, SVM, LDA, KNN, and CART. The results showed that increasing the drying
Audio classification is the process of classifying different audio types according to their contents. It is applied to a large variety of real-world problems. All classification applications allow the target subjects to be viewed as a specific type of audio; hence, there is a variety of audio types, and every type has to be treated carefully according to its significant properties. Feature extraction is an important process for audio classification. This work introduces several sets of features according to the type; two types of audio (datasets) were studied. Two different feature sets are proposed: (i) a first-order gradient feature vector, and (ii) a local roughness feature vector. The experiments showed that the results are competitive to
This text is one of the confiscated cuneiform texts held by the Iraq Museum. It bears the museum number (235869), and its dimensions are (12.7 x 6 x 2.5 cm). It records incoming quantities of barley. The text is dated to the Ur III period (2112-2004 BC) and belongs to the third year of the reign of King Ibbi-Sin (2028-2004 BC). The main figure in this text is (Ba-a-ga, the livestock fattener) from the city of Iri-Sagrig, and comparing it with the published cuneiform texts belonging to his archive, which number (196) texts covering his activities
The area of character recognition has received considerable attention from researchers all over the world during the last three decades. This research explores the best sets of feature extraction techniques and studies the accuracy of well-known classifiers for Arabic numerals, using statistical approaches in two methods and comparing them. The first method, a linear discriminant function, yields an accuracy as high as 90% of original grouped cases correctly classified. In the second method, we propose an algorithm. The results show the efficiency of the proposed algorithm, which achieves recognition accuracies of 92.9% and 91.4%, outperforming the first method.
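For readers unfamiliar with the first method, the following is a minimal sketch of a two-class linear discriminant function and of the "original grouped cases correctly classified" figure, which is the accuracy obtained when the fitted function is applied back to its own training cases. It is not the authors' implementation: the two-feature, two-class setup, the sample values, and all function names are assumptions for illustration, whereas the paper deals with Arabic numerals and its own feature sets.

```python
def mean_vector(rows):
    """Per-feature mean of a list of equally sized feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_covariance(groups, means):
    """Pooled within-class covariance matrix for two features (2x2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows, m in zip(groups, means):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]

def fit_discriminant(class0, class1):
    """Return (w, bias) so that w.x + bias > 0 predicts class 1."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s = pooled_covariance([class0, class1], [m0, m1])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction: inverse pooled covariance times the mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Put the decision boundary through the midpoint of the two class means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    bias = -(w[0] * mid[0] + w[1] * mid[1])
    return w, bias

def predict(w, bias, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0

# Hypothetical two-feature samples for two classes.
class0 = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
class1 = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 4.8)]

w, bias = fit_discriminant(class0, class1)
cases = [(x, 0) for x in class0] + [(x, 1) for x in class1]
# Share of the original (training) cases that the function classifies correctly.
accuracy = sum(predict(w, bias, x) == label for x, label in cases) / len(cases)
print(f"original grouped cases correctly classified: {accuracy:.0%}")
```

Because this figure is measured on the same cases used to fit the function, it tends to be optimistic compared with accuracy on held-out data.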