With the proliferation of both Internet access and data traffic, recent breaches have brought into sharp focus the need for Network Intrusion Detection Systems (NIDS) to protect networks from more complex cyberattacks. To differentiate between normal network processes and possible attacks, Intrusion Detection Systems (IDS) often employ pattern recognition and data mining techniques. Network and host system intrusions, assaults, and policy violations can be automatically detected and classified by an Intrusion Detection System (IDS). Using Python Scikit-Learn the results of this study show that Machine Learning (ML) techniques like Decision Tree (DT), Naïve Bayes (NB), and K-Nearest Neighbor (KNN) can enhance the effectiveness of an Intrusion Detection System (IDS). Success is measured by a variety of metrics, including accuracy, precision, recall, F1-Score, and execution time. Applying feature selection approaches such as Analysis of Variance (ANOVA), Mutual Information (MI), and Chi-Square (Ch-2) reduced execution time, increased detection efficiency and accuracy, and boosted overall performance. All classifiers achieve the greatest performance with 99.99% accuracy and the shortest computation time of 0.0089 seconds while using ANOVA with 10% of features.
Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and realizing the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reduces the dimensionality and complexity to analyze the network traffic. Moreover, using the most relevant features to build the predictive model, reduces the complexity of the developed model, thus reducing the building classifier model time and consequently improves the detection performance. In this study, two different sets of select
... Show MoreHeart disease is a significant and impactful health condition that ranks as the leading cause of death in many countries. In order to aid physicians in diagnosing cardiovascular diseases, clinical datasets are available for reference. However, with the rise of big data and medical datasets, it has become increasingly challenging for medical practitioners to accurately predict heart disease due to the abundance of unrelated and redundant features that hinder computational complexity and accuracy. As such, this study aims to identify the most discriminative features within high-dimensional datasets while minimizing complexity and improving accuracy through an Extra Tree feature selection based technique. The work study assesses the efficac
... Show MoreMost intrusion detection systems are signature based that work similar to anti-virus but they are unable to detect the zero-day attacks. The importance of the anomaly based IDS has raised because of its ability to deal with the unknown attacks. However smart attacks are appeared to compromise the detection ability of the anomaly based IDS. By considering these weak points the proposed
system is developed to overcome them. The proposed system is a development to the well-known payload anomaly detector (PAYL). By
combining two stages with the PAYL detector, it gives good detection ability and acceptable ratio of false positive. The proposed system improve the models recognition ability in the PAYL detector, for a filtered unencrypt
This paper proposes two hybrid feature subset selection approaches based on the combination (union or intersection) of both supervised and unsupervised filter approaches before using a wrapper, aiming to obtain low-dimensional features with high accuracy and interpretability and low time consumption. Experiments with the proposed hybrid approaches have been conducted on seven high-dimensional feature datasets. The classifiers adopted are support vector machine (SVM), linear discriminant analysis (LDA), and K-nearest neighbour (KNN). Experimental results have demonstrated the advantages and usefulness of the proposed methods in feature subset selection in high-dimensional space in terms of the number of selected features and time spe
... Show MoreText Clustering consists of grouping objects of similar categories. The initial centroids influence operation of the system with the potential to become trapped in local optima. The second issue pertains to the impact of a huge number of features on the determination of optimal initial centroids. The problem of dimensionality may be reduced by feature selection. Therefore, Wind Driven Optimization (WDO) was employed as Feature Selection to reduce the unimportant words from the text. In addition, the current study has integrated a novel clustering optimization technique called the WDO (Wasp Swarm Optimization) to effectively determine the most suitable initial centroids. The result showed the new meta-heuristic which is WDO was employed as t
... Show MoreRecent research has shown that a Deoxyribonucleic Acid (DNA) has ability to be used to discover diseases in human body as its function can be used for an intrusion-detection system (IDS) to detect attacks against computer system and networks traffics. Three main factor influenced the accuracy of IDS based on DNA sequence, which is DNA encoding method, STR keys and classification method to classify the correctness of proposed method. The pioneer idea on attempt a DNA sequence for intrusion detection system is using a normal signature sequence with alignment threshold value, later used DNA encoding based cryptography, however the detection rate result is very low. Since the network traffic consists of 41 attributes, therefore we proposed the
... Show MoreSeveral Intrusion Detection Systems (IDS) have been proposed in the current decade. Most datasets which associate with intrusion detection dataset suffer from an imbalance class problem. This problem limits the performance of classifier for minority classes. This paper has presented a novel class imbalance processing technology for large scale multiclass dataset, referred to as BMCD. Our algorithm is based on adapting the Synthetic Minority Over-Sampling Technique (SMOTE) with multiclass dataset to improve the detection rate of minority classes while ensuring efficiency. In this work we have been combined five individual CICIDS2017 dataset to create one multiclass dataset which contains several types of attacks. To prove the eff
... Show More