A principal problem for any internet user is the increasing volume of spam. Spam filtering has therefore become a research focus that attracts the attention of security researchers and practitioners. Spam filtering can be viewed as a two-class classification problem. To this end, this paper proposes a spam filtering approach, coined WFCM, that is based on the Possibilistic c-Means (PCM) algorithm with a weighted distance and can efficiently distinguish between spam and legitimate email messages. The objective of the formulated fuzzy problem is to construct two fuzzy clusters: a spam cluster and a legitimate email cluster. The feature weights are assigned by the information gain algorithm. Experimental results on a spam-based benchmark dataset reveal that a proper setting of the feature weights can improve the performance of the proposed spam filtering approach. Furthermore, the proposed approach performs better than PCM and the Naïve Bayes filtering technique.
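The weighted-distance idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard PCM typicality formula, a feature-weighted squared Euclidean distance, and hypothetical names (`weighted_sq_distance`, `typicality`, `classify`, `etas`). The cluster centers, the weights and the scale parameters are taken as given; in the paper the weights would come from information gain.

```python
def weighted_sq_distance(x, center, weights):
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center))


def typicality(x, center, weights, eta, m=2.0):
    """Standard PCM typicality: 1 at the center, decaying with distance.

    eta is the cluster's scale parameter and m > 1 is the fuzzifier.
    """
    d2 = weighted_sq_distance(x, center, weights)
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def classify(x, centers, weights, etas, m=2.0):
    """Return the label of the cluster with the highest typicality, plus all scores."""
    scores = {
        label: typicality(x, center, weights, etas[label], m)
        for label, center in centers.items()
    }
    return max(scores, key=scores.get), scores


# Toy example: the second feature has weight 0, so it is ignored entirely.
centers = {"spam": [1.0, 0.0], "legitimate": [0.0, 0.0]}
etas = {"spam": 1.0, "legitimate": 1.0}
label, scores = classify([0.9, 5.0], centers, [1.0, 0.0], etas)
```

A feature with a large weight dominates the distance, while a feature with zero weight cannot influence the assignment, which is one way a good weight setting could help the filter.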
Email has proliferated as a means of communication, collaboration and
information sharing. Unfortunately, one of the main abuses that prevents users from
fully benefiting from this service is email spam (or simply spam). Spam can easily
overwhelm a system because of its availability and duplication, and it carries
deceptive solicitations designed to obtain private information.
The research community has shown increasing interest in setting up, adapting, maintaining and
tuning spam filtering techniques that identify spam and
exclude it automatically, without intervention by the email user. The contribution of
this paper is twofold. First, it presents how a spam filtering methodology can be
constructed based on the concep
This research introduces a hybrid Spam Filtering System (SFS) that combines the Ant Colony System (ACS), information gain (IG) and Naïve Bayesian (NB) classification. The aim of the proposed hybrid system is to classify e-mails with high accuracy. It consists of three consecutive stages. In the first stage, the information gain (IG) of each attribute (i.e. the weight of each feature) is computed. Then, the Ant Colony System algorithm selects the best features, namely those most intrinsically correlated with the classification. Finally, the third stage classifies the e-mail using the Naïve Bayesian (NB) algorithm. The experiment is conducted on the spambase dataset. The result shows that the accuracy of NB
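The first stage described above, computing the information gain of each attribute, might look like the sketch below. It assumes discrete (for example binarized) feature values and uses hypothetical function names; the ACS selection and NB classification stages are not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over values v of P(v) * H(labels | feature = v)."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: 1 = spam, 0 = legitimate.
labels = [1, 1, 0, 0]
perfect_feature = [1, 1, 0, 0]   # separates the classes completely
useless_feature = [1, 0, 1, 0]   # carries no class information
weights = {
    "perfect": information_gain(perfect_feature, labels),
    "useless": information_gain(useless_feature, labels),
}
```

The resulting scores can serve as per-feature weights: a feature that splits the classes cleanly receives the full label entropy, and an uninformative one receives zero.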
E-mail is an efficient and reliable data exchange service. Spam messages are undesired e-mails that are sent indiscriminately in bulk, usually for commercial purposes. Obfuscated image spamming is one of the new tricks used to bypass text-based and Optical Character Recognition (OCR)-based spam filters. Image spam detection based on visual image features has the advantage of efficiency, reducing the computational cost and improving performance. In this paper, an image spam detection scheme is presented. Suitable image processing techniques were used to capture the image features that can differentiate spam images from non-spam ones. Weighted k-nearest neighbor, which is a simple, yet powerful, machine learning algorithm, was used as a clas
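A weighted k-nearest-neighbor classifier can be sketched as below. The abstract does not state the weighting scheme, so inverse-distance voting is an assumption here, as are the function name `weighted_knn` and the toy two-dimensional feature vectors standing in for image features.

```python
from collections import defaultdict
from math import dist


def weighted_knn(train, query, k=3, eps=1e-9):
    """Classify `query` by inverse-distance-weighted votes of its k nearest neighbors.

    `train` is a list of (feature_vector, label) pairs; `eps` avoids division
    by zero when the query coincides with a training point.
    """
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        votes[label] += 1.0 / (dist(features, query) + eps)
    return max(votes, key=votes.get)


# One very close "spam" example outweighs two distant "ham" examples,
# although a plain majority vote with k=3 would pick "ham".
train = [((0.1, 0.0), "spam"), ((2.0, 0.0), "ham"), ((2.1, 0.0), "ham")]
prediction = weighted_knn(train, (0.0, 0.0), k=3)
```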
In this research two algorithms are applied, the Fuzzy C-Means (FCM) algorithm and the hard K-means (HKM) algorithm, to determine which of them performs better. Both algorithms are applied to a set of data collected from the Ministry of Planning on the water turbidity of five areas in Baghdad, in order to identify which of these areas have the least turbid clear water and which months of the year are least turbid in a given area.
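The difference between the two algorithms shows up in how a single measurement is assigned to clusters. The sketch below assumes one-dimensional data (such as a turbidity reading), fixed cluster centers and the standard FCM membership formula; the function names are hypothetical and the center-update loop of both algorithms is omitted.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of scalar x in each cluster (values sum to 1)."""
    dists = [abs(x - c) for c in centers]
    zeros = dists.count(0.0)
    if zeros:
        # x sits exactly on one or more centers: share full membership among them.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def hard_assignment(x, centers):
    """Hard K-means assignment: index of the nearest center only."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))


centers = [0.0, 10.0]
soft = fcm_memberships(2.0, centers)   # graded membership in both clusters
hard = hard_assignment(2.0, centers)   # a single cluster index
```

FCM keeps a degree of membership in every cluster, whereas HKM commits each point to exactly one, which is the main trade-off a comparison of the two would examine.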
The current issues in spam email detection systems are directly related to spam email classification's low accuracy and feature selection's high dimensionality. However, in machine learning (ML), feature selection (FS) as a global optimization strategy reduces data redundancy and produces a collection of precise and acceptable outcomes. A black hole algorithm-based FS algorithm is suggested in this paper for reducing the dimensionality of features and improving the accuracy of spam email classification. Each star's features are represented in binary form, with the features being transformed to binary using a sigmoid function. The proposed Binary Black Hole Algorithm (BBH) searches the feature space for the best feature subsets,
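The sigmoid-based binarization mentioned above can be illustrated with a short sketch. It assumes the transfer rule commonly used in binary metaheuristics (a feature is selected when a uniform random number falls below the sigmoid of the star's continuous value); the abstract does not confirm this exact rule, and the names `binarize` and `rng` are hypothetical.

```python
import math
import random


def sigmoid(v):
    """Logistic function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def binarize(position, rng):
    """Turn a star's continuous position into a 0/1 feature mask.

    Bit j is 1 (feature j selected) with probability sigmoid(position[j]).
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in position]


rng = random.Random(0)  # seeded only to make the sketch reproducible
mask = binarize([50.0, -50.0, 0.0], rng)
# A large positive value is almost surely selected, a large negative value
# almost surely dropped, and a value of 0 is selected about half the time.
```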
The influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences. This condition places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that can be used to divide unlabeled data into clusters. The k-means and fuzzy c-means (FCM) algorithms are examples of algorithms that can be used for clustering. Clustering is thus a common approach that divides an input space into several homogeneous zones, and it can be achieved using a variety of algorithms. This study used three models to cluster a brain tumor dataset. The first model uses FCM, whic
Image enhancement is one of the most significant topics in the field of digital image processing. The basic problem in enhancement is how to remove noise or improve the details of a digital image. This research proposes a method for de-noising digital images and sharpening or highlighting their details. The proposed approach uses a fuzzy logic technique to process each pixel in the image and then decides whether the pixel is noisy or needs further processing for highlighting. The decision is made by examining the pixel's degree of association with its neighboring elements using a fuzzy algorithm. The proposed de-noising approach was evaluated on some standard images after corrupting them with impulse
The conventional FCM algorithm does not fully utilize the spatial information in the image. In this research, we use an FCM algorithm that incorporates spatial information into the membership function for clustering. The spatial function is the summation of the membership functions in the neighborhood of each pixel under consideration. The advantages of the method are that it is less sensitive to noise than other techniques and that it yields more homogeneous regions than other methods. This technique is a powerful method for noisy image segmentation.
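The spatial function described above, the sum of memberships over a pixel's neighborhood, can be sketched as follows. How it is folded back into the membership is an assumption here: the sketch multiplies membership to the power p by the spatial function to the power q and renormalizes across clusters, a form often used for spatial FCM. The function names, the square window and the exponents `p` and `q` are all illustrative choices.

```python
def spatial_function(u, row, col, radius=1):
    """Sum one cluster's memberships over the window centered on (row, col)."""
    rows, cols = len(u), len(u[0])
    total = 0.0
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            total += u[r][c]
    return total


def spatial_memberships(memberships, p=1, q=1):
    """Reweight memberships[i][r][c] (cluster i, pixel r,c) by neighborhood support."""
    rows, cols = len(memberships[0]), len(memberships[0][0])
    out = [[[0.0] * cols for _ in range(rows)] for _ in memberships]
    for r in range(rows):
        for c in range(cols):
            scores = [
                (u[r][c] ** p) * (spatial_function(u, r, c) ** q)
                for u in memberships
            ]
            norm = sum(scores)
            for i, s in enumerate(scores):
                out[i][r][c] = s / norm
    return out


# 3x3 patch: every pixel favors cluster 0 except a noisy center pixel.
u0 = [[0.9, 0.9, 0.9], [0.9, 0.2, 0.9], [0.9, 0.9, 0.9]]
u1 = [[1.0 - v for v in row] for row in u0]
smoothed = spatial_memberships([u0, u1])
```

In this toy patch the center pixel's membership in cluster 0 rises above 0.5 after reweighting, because its neighbors overwhelmingly support that cluster. This is the mechanism by which the method becomes less sensitive to noise.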