The growing volume of spam is a principal problem for every internet user, and spam filtering has therefore become a research focus that attracts the attention of many security researchers and practitioners. Spam filtering can be viewed as a two-class classification problem. To this end, this paper proposes a spam filtering approach based on the Possibilistic c-Means (PCM) algorithm and a weighted distance, coined WFCM, that can efficiently distinguish between spam and legitimate email messages. The objective of the formulated fuzzy problem is to construct two fuzzy clusters: a spam cluster and a legitimate-email cluster. The feature weights are assigned by the information gain algorithm. Experimental results on a spam benchmark dataset reveal that proper setting of the feature weights can improve the performance of the proposed spam filtering approach. Furthermore, the proposed approach outperforms both the PCM and Naïve Bayes filtering techniques.
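The two weighting ingredients named in this abstract, information gain for setting feature weights and a feature-weighted distance for the clustering step, can be sketched as follows. This is a minimal illustration on an invented toy dataset, not the paper's implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y|X) for a discrete feature column."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

def weighted_distance(a, b, w):
    """Feature-weighted Euclidean distance, used in place of the plain metric."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b)))

# toy data: feature 0 perfectly separates the classes, feature 1 is noise
X = [(1, 0), (1, 1), (0, 0), (0, 1)]
y = ["spam", "spam", "ham", "ham"]
weights = [information_gain([row[j] for row in X], y) for j in range(2)]
```

With these weights, the noisy feature contributes nothing to the distance, which is the effect the feature-weighted clustering relies on.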
The huge number of documents on the internet has created a pressing need for text classification (TC), which is used to organize these documents. This paper proposes a new model based on the Extreme Learning Machine (ELM). The proposed model consists of several phases: preprocessing, feature extraction, Multiple Linear Regression (MLR), and ELM. Its basic idea is to calculate feature weights using MLR; these weights, together with the extracted features, are fed into the ELM, producing a weighted Extreme Learning Machine (WELM). The results show that the proposed WELM is highly competitive compared with the standard ELM.
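The MLR-then-ELM pipeline can be sketched as below. The data is synthetic, and the exact weighting rule is an assumption of this sketch (absolute regression coefficients as feature weights); the abstract does not spell it out:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy document-term matrix (rows: documents, cols: term features) and labels
X = rng.random((40, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.9).astype(float)

# Phase 1: multiple linear regression gives one coefficient per feature;
# the absolute coefficients serve as feature weights (an assumption here).
coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
w = np.abs(coef[:-1])

# Phase 2: weight the features and train a basic ELM: a random hidden
# layer, then closed-form output weights via the pseudo-inverse.
Xw = X * w
W_in = rng.normal(size=(5, 20))   # random input-to-hidden weights (fixed)
b = rng.normal(size=20)           # random hidden biases (fixed)
H = np.tanh(Xw @ W_in + b)        # hidden-layer activations
beta = np.linalg.pinv(H) @ y      # output weights, no iterative training

pred = (np.tanh(Xw @ W_in + b) @ beta > 0.5).astype(float)
accuracy = (pred == y).mean()
```

The closed-form output layer is what makes ELM training fast; the MLR weights simply rescale the inputs before the random projection.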
The expansion of web applications such as e-commerce and other services has yielded an exponential increase in offers and choices on the web, and recommender systems have arisen to meet this need. This research proposes a recommender system that uses users' reviews as implicit feedback, extracting user preferences from the reviews to enhance personalization beyond explicit ratings alone. Diversity is also improved by applying the k-furthest-neighbor algorithm to user clusters. The system was tested on the standard Douban movie dataset from Kaggle and showed good performance.
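The diversity mechanism named here, k-furthest neighbors, simply inverts the classic kNN ranking: candidates are drawn from the most dissimilar users. A minimal sketch on hypothetical rating profiles:

```python
import math

def k_furthest_neighbors(target, users, k):
    """Rank other users by DESCENDING distance to the target profile.
    Recommendations drawn from these dissimilar users diversify the list
    (the inverse of classic k-nearest-neighbor selection)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(users.items(), key=lambda kv: dist(target, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# hypothetical rating profiles (one value per movie)
profiles = {"ann": (5, 1, 4), "bob": (1, 5, 2), "eve": (5, 2, 4)}
target = (5, 1, 5)
```

In practice the furthest neighbors are taken from within user clusters, as the abstract describes, so the suggestions stay relevant while avoiding near-duplicates of the user's own taste.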
When choosing a material for an engineering application, the designer must find the optimum match between the object's technical and economic needs and the performance and production characteristics of the candidate materials. This study proposes an integrated (hybrid) strategy for selecting the optimal material for an engineering design based on the design requirements. The primary objective is to determine the best candidate material for drone wings using Ashby's performance indices and then rank the results using a grey relational technique with the entropy weight method. Aluminum alloys, titanium alloys, composites, and wood have been suggested as suitable materials for manufacturing drone wings. The requirement …
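The ranking step named in this abstract, grey relational analysis with entropy-derived criterion weights, can be sketched as follows. The decision matrix values are invented for illustration; only the method is taken from the abstract:

```python
import numpy as np

# hypothetical decision matrix: rows = candidate wing materials,
# columns = benefit-type criteria (all "larger is better" for brevity)
materials = ["Al alloy", "Ti alloy", "Composite", "Wood"]
X = np.array([[0.70, 0.60, 0.80],
              [0.90, 0.40, 0.60],
              [0.95, 0.90, 0.50],
              [0.30, 0.80, 0.90]])

# entropy weight method: normalize each column, compute its entropy,
# and weight each criterion by its (normalized) information content
P = X / X.sum(axis=0)
E = -(P * np.log(P)).sum(axis=0) / np.log(len(materials))
w = (1 - E) / (1 - E).sum()

# grey relational analysis: compare each row to the ideal alternative
ideal = X.max(axis=0)                      # column-wise best values
delta = np.abs(X - ideal)
rho = 0.5                                  # conventional distinguishing coefficient
xi = (delta.min() + rho * delta.max()) / (delta + rho * delta.max())
grade = (xi * w).sum(axis=1)               # weighted grey relational grade
ranking = [materials[i] for i in np.argsort(-grade)]
```

The material with the highest weighted grey relational grade is ranked first; the entropy weights reward criteria that actually discriminate between the candidates.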
Intrusion-detection systems (IDSs) aim to detect attacks against computer systems and networks or, more generally, against information systems. Most diseases in the human body are discovered through Deoxyribonucleic Acid (DNA) investigations. In this paper, DNA sequences are utilized for intrusion detection by proposing an approach to detect network attacks. The proposed approach is a misuse intrusion detection system consisting of three stages. First, a DNA sequence is generated for network traffic taken from the Knowledge Discovery and Data Mining (KDD Cup 99) dataset. Then the Teiresias algorithm, which is used to detect sequences in human DNA and assists researchers in decoding the human genome, is used to discover the Shortest Tandem Repeat (STR) …
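The first two stages, encoding traffic records as a DNA-like string and then hunting for a tandem repeat, can be illustrated roughly as below. The encoding table is invented for illustration (the paper derives its own from the KDD Cup 99 fields), and Teiresias is not reimplemented; a brute-force repeat scan stands in:

```python
# map each connection attribute value to a nucleotide pair, so a traffic
# record becomes a DNA-like string; repeated motifs in strings built from
# attack records can then serve as detection signatures.
ENCODE = {"tcp": "AT", "udp": "CG", "http": "GA", "smtp": "TC", "SF": "AC", "REJ": "GT"}

def to_dna(record):
    """Concatenate the nucleotide codes of one traffic record's fields."""
    return "".join(ENCODE[field] for field in record)

def shortest_tandem_repeat(s, min_len=2):
    """Return the shortest substring occurring back-to-back (a tandem repeat),
    or None if the string contains no repeat of at least min_len symbols."""
    for size in range(min_len, len(s) // 2 + 1):
        for i in range(len(s) - 2 * size + 1):
            if s[i:i + size] == s[i + size:i + 2 * size]:
                return s[i:i + size]
    return None
```

For example, a traffic string in which the pattern "tcp/http" occurs twice in a row yields the tandem repeat "ATGA", which could then be matched against signatures extracted from known attacks.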
This study presents a modification of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update (H-version) based on the determinant property of the inverse of the Hessian matrix (the second derivative of the objective function). The vector s (the difference between the next solution and the current solution) is updated such that the determinant of the next inverse Hessian equals the determinant of the current inverse Hessian at every iteration. Consequently, the sequence of inverse Hessian matrices generated by the method never approaches a near-singular matrix, so the program never breaks down before the minimum value of the objective function is obtained. Moreover, the new modification of the BFGS update (H-version) …
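For reference, the standard H-version BFGS update and the determinant identity that such a modification can exploit are sketched below. The rescaling shown is one way to realize the stated constraint, reconstructed from the abstract rather than taken from the paper:

```latex
% Standard BFGS update of the inverse Hessian H_k, with
%   s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k),
%   \rho_k = 1/(y_k^\top s_k):
H_{k+1} = \left(I - \rho_k s_k y_k^\top\right) H_k
          \left(I - \rho_k y_k s_k^\top\right) + \rho_k s_k s_k^\top .
% Its determinant satisfies
\det(H_{k+1}) = \det(H_k)\,\frac{s_k^\top H_k^{-1} s_k}{y_k^\top s_k},
% so formally rescaling s_k \to \sigma_k s_k with
\sigma_k = \frac{y_k^\top s_k}{s_k^\top H_k^{-1} s_k}
% makes the ratio equal to one, i.e. \det(H_{k+1}) = \det(H_k) at every
% iteration, which keeps the sequence of matrices away from singularity.
```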
Production sites suffer from idle marketing of their products because of the lack of efficient systems that analyze and track customers' evaluations of products; as a result, some products remain untargeted despite their good quality. This research aims to build a modest model that takes two aspects into consideration. The first is identifying dependable users on the site, based on the number of products they have evaluated and their positive impact on ratings. The second is identifying products with low weights (unknown products) to be generated and recommended to users, based on a logarithmic equation and the number of co-rating users. Collaborative filtering is one of the most used knowledge discovery techniques …
Finding similarities between texts is important in many areas, such as information retrieval, automated article scoring, and short-answer categorization. Evaluating short answers is not an easy task because of the variability of natural language. Methods for calculating the similarity between texts depend on semantic or grammatical aspects. This paper discusses a method for evaluating short answers that uses semantic networks to represent the typical (correct) answer and the students' answers; a semantic network of nodes and relationships represents each answer text. Moreover, grammatical aspects are captured by measuring the similarity of parts of speech between the answers. In addition, finding hierarchical relationships between nodes in the networks …
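A minimal sketch of the two scoring components: each answer is reduced to a set of (node, relation, node) triples whose overlap gives the semantic score, and coarse part-of-speech tags are compared token by token. Both the triples and the mixing weights are invented stand-ins for the paper's full semantic-network construction:

```python
def jaccard(a, b):
    """Set overlap: |A ∩ B| / |A ∪ B|, or 0 for two empty sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# hypothetical semantic networks as (node, relation, node) triples
reference = {("water", "boils_at", "100C"), ("boiling", "is_a", "phase_change")}
student = {("water", "boils_at", "100C"), ("boiling", "is_a", "process")}

semantic_score = jaccard(reference, student)

# grammatical component: compare coarse POS tags position by position
ref_pos = ["NOUN", "VERB", "NUM"]
stu_pos = ["NOUN", "VERB", "NOUN"]
pos_score = sum(p == q for p, q in zip(ref_pos, stu_pos)) / max(len(ref_pos), len(stu_pos))

final = 0.7 * semantic_score + 0.3 * pos_score  # illustrative mixing weights
```

The student answer above shares one of two triples with the reference and two of three POS tags, so both partial matches contribute to the final grade rather than requiring an exact string match.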
The transition of customers from one telecom operator to another has a direct impact on a company's growth and revenue, and traditional classification algorithms fail to predict churn effectively. This research introduces a deep learning model for predicting which customers plan to leave for another operator. The model works on a high-dimensional, large-scale dataset. Its performance in predicting churn was measured against other classification algorithms, such as Gaussian Naïve Bayes, Random Forest, and Decision Tree. The evaluation was based on accuracy, precision, recall, F-measure, Area Under the Curve (AUC), and the Receiver Operating Characteristic (ROC) curve. The proposed deep learning model performs better than the other …
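The threshold-based metrics listed in this abstract all derive from the confusion matrix; a minimal sketch on an invented toy label set:

```python
# toy ground truth and predictions (1 = churner, 0 = stays)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))   # churners caught
tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))   # stayers kept
fp = sum(p == 1 and t == 0 for t, p in zip(y_true, y_pred))
fn = sum(p == 0 and t == 1 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)           # of predicted churners, how many real
recall = tp / (tp + fn)              # of real churners, how many caught
f_measure = 2 * precision * recall / (precision + recall)
```

AUC and the ROC curve additionally require the model's ranking scores rather than hard labels, which is why they are reported separately from the four counts above.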