Document clustering is the process of organizing a particular electronic corpus of documents into subgroups of similar text features. Formerly, a number of conventional algorithms had been applied to perform document clustering. There are current endeavors to enhance clustering performance by employing evolutionary algorithms. Thus, such endeavors became an emerging topic gaining more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It firstly provides a comprehensive inspection to the document clustering model revealing its various components with its related concepts. Then it shows and analyzes the principle research work in this topic. Finally, it compiles and classifies various objective functions, the core of the evolutionary algorithms, from the related collection of research papers. The paper ends up by addressing some important issues and challenges that can be subject of future work.
Several methods have been developed for routing problem in MANETs wireless network, because it considered very important problem in this network ,we suggested proposed method based on modified radial basis function networks RBFN and Kmean++ algorithm. The modification in RBFN for routing operation in order to find the optimal path between source and destination in MANETs clusters. Modified Radial Based Neural Network is very simple, adaptable and efficient method to increase the life time of nodes, packet delivery ratio and the throughput of the network will increase and connection become more useful because the optimal path has the best parameters from other paths including the best bitrate and best life link with minimum delays. The re
... Show MoreArabic text categorization for pattern recognitions is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writer. The categorization is accomplished stage-wise. Firstly, these document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from sectioned portions. Thirdly, F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measures and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generatio
... Show MoreAlthough text document images authentication is difficult due to the binary nature and clear separation between the background and foreground but it is getting higher demand for many applications. Most previous researches in this field depend on insertion watermark in the document, the drawback in these techniques lie in the fact that changing pixel values in a binary document could introduce irregularities that are very visually noticeable. In this paper, a new method is proposed for object-based text document authentication, in which I propose a different approach where a text document is signed by shifting individual words slightly left or right from their original positions to make the center of gravity for each line fall in with the m
... Show MoreUntil recently, researchers have utilized and applied various techniques for intrusion detection system (IDS), including DNA encoding and clustering that are widely used for this purpose. In addition to the other two major techniques for detection are anomaly and misuse detection, where anomaly detection is done based on user behavior, while misuse detection is done based on known attacks signatures. However, both techniques have some drawbacks, such as a high false alarm rate. Therefore, hybrid IDS takes advantage of combining the strength of both techniques to overcome their limitations. In this paper, a hybrid IDS is proposed based on the DNA encoding and clustering method. The proposed DNA encoding is done based on the UNSW-NB15
... Show MoreProgression in Computer networks and emerging of new technologies in this field helps to find out new protocols and frameworks that provides new computer network-based services. E-government services, a modernized version of conventional government, are created through the steady evolution of technology in addition to the growing need of societies for numerous services. Government services are deeply related to citizens’ daily lives; therefore, it is important to evolve with technological developments—it is necessary to move from the traditional methods of managing government work to cutting-edge technical approaches that improve the effectiveness of government systems for providing services to citizens. Blockchain technology is amon
... Show MoreA substantial matter to confidential messages' interchange through the internet is transmission of information safely. For example, digital products' consumers and producers are keen for knowing those products are genuine and must be distinguished from worthless products. Encryption's science can be defined as the technique to embed the data in an images file, audio or videos in a style which should be met the safety requirements. Steganography is a portion of data concealment science that aiming to be reached a coveted security scale in the interchange of private not clear commercial and military data. This research offers a novel technique for steganography based on hiding data inside the clusters that resulted from fuzzy clustering. T
... Show MoreToday with increase using social media, a lot of researchers have interested in topic extraction from Twitter. Twitter is an unstructured short text and messy that it is critical to find topics from tweets. While topic modeling algorithms such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are originally designed to derive topics from large documents such as articles, and books. They are often less efficient when applied to short text content like Twitter. Luckily, Twitter has many features that represent the interaction between users. Tweets have rich user-generated hashtags as keywords. In this paper, we exploit the hashtags feature to improve topics learned
Among the metaheuristic algorithms, population-based algorithms are an explorative search algorithm superior to the local search algorithm in terms of exploring the search space to find globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search space neighborhood for more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA is limited in terms of its premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and exploring the global regions in the search space. On the
... Show More