Document clustering is the process of organizing an electronic corpus of documents into subgroups with similar text features. A number of conventional algorithms have been applied to perform document clustering, and current efforts aim to enhance clustering performance by employing evolutionary algorithms. These efforts have become an emerging topic that has gained more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It first provides a comprehensive examination of the document clustering model, revealing its various components and related concepts. It then presents and analyzes the principal research work on this topic. Finally, it compiles and classifies the various objective functions, the core of evolutionary algorithms, from the related collection of research papers. The paper concludes by addressing some important issues and challenges that can be the subject of future work.
Finding the shortest route in wireless mesh networks is an important problem. Many techniques are used to solve it, such as dynamic programming, evolutionary algorithms, weighted-sum techniques, and others. In this paper, we use dynamic programming techniques to find the shortest path in wireless mesh networks because of their generality, reduction of complexity and facilitation of numerical computation, simplicity in incorporating constraints, and their conformity to the stochastic nature of some problems. The routing problem is a multi-objective optimization problem with constraints such as path capacity and end-to-end delay. Single-constraint routing problems and solutions using Dijkstra, Bellman-Ford, and Floyd-Warshall algorith
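As a minimal illustration of the dynamic programming view of shortest-path routing named in the abstract above, the following sketch runs the Bellman-Ford algorithm on a small mesh. The node names, link costs, and the function name `bellman_ford` are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch handles a single additive cost without the capacity or delay constraints the abstract mentions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (from_node, to_node, link_cost)


def bellman_ford(nodes: List[str], edges: List[Edge], source: str) -> Dict[str, float]:
    """Return the cheapest known cost from `source` to every node.

    Dynamic programming idea: after k relaxation rounds, dist[v] is the
    cost of the cheapest path to v that uses at most k links.
    """
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0.0

    # A shortest path without cycles uses at most len(nodes) - 1 links.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:  # early exit once no estimate improves
            break

    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-cost cycle")
    return dist


# Hypothetical four-node mesh with directed links (illustrative values only).
nodes = ["A", "B", "C", "D"]
edges = [
    ("A", "B", 4.0),
    ("A", "C", 1.0),
    ("C", "B", 2.0),
    ("B", "D", 5.0),
    ("C", "D", 9.0),
]
dist = bellman_ford(nodes, edges, "A")
print(dist)
```

In this toy mesh the cheapest route to D goes through C and then B rather than over the direct C-to-D link, which is the kind of decision the relaxation step makes one link at a time.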
Multilocus haplotype analysis of candidate variants with genome wide association studies (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet the challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease-penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated d
Load shedding has been widely implemented as a fast remedy for supply-demand imbalance. It is therefore crucial to investigate supply-demand balancing in order to protect the network from collapse and to sustain stability as far as possible, although load shedding itself is mostly undesirable. One way to minimize the amount of load shedding is to integrate renewable energy resources, such as wind power, into electric power generation, which could contribute significantly to reducing power cuts because of its ability to improve the stability of the electric grid. This paper proposes a method for shedding load based on demand priority while incorporating the wind po
Text categorization refers to the process of grouping text or documents into classes or categories according to their content. The text categorization process consists of three phases: preprocessing, feature extraction, and classification. In comparison to English, only a few studies have addressed the categorization and classification of Arabic text. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of research from the last five years based on the dataset, year, algorithms and the accuracy th
Medicine and computer science are often combined to produce impressive results in both fields, using applications, programs, and algorithms provided by the data mining field. The title of the present research contains the term hygiene, which may be described as the principle of maintaining the cleanliness of the external body. Environmental hygienic hazards can present themselves in various media, e.g. air, water, and soil. The influence they can exert on our health is very complex and may be modulated by our genetic makeup, psychological factors, and our perceptions of the risks that they present. Our main concern in this research is not to improve general health but rather to propose a data mining approach
Data generated by modern applications and the internet in healthcare is extensive and rapidly expanding. Therefore, one of the significant success factors for any application is understanding and extracting meaningful information using digital analytics tools. These tools positively impact the application's performance and help address the challenges of creating highly consistent, logical, and information-rich summaries. This paper has three main objectives. First, it provides several analytics methodologies that help to analyze datasets and extract useful information from them as preprocessing steps in any classification model to determine the dataset characteristics. Also, this paper provides a comparative st
Retinopathy of prematurity (ROP) can cause blindness in premature neonates. It is diagnosed when new blood vessels form abnormally in the retina. Infants at high risk of ROP might benefit significantly from early detection and treatment, so early diagnosis of ROP is vital in averting visual impairment. However, due to a lack of medical experience in detecting this condition, many people refuse treatment; this is especially troublesome given the rising cases of ROP. To deal with this problem, we trained three transfer learning models (VGG-19, ResNet-50, and EfficientNetB5) and a convolutional neural network (CNN) to identify the zones of ROP in preterm newborns. The dataset to train th
The continuous growth of current telecommunication infrastructures has created many challenges for existing algorithms in the underlying optimization problems. The unrealistic assumptions and low efficiency of traditional algorithms make them unable to solve large real-life problems in reasonable time.
The use of approximate optimization techniques, such as adaptive metaheuristic algorithms, has become more prevalent across diverse research areas. In this paper, we propose the use of a self-adaptive differential evolution (jDE) algorithm to solve the radio network planning (RNP) problem in the context of the upcoming 5G generation. The experimental results prove the jDE with best vecto
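To make the self-adaptation idea behind jDE concrete, here is a minimal sketch in which each individual carries its own scale factor F and crossover rate CR, which are occasionally resampled before mutation and kept only if the resulting trial vector survives selection. Everything specific here is an assumption for illustration: the sphere objective stands in for the radio network planning cost, the DE/rand/1/bin strategy is used rather than whichever mutation variant the paper evaluates, and the constants are the commonly cited jDE defaults, not values reported in the abstract.

```python
import random
from typing import Callable, List, Tuple


def jde(objective: Callable[[List[float]], float], dim: int,
        bounds: Tuple[float, float], pop_size: int = 20,
        generations: int = 200, seed: int = 0) -> Tuple[List[float], float]:
    """Minimize `objective` with a jDE-style self-adaptive DE (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    f_low, f_up, tau1, tau2 = 0.1, 0.9, 0.1, 0.1  # commonly cited jDE defaults

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    f = [0.5] * pop_size   # per-individual scale factor F
    cr = [0.9] * pop_size  # per-individual crossover rate CR

    for _ in range(generations):
        new_pop, new_fit, new_f, new_cr = pop[:], fit[:], f[:], cr[:]
        for i in range(pop_size):
            # Self-adaptation: resample F and CR with small probability.
            f_i = f_low + rng.random() * f_up if rng.random() < tau1 else f[i]
            cr_i = rng.random() if rng.random() < tau2 else cr[i]

            # DE/rand/1 mutation with binomial crossover (assumed strategy).
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr_i or d == j_rand:
                    v = pop[a][d] + f_i * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][d]
                trial.append(v)

            # Greedy selection: control parameters survive with their trial.
            t_fit = objective(trial)
            if t_fit <= fit[i]:
                new_pop[i], new_fit[i] = trial, t_fit
                new_f[i], new_cr[i] = f_i, cr_i
        pop, fit, f, cr = new_pop, new_fit, new_f, new_cr

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def sphere(x: List[float]) -> float:
    """Toy objective standing in for a real planning cost function."""
    return sum(v * v for v in x)


best_x, best_f = jde(sphere, dim=5, bounds=(-5.0, 5.0))
print(best_f)
```

Tying F and CR to the individual means that parameter values producing better trial vectors propagate through selection, which removes the need to hand-tune them for each problem instance.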