Document clustering is the process of organizing a particular electronic corpus of documents into subgroups of similar text features. Formerly, a number of conventional algorithms had been applied to perform document clustering. There are current endeavors to enhance clustering performance by employing evolutionary algorithms. Thus, such endeavors became an emerging topic gaining more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It firstly provides a comprehensive inspection to the document clustering model revealing its various components with its related concepts. Then it shows and analyzes the principle research work in this topic. Finally, it compiles and classifies various objective functions, the core of the evolutionary algorithms, from the related collection of research papers. The paper ends up by addressing some important issues and challenges that can be subject of future work.
An accurate assessment of the pipes’ conditions is required for effective management of the trunk sewers. In this paper the semi-Markov model was developed and tested using the sewer dataset from the Zublin trunk sewer in Baghdad, Iraq, in order to evaluate the future performance of the sewer. For the development of this model the cumulative waiting time distribution of sewers was used in each condition that was derived directly from the sewer condition class and age data. Results showed that the semi-Markov model was inconsistent with the data by adopting ( 2 test) and also, showed that the error in prediction is due to lack of data on the sewer waiting times at each condition state which can be solved by using successive conditi
... Show MoreThis paper presents a parametric audio compression scheme intended for scalable audio coding applications, and is particularly well suited for operation at low rates, in the vicinity of 5 to 32 Kbps. The model consists of two complementary components: Sines plus Noise (SN). The principal component of the system is an. overlap-add analysis-by-synthesis sinusoidal model based on conjugate matching pursuits. Perceptual information about human hearing is explicitly included into the model by psychoacoustically weighting the pursuit metric. Once analyzed, SN parameters are efficiently quantized and coded. Our informal listening tests demonstrated that our coder gave competitive performance to the-state-of-the- art HelixTM Producer Plus 9 from
... Show MoreIn recent years, there has been expanding development in the vehicular part and the number of vehicles moving on the road in all the sections of the country. Vehicle number plate identification based on image processing is a dynamic area of this work; this technique is used for security purposes such as tracking of stolen cars and access control to restricted areas. The License Plate Recognition System (LPRS) exploits a digital camera to capture vehicle plate numbers is used as input to the proposed recognition system. Basically, the developing system is consist of three phases, vehicle license plate localization, character segmentation, and character recognition, the License Plate (LP) detection is presented using canny
... Show MoreAspect categorisation and its utmost importance in the eld of Aspectbased Sentiment Analysis (ABSA) has encouraged researchers to improve topic model performance for modelling the aspects into categories. In general, a majority of its current methods implement parametric models requiring a pre-determined number of topics beforehand. However, this is not e ciently undertaken with unannotated text data as they lack any class label. Therefore, the current work presented a novel non-parametric model drawing a number of topics based on the semantic association present between opinion-targets (i.e., aspects) and their respective expressed sentiments. The model incorporated the Semantic Association Rules (SAR) into the Hierarchical Dirichlet Proce
... Show MoreThere has been a growing interest in the use of chaotic techniques for enabling secure communication in recent years. This need has been motivated by the emergence of a number of wireless services which require the channel to provide very low bit error rates (BER) along with information security. This paper investigates the feasibility of using chaotic communications over Multiple-Input Multiple-Output (MIMO) channels by combining chaos modulation with a suitable Space Time Block Code (STBC). It is well known that the use of Chaotic Modulation techniques can enhance communication security. However, the performance of systems using Chaos modulation has been observed to be inferior in BER performance as compared to conventional communication
... Show MorePure and doped TiO 2 with Bi films are obtained by pulse laser deposition technique at RT under vacume 10-3 mbar, and the influence of Bi content on the photocvoltaic properties of TiO 2 hetrojunctions is studied. All the films display photovoltaic in the near visible region. A broad double peaks are observed around λ= 300nm for pure TiO 2 at RT in the spectral response of the photocurrent, which corresponds approximately to the absorption edge and this peak shift to higher wavelength (600 nm) when Bi content increase by 7% then decrease by 9%. The result is confirmed with the decreasing of the energy gap in optical properties. Also, the increasing is due to an increase in the amount of Bi content, and shifted to 400nm when annealed at 523
... Show MoreThis paper presents a new algorithm in an important research field which is the semantic word similarity estimation. A new feature-based algorithm is proposed for measuring the word semantic similarity for the Arabic language. It is a highly systematic language where its words exhibit elegant and rigorous logic. The score of sematic similarity between two Arabic words is calculated as a function of their common and total taxonomical features. An Arabic knowledge source is employed for extracting the taxonomical features as a set of all concepts that subsumed the concepts containing the compared words. The previously developed Arabic word benchmark datasets are used for optimizing and evaluating the proposed algorithm. In this paper,
... Show MoreThis paper interest to estimation the unknown parameters for generalized Rayleigh distribution model based on censored samples of singly type one . In this paper the probability density function for generalized Rayleigh is defined with its properties . The maximum likelihood estimator method is used to derive the point estimation for all unknown parameters based on iterative method , as Newton – Raphson method , then derive confidence interval estimation which based on Fisher information matrix . Finally , testing whether the current model ( GRD ) fits to a set of real data , then compute the survival function and hazard function for this real data.