Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth calls for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. This study reduces both problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving document clustering accuracy. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using the dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed method leads to more accurate document clustering.
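The limitation of the bag-of-words baseline described above can be illustrated with a minimal sketch. It is not the study's pipeline: the three toy documents are hypothetical, and the weighting `tf * log(N / df)` is one common TFIDF variant assumed here for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus (not taken from the 20 Newsgroups sample).
docs = [
    "the bank approved the loan",
    "the lender granted the credit",
    "the river bank flooded",
]
vecs = tfidf_vectors(docs)
sim_same_topic = cosine(vecs[0], vecs[1])   # related meaning, no shared content words
sim_shared_word = cosine(vecs[0], vecs[2])  # unrelated meaning, shared word "bank"
```

In this toy corpus the two finance sentences get zero similarity because they share no content words, while the finance and river sentences get a positive score through the ambiguous word "bank". This is the kind of semantic gap that a representation encoding word relationships is intended to narrow.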
Various human biometrics are in use today, and the face is one of the most important. Many face recognition techniques have been proposed, but they still struggle with images captured in uncontrolled environments and in real-life applications. The challenges include pose variation, occlusion, facial expression, poor illumination, and image quality, and new techniques are continuously being developed. In this paper, singular value decomposition is used to extract the feature matrix for face recognition and classification. The input color image is converted into a grayscale image and then transformed into a local ternary pattern before splitting the image into ...
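The local ternary pattern step named in this abstract can be sketched for a single pixel as follows. The abstract is truncated before it gives any parameters, so the threshold value, the clockwise neighbour ordering, and the 3x3 patch below are all assumptions made for illustration. The sketch uses the common formulation in which the ternary code is split into an "upper" and a "lower" binary pattern.

```python
def local_ternary_pattern(image, row, col, threshold):
    """Return (upper, lower) 8-bit LTP codes for the pixel at (row, col).

    A neighbour contributes to the upper code if it is at least `threshold`
    brighter than the centre, to the lower code if it is at least `threshold`
    darker, and to neither code otherwise.
    """
    center = image[row][col]
    # Eight neighbours, clockwise from the top-left corner (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = lower = 0
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = image[row + dr][col + dc]
        if neighbour >= center + threshold:
            upper |= 1 << bit
        elif neighbour <= center - threshold:
            lower |= 1 << bit
    return upper, lower

# Hypothetical 3x3 grayscale patch; the centre pixel has intensity 50.
patch = [
    [60, 52, 40],
    [58, 50, 49],
    [70, 45, 30],
]
upper, lower = local_ternary_pattern(patch, 1, 1, threshold=5)
```

Neighbours within the threshold band around the centre (here 52 and 49) set no bit in either code, which is what makes the descriptor less sensitive to small intensity noise than a purely binary comparison.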
The inhibitory behavior of L-cysteine (Cys) and its derivatives towards iron corrosion was investigated using density functional theory (DFT). The study evaluates global and local reactivity descriptors of Cys in its protonated and neutral forms, and the changes in reactivity when Cys is combined into di- and tripeptides. The inhibitory effect of di- and tripeptides increases because the number of reaction centers in the molecular structure increases. We computed the adsorption energies (Eads) and the most stable low-energy complexes for the adsorption of Cys and small peptides onto the Fe (1 1 1) surface. We found that the adsorption of tripeptides onto ...
A Strength Pareto Evolutionary Algorithm 2 (SPEA2) approach for solving the multi-objective Environmental/Economic Power Dispatch (EEPD) problem is presented in this paper. In the past, minimizing fuel cost was the sole objective of the economic power dispatch (EPD) problem. Since the Clean Air Act amendments were applied to reduce SO2 and NOx emissions from power plants, utilities have changed their strategies to reduce pollution and atmospheric emissions as well. Adding emission minimization as a second objective function made EPD a multi-objective problem with conflicting objectives. SPEA2 is an improved version of SPEA with better fitness assignment, density estimation, an ...
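The idea of conflicting objectives can be made concrete with a small sketch of Pareto dominance and the strength-based raw fitness used in SPEA2. It covers only the raw fitness and omits density estimation and archive truncation. The (fuel cost, emission) pairs are hypothetical values chosen for illustration, not results from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (both objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def raw_fitness(points):
    """Raw fitness per point: the sum of the strengths of the points that
    dominate it, where a point's strength is how many points it dominates.
    Non-dominated points get 0 (lower is better)."""
    strength = [sum(dominates(p, q) for q in points) for p in points]
    return [
        sum(strength[j] for j, q in enumerate(points) if dominates(q, p))
        for p in points
    ]

# Hypothetical dispatch candidates as (fuel cost, emission) pairs.
candidates = [(600.0, 0.22), (610.0, 0.20), (640.0, 0.19), (620.0, 0.23), (650.0, 0.21)]
front = non_dominated(candidates)
fitness = raw_fitness(candidates)
```

The first three candidates trade cost against emission and none is better than another in both objectives, so all three stay on the front. The last two are each beaten on both objectives by some other candidate and receive a non-zero raw fitness.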
This work implements an electroencephalogram (EEG) signal classifier. The method uses Orthogonal Polynomials (OP) to convert the EEG signal samples to moments. A Sparse Filter (SF) reduces the number of converted moments to increase classification accuracy. A Support Vector Machine (SVM) classifies the reduced moments into two classes. The proposed method's performance is tested on two datasets and compared with two other methods. The datasets are divided into 80% for training and 20% for testing, with 5-fold cross-validation. The results show that this method surpasses the accuracy of the other methods. The proposed method's best accuracy is 95.6% and 99.5% on the two datasets, respectively. Finally, from the results, it ...
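The evaluation protocol described here (an 80/20 split with 5-fold cross-validation) can be sketched with index lists alone. The abstract does not say whether the folds are drawn from the training portion only, so that choice is an assumption here, as are the sample count, the seed, and the function names.

```python
import random

def train_test_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold(indices, k=5):
    """Yield (train, validation) index lists so that every index is used
    for validation exactly once across the k folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical dataset of 100 EEG segments, identified only by index.
train_idx, test_idx = train_test_split(100)
splits = list(k_fold(train_idx))  # assumed: folds come from the training part only
```

Keeping the 20% test indices out of every fold means the held-out accuracy is measured on data that played no part in model selection.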