Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth has raised awareness of the need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning. Its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in text documents remain unresolved. In this study, these problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built through a combination of syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure. The semantics of words are then modeled using the dependency graph. The resulting dependency graph is then used for cluster analysis with the K-means clustering technique. The dependency graph-based clustering results were compared with those of popular text representation methods, namely TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both TF-IDF and ontology-based text representation. These findings indicate that the proposed text representation method leads to more accurate document clustering.
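To make the limitation of the bag-of-words baseline concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is an illustration only, not the study's implementation: the helper name `tfidf_vectors`, the whitespace tokenizer, the unsmoothed logarithmic IDF, and the toy sentences are all assumptions introduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive whitespace tokenization; a real pipeline would do more pre-processing.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy documents (illustrative only).
docs = ["dog bites man", "man bites dog", "stock markets fall"]
vecs = tfidf_vectors(docs)

# The first two sentences mean different things but contain the same words,
# so their bag-of-words vectors are identical.
same_representation = vecs[0] == vecs[1]
```

Because word order and grammatical roles are discarded, the two sentences about the dog and the man are indistinguishable in this representation, which is the kind of lost relationship a dependency-based representation aims to retain.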
The work includes the synthesis and characterization of some new heterocyclic compounds, as follows. Compound (3), 5-(4-chlorophenyl)-2-hydrazinyl-1,3,4-oxadiazole, was synthesized by two methods: the first involves the direct reaction between 80% hydrazine hydrate and 5-(4-chlorophenyl)-2-(ethylthio)-1,3,4-oxadiazole (1); the second involves converting 5-(4-chlorophenyl)-1,3,4-oxadiazol-2-amine (2) to a diazonium salt and then reducing this salt to compound (3) with stannous chloride. Compound (3) was used as the starting material for synthesizing several fused heterocyclic compounds. The compound 6-(4-chlorophenyl)[1,2,4]triazolo[3,4-b][1,3,4]oxadiazole-3(2H)-thione (compound 4) was synthesized from the reaction of compo
Human identification is crucial in forensics for the investigation of large-scale disasters such as fires, epidemics, earthquakes, and tsunamis. Although biometric identification using panoramic dental radiography (PDR) has been the subject of several studies in the literature, further study remains necessary and challenging. In this research, a human identification system was developed based on a convolutional neural network (CNN) and contour transform (CT). The proposed system was implemented on a total of 1540 PDRs from 302 individuals. Preprocessing was applied to the PDRs to enhance them and extract the region of interest (ROI). Features were extracted using the CT. These features were fused with features extracted
Multilocus haplotype analysis of candidate variants with genome-wide association study (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet this challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated d
An intrusion detection system plays an essential role in increasing security and reducing harm to computer security systems and information systems when a network is in use. It observes different events in a network or system to decide whether an intrusion is occurring, and it is used for strategic decision-making, security purposes, and analyzing directions. This paper describes a host-based intrusion detection system architecture for DDoS attacks, which intelligently detects intrusions periodically and dynamically by evaluating the intruder group with respect to the present node and its neighbors. We analyze a dependable dataset named CICIDS 2017 that contains benign and DDoS attack network flows, which meets certifiable criteria and is ope
Protecting information sent through insecure internet channels is a significant challenge facing researchers. In this paper, we present a novel method for image data encryption that combines chaotic maps with linear feedback shift registers in two stages. In the first stage, the image is divided into two parts. Then, the pixel locations of each part are redistributed using a random-number key generated by linear feedback shift registers. The second stage segments the image into the three primary colors red, green, and blue (RGB); the data for each color is then encrypted with one of three keys generated using three-dimensional chaotic maps. Many statistical tests (entropy, peak signa
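The first-stage idea of redistributing pixel locations with a key generated by a linear feedback shift register can be sketched as follows. This is a hedged illustration, not the paper's scheme: the 16-bit register width, the tap positions (16, 14, 13, 11), the sort-based way of turning the key stream into a permutation, and the function names `lfsr16`, `permutation_from_seed`, `scramble`, and `unscramble` are all assumptions made here. The chaotic-map stage is not shown, and the sketch is not a secure cipher on its own.

```python
def lfsr16(seed):
    """Yield successive states of a 16-bit Fibonacci LFSR (assumed taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays at zero")
    while True:
        # XOR the tapped bits to form the feedback bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        # Shift right and feed the new bit into the top position.
        state = (state >> 1) | (bit << 15)
        yield state


def permutation_from_seed(n, seed):
    """Derive a permutation of range(n) by ranking n LFSR outputs (illustrative choice)."""
    gen = lfsr16(seed)
    keys = [next(gen) for _ in range(n)]
    return sorted(range(n), key=lambda i: keys[i])


def scramble(pixels, seed):
    """Reorder a flat pixel list according to the seed-derived permutation."""
    perm = permutation_from_seed(len(pixels), seed)
    return [pixels[i] for i in perm]


def unscramble(scrambled, seed):
    """Invert scramble() by regenerating the same permutation from the seed."""
    perm = permutation_from_seed(len(scrambled), seed)
    restored = [None] * len(scrambled)
    for position, source_index in enumerate(perm):
        restored[source_index] = scrambled[position]
    return restored


# Toy "image part": a flat list of 16 pixel values (illustrative only).
pixels = list(range(10, 26))
shuffled = scramble(pixels, 0xACE1)
recovered = unscramble(shuffled, 0xACE1)
```

Because the receiver can regenerate the same key stream from the shared seed, the permutation is reversible without transmitting it, which is the property the sketch is meant to show.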
This research depends on the relationship between the reflected spectrum, the nature of each target, its area, and the percentage of its presence with other targets within a unit of the target area. Changes in land cover were detected for different years using satellite images based on Modified Spectral Angle Mapper (MSAM) processing, where Landsat satellite images were processed with two software packages (MATLAB 7.11 and ERDAS Imagine 2014). The proposed supervised classification method (MSAM), implemented in MATLAB, was used together with the supervised maximum likelihood classifier in ERDAS Imagine to obtain the most precise results and detect environmental changes over these periods. Despite using two classificatio
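For orientation, the standard (unmodified) spectral angle mapper assigns each pixel to the reference spectrum with which it forms the smallest angle. The sketch below shows only that standard form; the modification used in MSAM is not described in the text above and is not reproduced here. The function names `spectral_angle` and `classify`, and the three-band reference spectra, are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(r * r for r in reference))
    if norm == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)


def classify(pixel, references):
    """Return the class name whose reference spectrum has the smallest angle to the pixel."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))


# Hypothetical three-band reference spectra (illustrative values only).
references = {
    "water": [0.10, 0.05, 0.02],
    "vegetation": [0.05, 0.08, 0.50],
}
label = classify([0.06, 0.09, 0.45], references)
```

The angle depends only on the shape of the spectrum and not on its overall brightness, so a pixel and a scaled copy of it are assigned to the same class.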
In recent years, considerable attention has focused on gold nanoparticle synthesis owing to the unique properties and wide applicability of these particles. In most of this research, the citrate reduction method has been adopted. The aim of this study was to prepare and optimize monodisperse ultrafine particles by adding a reducing agent to a gold salt, through a seed-mediated growth mechanism. In this research, a gold nanoparticle suspension (G) was prepared by the standard Turkevich method and optimized by studying the effect of variables such as reactant concentrations, preparation temperature, and stirring rate on the size and uniformity of the nanoparticles across twenty formulas (G1-G20). Subsequently, the selected formula that pr