Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth creates a need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories and is an important task in data mining and machine learning. The accuracy of clustering depends heavily on the choice of text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings in the document, so the sparsity and semantic problems that are prevalent in textual documents remain unresolved. This study reduces the sparsity and semantic problems by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of the words are then modeled using a dependency graph. The resulting dependency graph is used in the cluster analysis, which was carried out with the K-means clustering technique. The dependency graph based clustering results were compared with those of two popular text representation methods, TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed text representation method leads to more accurate document clustering.
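For readers unfamiliar with the baseline the abstract compares against, the following is a minimal Python sketch of bag-of-words TFIDF weighting followed by K-means. It does not implement the dependency graph method. The four toy documents, the smoothed IDF formula, and the function names (`tfidf_vectors`, `kmeans`) are assumptions made for illustration, not details from the study.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, unit-length TFIDF vectors) for tokenised documents."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency per term
    # Assumed variant: idf = ln(N / df) + 1, so terms in every document keep some weight.
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = [tf[w] / len(d) * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, init, iterations=10):
    """Plain K-means; `init` lists the indices of the starting centroids."""
    centroids = [vectors[i][:] for i in init]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every document vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus (not the 20 Newsgroups sample used in the study).
docs = [
    "the team won the hockey game".split(),
    "hockey players skate on ice".split(),
    "the rocket reached orbit around the moon".split(),
    "nasa launched a rocket into space".split(),
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2, init=[0, 2])
print(labels)
```

Each vector has one dimension per vocabulary term, which is the source of the sparsity the abstract mentions: two documents are only similar when they share exact word forms.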
The Ge0.4Te0.6 alloy has been prepared. Thin films of Ge0.4Te0.6 were prepared by thermal evaporation with a thickness of 4000 Å and a deposition rate of 4.2 Å/s at a pressure of 2×10⁻⁶ Torr. The A.C. electrical conductivity of a-Ge0.4Te0.6 thin films was studied as a function of frequency for annealing temperatures in the range 423 to 623 K. The deduced exponent s was found to decrease with increasing annealing temperature over the frequency range 10² to 10⁶ Hz. Correlated barrier hopping (CBH) was found to be the dominant conduction mechanism. The dielectric constant ε1 and dielectric loss ε2 were found to decrease with frequency and increase with temperature. The activation energies have
...
As environmental problems have grown more severe, environmental protection and conservation have become important contemporary issues in both the developed and developing worlds. It was natural that increasing global awareness would alert intellectuals, scientists, and politicians to the seriousness of this problem and prompt calls for deeper and more comprehensive steps with respect to the human environment, based on the study of the various elements of this environment and a greater understanding of the relationships among them. On this basis, steps have been taken to target the environment, identify its problems, and work toward two goals: first, to stop the deterioration of the environment, and second, impro
...
Abstract
Friction stir welding (FSW) is a relatively new joining process that joins metals without fusion or filler materials. In this study, the effect of welding parameters on the mechanical properties of aluminum alloy AA2024-T351 joints produced by FSW was investigated.
Welding speed (6 to 34 mm/min) and rotational speed (725 to 1235 rpm) were varied as input factors to determine their influence on the main responses: elongation, tensile strength, and maximum bending force. The main responses were measured experimentally and analyzed using the DESIGN EXPERT 8 experimental design software, which was used to develop t
...
500 diarrheal stool samples were collected from patients of both genders and different ages (less than 1 year up to 30 years) at Alwiya Hospital for Children, Al-Kendi, the central public health laboratory, and some governorate labs during the period 1/11/2009 to 1/10/2010. Bacterial and parasitic agents were isolated and identified from the patients with diarrhea. Nine species of gram-negative bacteria from the Enterobacteriaceae were isolated. E. coli was isolated at the highest rate (4.8% of all samples), followed by Salmonella typhi (4.6%), while the lowest rate was for Citrobacter freundii (0.4%); the other identified species fell between these rates. Plesiomonas shigelloides was also isolated, which is considered one of the bacteria in local studies. Many met
...
Background: In the past, an association between tuberculosis (TB) and diabetes mellitus (DM) was widely accepted; today, the potential public health and clinical importance of this relationship seems to be largely ignored. The national clinical and policy guidance in the UK on the control of TB, for example, does not consider the relationship with DM. Objectives: To determine the risk of association between diabetes mellitus and pulmonary TB. Methods: A retrospective study was conducted at Ibn Zuhr Hospital for Chest Diseases from Jan 2008 to Sep 2010. The study included 402 patients with TB, divided into diabetic and non-diabetic groups: 96 (23.8%) were diabetic, while the other 306 were non-diabetic. Results: Risk of TB among DM patients were cle
...
Many academics have concentrated on applying machine learning to retrieve information from databases so that researchers can work more effectively. A difficult issue in prediction models is selecting practical strategies that yield satisfactory forecast accuracy. Traditional software testing techniques have been extended to machine learning systems; however, they are insufficient because of the diversity of problems that such systems create. Hence, the proposed methodologies were used to predict flight prices. A variety of artificial intelligence algorithms are used to attain the required accuracy, such as Bayesian modeling techniques such as Stochastic Gradient Descent (SGD), Adaptive boosting (ADA), Deci
...
Abstract: Data mining has become very important at the present time, especially as the volume of information has grown so large that data mining is needed to manage and make use of it. One data mining technique is association rule mining; this work uses the pattern growth method, an enhancement of Apriori. The pattern growth method depends on the FP-tree structure. This paper presents a modification of the FP-tree algorithm, called HFMFFP-Growth, which divides the dataset and, for each part, takes the most frequent item in the FP-tree, so that the final conditional tree has fewer nodes than the original FP-tree and requires less memory space and time.
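As background for the structure the abstract modifies, here is a minimal Python sketch of building a standard FP-tree. It is not the HFMFFP-Growth variant, whose partitioning details are not given in the abstract. The transactions, the `min_support` value, and the names `FPNode`, `build_fp_tree`, and `count_nodes` are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and its child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree; returns (root, supports of frequent items)."""
    # First pass: count the support of each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    root = FPNode(None)
    # Second pass: insert each transaction as a path of frequent items.
    for t in transactions:
        # Order by descending support, ties broken alphabetically,
        # so that transactions with common items share prefixes.
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical market-basket transactions for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
root, frequent = build_fp_tree(transactions, min_support=3)
print(sorted(frequent), count_nodes(root))
```

The node count returned by `count_nodes` is the quantity the abstract aims to reduce: fewer nodes in the tree means less memory and less traversal time during mining.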
Roofs in Iraqi houses are normally flat concrete panels, which have poor thermal properties. Using materials with low thermal conductivity and high specific heat improves the thermal properties of the concrete panel and, in turn, the indoor room temperature. A Mathcad program based on a mathematical model employing complex Fourier series was built for a single-room building. The model inputs are the ambient temperature, solar radiation, and sol-air temperature, which are treated as periodic functions of time. The room construction is held constant, as determined by its materials, except for the roof properties, which are taken as a variable generated practically from the
...
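To illustrate how periodic inputs such as ambient and sol-air temperature can be expressed as a complex Fourier series, the following is a rough Python sketch. The study itself used Mathcad, and none of its equations or data appear in the abstract. The sol-air relation used here is the common simplified form T_sa = T_o + αI/h_o with the long-wave correction neglected, and the hourly profiles, absorptivity, and outside film coefficient are assumed values.

```python
import cmath
import math


def sol_air_temperature(t_ambient, solar, absorptivity=0.6, h_out=17.0):
    """Simplified sol-air temperature, T_sa = T_o + alpha * I / h_o.

    absorptivity (-) and h_out (W/m^2.K) are assumed illustrative values;
    the long-wave radiation correction term is neglected.
    """
    return [t + absorptivity * s / h_out for t, s in zip(t_ambient, solar)]


def fourier_coefficients(samples):
    """Complex Fourier coefficients c_k of one period of equally spaced samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)) / n
        for k in range(n)
    ]


def reconstruct(coeffs, i, harmonics):
    """Evaluate the series truncated to `harmonics` terms at sample index i.

    Intended for harmonics < n / 2, where c_k and c_(n-k) form conjugate pairs.
    """
    n = len(coeffs)
    total = coeffs[0]
    for k in range(1, harmonics + 1):
        angle = 2 * math.pi * k * i / n
        total += coeffs[k] * cmath.exp(1j * angle)
        total += coeffs[n - k] * cmath.exp(-1j * angle)
    return total.real


# Hypothetical 24-hour profiles (not measured data from the study).
hours = range(24)
ambient = [35 + 8 * math.cos(2 * math.pi * (h - 15) / 24) for h in hours]  # deg C
solar = [max(0.0, 900 * math.sin(math.pi * (h - 6) / 12)) for h in hours]  # W/m^2
sol_air = sol_air_temperature(ambient, solar)

ambient_coeffs = fourier_coefficients(ambient)
sol_air_coeffs = fourier_coefficients(sol_air)
print(round(sol_air_coeffs[0].real, 2))  # daily mean sol-air temperature
```

The zeroth coefficient is the daily mean, and each higher harmonic can be passed separately through a periodic heat-conduction solution for the roof, which is the usual reason for choosing a Fourier representation.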