Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the choice of text representation method. Traditional methods represent documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This representation ignores the relationships between words and their meanings in the document, so the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, these problems are reduced by proposing a graph-based text representation method, the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in the cluster analysis, which was performed with the K-means clustering technique. The dependency graph based clustering results were compared with those of two popular text representation methods, TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both, indicating that the proposed text representation method leads to more accurate document clustering.
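The limitation of the bag-of-words baseline described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tfidf`, the whitespace tokenizer, the weighting `tf * log(N / df)` and the three toy sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy documents (invented for this sketch, not taken from 20 Newsgroups).
docs = [
    "the dog chased the cat",
    "the cat chased the dog",
    "stocks fell sharply today",
]
vectors = tfidf(docs)

# The first two sentences describe opposite events, yet their vectors are
# identical because a bag of words discards word order and grammatical roles.
same_vector = vectors[0] == vectors[1]

# Each vector only has entries for the terms its document contains, which is
# the source of sparsity once the vocabulary grows.
missing_term = "stocks" not in vectors[0]
```

A representation that keeps who-did-what-to-whom relations, such as the dependency graph the abstract proposes, would be able to tell the first two sentences apart.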
The Generalized Additive Model (GAM) has been considered a multivariate smoother that appeared recently in nonparametric regression analysis. This research is devoted to studying the mixed situation, i.e. phenomena whose behaviour changes from linear (with a known functional form), represented in the parametric part of the model, to nonlinear (with an unknown functional form, here a smoothing spline), represented in the nonparametric part. Furthermore, we propose a robust semiparametric GAM estimator, which is compared with two other existing techniques.
Water flowing from Turkey is the main source of the Tigris River in Iraq. The expansion of water projects on the river has become a source of concern for those working in water resources management in Iraq. Unfortunately, there is still no agreement between Iraq and Turkey on sharing the water of this international river. Consequently, the optimal operation of water resources systems, particularly multi-objective, multi-reservoir systems, is urgently needed at present.
In this research, two approaches were used, dynamic programming (DP) and a simulation model, to find the optimal monthly operation of Ilisu Dam (from an Iraqi point of view) through a comp
The purpose of this research is to analyze the relationship between emotional intelligence and the leadership personality of managers. The research was conducted at the College of Administration and Economics, University of Baghdad, on a sample of 67 members and units of the college. A questionnaire was used as the main tool for collecting data and information. To reach its conclusions, the research tested two main hypotheses concerning the correlation and the effect between the two main variables of the research. Statistical techniques such as the mean, standard deviation, percentages, Spearman correlation coefficient, and simple regression were us
The following list comprises sixty-one species and subspecies of coccinellid beetles belonging to twenty-two genera distributed among six tribes in three subfamilies. All the species and subspecies have been recorded for Iraq. The categories have been arranged systematically according to Korschefsky's (1931) catalogue.
The equilibrium adsorption isotherm for the removal of trifluralin from aqueous solutions using ? –alumina clay has been studied. The results show that the isotherms were of type S3 according to the Giles classification. The effects of experimental parameters such as contact time, adsorbent dosage, pH, and temperature on the adsorption capacity have been investigated. The adsorption isotherms obeyed the Freundlich adsorption isotherm (R² = 0.91249 to 0.8149). The thermodynamic parameters were calculated from the adsorption process at five different temperatures; the values of ΔH, ΔG and ΔS were (-1.0625) kJ·mol⁻¹, (7.628 to 7.831) kJ·mol⁻¹ and (-2.7966 to -2.9162) kg.
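The Freundlich fit mentioned in the trifluralin abstract is commonly obtained by linearizing q_e = K_F · C_e^(1/n) as log q_e = log K_F + (1/n) log C_e and fitting a straight line. The sketch below shows that procedure under stated assumptions: the function name `fit_freundlich` is hypothetical, and the concentration and uptake values are synthetic numbers generated from K_F = 2 and n = 2, not data from the study.

```python
import math


def fit_freundlich(c_eq, q_eq):
    """Least-squares fit of log10(q) = log10(K_F) + (1/n) * log10(C).

    Hypothetical helper for illustration. Returns (K_F, 1/n, R^2), where
    R^2 is computed on the log-transformed data.
    """
    xs = [math.log10(c) for c in c_eq]
    ys = [math.log10(q) for q in q_eq]
    n_pts = len(xs)
    x_mean = sum(xs) / n_pts
    y_mean = sum(ys) / n_pts
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # estimate of 1/n
    intercept = y_mean - slope * x_mean    # estimate of log10(K_F)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return 10 ** intercept, slope, r_squared


# Synthetic equilibrium data (assumed, noise-free): q = 2 * C ** 0.5.
c_eq = [1.0, 4.0, 9.0, 16.0]   # equilibrium concentration, arbitrary units
q_eq = [2.0, 4.0, 6.0, 8.0]    # amount adsorbed, arbitrary units

k_f, inv_n, r_squared = fit_freundlich(c_eq, q_eq)
```

With real, noisy measurements the R² of this log-log regression would fall below 1, which is the kind of figure the abstract reports.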
