Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This approach ignores the relationships between words and their meanings, so the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, these problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure; the semantics of words are then modeled using a dependency graph, which is used in the cluster analysis. The K-means clustering technique was used. The dependency-graph-based clustering results were compared with popular text representation methods, i.e., TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
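The limitation of the bag-of-words model described in this abstract can be illustrated with a minimal sketch. The toy sentences, the helper names `tfidf_vectors` and `cosine`, and the specific weighting (relative term frequency times log(N/df)) are assumptions chosen for illustration; they are not the study's implementation or dataset.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Relative term frequency times log inverse document frequency.
        vectors.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus: the first two sentences mean different things.
docs = [
    "the dog chased the cat".split(),
    "the cat chased the dog".split(),
    "stock prices fell sharply".split(),
]
vocab, vecs = tfidf_vectors(docs)

print(vecs[0] == vecs[1])        # word order is lost in the representation
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

In this sketch the first two sentences receive identical vectors even though who chased whom differs, which is the kind of relational information a dependency-based representation is meant to retain.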
True random number generators are essential components for secure, confidential communications. In this paper, a new method is proposed to generate random sequences of numbers based on the difference of the arrival times of photons detected in a coincidence window between two single-photon counting modules
In this study, we created a new Arabic dataset annotated according to Ekman’s basic emotions (Anger, Disgust, Fear, Happiness, Sadness and Surprise). The dataset is composed of Facebook posts written in the Iraqi dialect. We evaluated its quality using four external judges, which resulted in an average inter-annotator agreement of 0.751. We then explored six different supervised machine learning methods to test the new dataset. We used the standard Weka classifiers ZeroR, J48, Naïve Bayes, Multinomial Naïve Bayes for Text, and SMO, as well as a compression-based classifier called PPM that is not included in Weka. Our study reveals that the PPM classifier significantly outperforms other classifiers such as SVM and N
The concept of the correlation coefficient as a measure linking two variables draws our attention to the subject of statistics at all levels. Moreover, there are three particular points that are usually emphasized, as follows:
1) The correlation coefficient is a standardized index whose value does not depend on the units of measurement of the original variables.
2) Its value lies in the range [-1, 1].
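The two properties listed above can be checked numerically. The following is a minimal sketch using the Pearson product-moment coefficient; the function name `pearson_r` and the height and weight values are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements in metric units.
height_cm = [150.0, 160.0, 165.0, 172.0, 181.0]
weight_kg = [52.0, 58.0, 66.0, 70.0, 83.0]
r = pearson_r(height_cm, weight_kg)

# The same measurements expressed in different units (inches, pounds).
height_in = [h / 2.54 for h in height_cm]
weight_lb = [w * 2.20462 for w in weight_kg]
r_rescaled = pearson_r(height_in, weight_lb)

# A perfectly decreasing linear relationship sits at the lower bound.
r_negative = pearson_r(height_cm, [-h for h in height_cm])

print(r, r_rescaled, r_negative)
```

Changing the units rescales both the covariance and the standard deviations by the same factors, so the ratio is unchanged up to floating-point rounding.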
Abstract
In this study, we compare autoregressive approximation methods (Yule-Walker equations, least squares, forward-backward least squares, and Burg’s geometric and harmonic methods) to determine the optimal approximation to time series generated from a non-invertible first-order moving average process and from a fractionally integrated noise process, with several values of d (d = 0.15, 0.25, 0.35, 0.45) and different sample sizes (small, medium, large) for the two processes. We rely on the figure of merit function proposed by Shibata in 1980 to determine the theoretical optimal order according to min
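As an illustration of the first method named above, the sketch below solves the Yule-Walker equations with the Levinson-Durbin recursion. It is a sketch under stated assumptions, not the study's procedure: it uses the theoretical autocovariances of a non-invertible MA(1) process x_t = e_t + e_{t-1} with unit innovation variance (an assumed parameterization) instead of simulated samples, and it does not implement Shibata's figure of merit. The name `levinson_durbin` is hypothetical.

```python
def levinson_durbin(acov, order):
    """Solve the Yule-Walker equations for AR(1) up to AR(order).

    acov[k] is the autocovariance at lag k. Returns a list with one
    (coefficients, one-step prediction error variance) pair per order.
    """
    fits = []
    phi = []        # coefficients of the previous order
    err = acov[0]   # prediction error variance of the order-0 model
    for k in range(1, order + 1):
        # Reflection coefficient (partial autocorrelation at lag k).
        acc = acov[k] - sum(phi[j] * acov[k - 1 - j] for j in range(k - 1))
        refl = acc / err
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - refl * phi[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl ** 2
        fits.append((phi, err))
    return fits

# Assumed example: MA(1) with theta = 1, so c0 = 2, c1 = 1, c_k = 0 for k >= 2.
max_order = 5
acov = [2.0, 1.0] + [0.0] * (max_order - 1)
fits = levinson_durbin(acov, max_order)

for p, (phi, err) in enumerate(fits, start=1):
    print(p, [round(c, 4) for c in phi], round(err, 4))
```

For this assumed process the prediction error variance keeps shrinking as the order grows but approaches the innovation variance of 1 only slowly, which is why the choice of approximation order matters for non-invertible processes.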
This research aims to analyze and simulate real biochemical test data to uncover the relationships among the tests and how each of them affects the others. The data were acquired from a private Iraqi biochemical laboratory. However, these data have many dimensions, a high rate of null values, and a large number of patients. Several experiments were applied to these data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques such as Linear Discriminant Analysis (LDA), Classification and Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB
The most common thing a person does is speak. With speech he loves and hates, affirms relationships, and passes from disbelief into faith. He marries with a word and separates with a word. With a kind word he reaches the heights of heaven and gains the pleasure of God, and many a word that a servant utters earns him God's pleasure or casts him on his face into the Fire. Emotions are inflamed and nations are roused with a word, and relations between states, and wars, turn on a word.
What comes out of a person’s mouth is an interpreter that expresses what is stored in his conscience and reveals his inner self, for it is evidence of
This paper aims to improve the voltage profile at all weak buses of the power system in the Kurdistan Region using the Static Synchronous Compensator (STATCOM). The system was studied with Power System Simulator for Engineering (PSS/E) software version 33.0, applying the Newton-Raphson (NR) method. All bus voltages were recorded and compared with the Kurdistan Region grid index (0.95 ≤ V ≤ 1.05), and the power system was simulated to find the optimal size and suitable location of the STATCOM for improving bus voltage at the weakest buses. The results show that the Soran and New Koya substations are the best locations for adding STATCOM, with sizes of 20 MVAR and 40 MVAR. After adding STATCOM with the sizes [20MVAR and 40MV
This research aims to clarify the importance of an accounting information system that uses artificial intelligence to detect earnings manipulation. The research problem stems from the widespread manipulation of earnings in economic entities, especially at the local level, exacerbated by the high rates of financial and administrative corruption in Iraq due to fraudulent accounting practices. Since earnings manipulation involves intentional fraudulent acts, preventive measures are needed to detect and deter such practices. The main hypothesis of the research assumes that an accounting information system based on artificial intelligence cannot effectively detect the manipulation of profits in Iraqi economic entities. The researche
The aim of the current research is to reveal the effect of using brain-based learning theory strategies on the achievement of Art Education students in the subject of Teaching Methods. An experimental design with two independent and equal groups, experimental and control, was used. The research sample totaled (60) male and female students: (30) in the experimental group and (30) in the control group. The researcher prepared the research tool, a cognitive achievement test consisting of (20) questions that was shown to be valid and reliable, and the experiment lasted (6) weeks