Advances in digital technology and the World Wide Web have led to a growing number of digital documents used for purposes such as publishing and digital libraries. This growth raises the need for effective techniques to support text search and retrieval. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning, and its accuracy depends heavily on the chosen text representation method. Traditional methods model documents as bags of words weighted by term frequency-inverse document frequency (TFIDF). This approach ignores the relationships between words and their meanings in the document, so the sparsity and semantic problems prevalent in textual documents remain unresolved. In this study, these problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used. The text documents undergo pre-processing and syntactic parsing to identify sentence structure, and the semantics of words are then modeled using a dependency graph. The resulting dependency graph is used for cluster analysis with the K-means clustering technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, i.e., TFIDF and ontology-based text representation. The results show that the dependency graph outperforms both TFIDF and ontology-based text representation, indicating that the proposed text representation method leads to more accurate document clustering.
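The limitation of the bag-of-words baseline described above can be illustrated with a small sketch. It is not the study's implementation: the function name `tfidf_vectors`, the whitespace tokenizer, and the toy sentences are assumptions made for illustration, and the weighting is the plain tf × log(N/df) variant of TFIDF.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df).

    Hypothetical helper for illustration; tokenization is a plain whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

# Toy corpus (assumed): the first two sentences mean different things
# but contain exactly the same words.
docs = ["dog bites man", "man bites dog", "stocks fall sharply"]
vecs = tfidf_vectors(docs)
same_representation = vecs[0] == vecs[1]
```

Here `same_representation` is true: word order, and therefore who does what to whom, is lost in the bag-of-words vectors. This is the kind of relationship a dependency-based representation is meant to retain.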
This research concludes that the function-based responsibility accounting system has harmful side effects that prevent it from achieving its control objective, namely goal congruence. These side effects stem from its unintegrated measures, its focus on measurable behaviors at the expense of behaviors that are hard to measure, and its dependence on standard operating procedures.
In addition, the system's hypotheses and measures were designed to fit the previous business environment, not the current one.
The research also concludes that the suggested model, activity-based responsibility accounting, is designed to get rid of the harmful side effects of functi
This paper proposes a new approach, Clustering Ultrasound images using the Hybrid Filter (CUHF), to determine the gender of the fetus in the early stages. The possible advantage of CUHF is that a better result can be achieved when fuzzy c-means (FCM) returns incorrect clusters. The proposed approach is conducted in two steps. First, a preprocessing step reduces the noise present in ultrasound images by applying the following filters: Local Binary Pattern (LBP); median; median and discrete wavelet transform (DWT); median, DWT and LBP; and median and Laplacian (ML). Second, FCM is applied to cluster the images resulting from the first step. Amongst those filters, Median & Lap
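As a rough illustration of the noise-reduction step, the following sketch applies a median filter to a tiny grayscale image stored as nested lists. It covers only the median component of the hybrid filters named above; the function name `median_filter`, the border handling (windows clipped at the image edge), and the sample pixel values are assumptions, not details taken from the paper.

```python
from statistics import median

def median_filter(image, radius=1):
    """Median-filter a 2D list of intensities with a (2*radius+1)^2 window.

    Windows are clipped at the borders (an assumed convention).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, staying inside the image bounds.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out

# A flat patch with one speckle-like outlier in the centre (made-up values).
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
clean = median_filter(noisy)
```

In this toy patch the isolated bright pixel is replaced by the value of its neighbours, which is why a median stage is a common first step before clustering pixel intensities.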
Compounds possessing a thiol group are among the most important compounds with active hydrogen. Biphenyl-4,4-dithiol is a good example, widely used in the preparation of Mannich bases. A variety of new acetylenic Mannich bases have been synthesized, and all proposed structures were supported by FTIR, 1H-NMR, 13C-NMR, elemental analysis, and microbial study.
Nowadays, internet security is a critical concern, and one of the most difficult research issues in network security is intrusion detection, the fight against external threats. Intrusion detection is a novel method of securing computers and data networks that are already in use. To boost the efficacy of intrusion detection systems, machine learning and deep learning are widely deployed. While existing work on intrusion detection systems based on data mining and machine learning is effective, it detects intrusions by training static batch classifiers without considering the time-varying features of a regular data stream. Real-world problems, on the other hand, rarely fit into models that have such constraints. Furthermor
Social media is known as a detector platform used to measure the activities of users in the real world. However, the huge and unfiltered feed of messages posted on social media triggers social warnings, particularly when these messages contain hate speech towards a specific individual or community. The negative effect of these messages on individuals or society at large is of great concern to governments and non-governmental organizations. Word clouds provide a simple and efficient means of visually conveying the most common words in text documents. This research aims to develop a word cloud model based on hateful words in online social media environments such as Google News. Several steps are involved including data acq
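A word cloud is driven by term frequencies, so the counting stage can be sketched as below. This is a minimal sketch under stated assumptions: the function name `lexicon_frequencies`, the regular-expression tokenizer, and the placeholder lexicon entries (`flagged1`, `flagged2`) are all hypothetical and stand in for whatever lexicon and pipeline the study actually uses.

```python
import re
from collections import Counter

def lexicon_frequencies(texts, lexicon):
    """Count how often each lexicon term occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into simple alphanumeric tokens.
        for token in re.findall(r"[a-z0-9']+", text.lower()):
            if token in lexicon:
                counts[token] += 1
    return counts

# Placeholder lexicon and posts (assumed data, not from the study).
lexicon = {"flagged1", "flagged2"}
posts = [
    "flagged1 news today, flagged1 again",
    "nothing to see here",
    "Flagged2 and flagged1",
]
freqs = lexicon_frequencies(posts, lexicon)
top_terms = freqs.most_common(2)
```

The resulting counts can be handed to any word cloud renderer, which scales each term's font size by its frequency.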
Despite the rapid development of reverse engineering techniques, and although 3D laser scanners are the modern technology for digitizing 3D objects, this process can be troubled by environmental noise and the limitations of the scanners used. Therefore, this paper proposes a data pre-processing algorithm that obtains the necessary geometric features and mathematical representation of a scanned object from its point cloud, acquired with a 3D laser scanner (Matter and Form), by isolating the noisy points. The proposed algorithm is based on continuous calculation of the chord angle between each adjacent pair of points in the point cloud. A MATLAB program has been built t
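One way to read the chord-angle idea is sketched below in Python rather than MATLAB. It is an interpretation, not the paper's algorithm: the points are assumed to be an ordered 2D profile, the "chord angle" is taken as the turn angle between successive chords, and the 60-degree threshold, the function names, and the sample scan are all assumptions.

```python
import math

def chord_turn_angles(points):
    """Turn angle in degrees between successive chords of an ordered 2D polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a1 = math.atan2(y1 - y0, x1 - x0)  # direction of the incoming chord
        a2 = math.atan2(y2 - y1, x2 - x1)  # direction of the outgoing chord
        # Wrap the difference into [-180, 180) before taking the magnitude.
        diff = (math.degrees(a2 - a1) + 180.0) % 360.0 - 180.0
        angles.append(abs(diff))
    return angles

def flag_noisy(points, threshold_deg=60.0):
    """Indices of interior points whose turn angle exceeds the assumed threshold."""
    return [
        i + 1
        for i, angle in enumerate(chord_turn_angles(points))
        if angle > threshold_deg
    ]

# A straight scan line with one spike at index 3 (made-up coordinates).
scan = [(0, 0), (1, 0), (2, 0), (3, 2), (4, 0), (5, 0), (6, 0)]
angles = chord_turn_angles(scan)
noisy = flag_noisy(scan)
```

Note that a single spike also produces sharp turns at its two neighbours, so a thresholding rule like this one flags a small run of indices around the outlier; the spike itself has the largest angle.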
This paper focuses on orthogonal function approximation technique (FAT)-based adaptive backstepping control of a geared DC motor coupled with a rotational mechanical component. It is assumed that all parameters of the actuator are unknown, including the torque-current constant (i.e., the unknown input coefficient), and hence a control system with three motor control modes is proposed: 1) motor torque control mode, 2) motor current control mode, and 3) motor voltage control mode. The proposed control algorithm is a powerful tool for controlling a dynamic system with an unknown input coefficient. Each uncertain parameter/term is represented by a linear combination of weighting and orthogonal basis function vectors. Chebyshev polynomial is used
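The representation of an uncertain term as a weighted sum of orthogonal basis functions can be sketched as follows. This shows only the approximation step with fixed weights; in an adaptive controller the weights would be updated online by an adaptation law, which is not shown. The function names `chebyshev_basis` and `fat_estimate` and the example weights are assumptions for illustration.

```python
def chebyshev_basis(x, order):
    """Return [T_0(x), ..., T_order(x)] for x in [-1, 1].

    Uses the recurrence T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x).
    """
    basis = [1.0, x]
    for _ in range(2, order + 1):
        basis.append(2.0 * x * basis[-1] - basis[-2])
    return basis[: order + 1]

def fat_estimate(weights, x):
    """Evaluate w^T z(x): a linear combination of weights and basis functions."""
    basis = chebyshev_basis(x, len(weights) - 1)
    return sum(w * z for w, z in zip(weights, basis))

# Example with assumed weights: x^2 = 0.5*T_0(x) + 0.5*T_2(x) exactly,
# because T_2(x) = 2x^2 - 1.
weights = [0.5, 0.0, 0.5]
value = fat_estimate(weights, 0.3)
```

With these weights the estimate reproduces x² on [-1, 1], which shows how a short weight vector can stand in for an unknown smooth term once suitable weights have been found.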