Advances in digital technology and the World Wide Web have led to an increase in digital documents used for purposes such as publishing and digital libraries. This growth has raised awareness of the need for effective techniques to support the search and retrieval of text. One of the most needed tasks is clustering, which automatically groups documents into meaningful categories. Clustering is an important task in data mining and machine learning. Its accuracy depends heavily on the choice of text representation method. Traditional methods model documents as bags of words using term frequency-inverse document frequency (TF-IDF). This representation ignores the relationships between words and their meanings in the document. As a result, the sparsity and semantic problems that are prevalent in textual documents remain unresolved. In this study, the sparsity and semantic problems are reduced by proposing a graph-based text representation method, namely the dependency graph, with the aim of improving the accuracy of document clustering. The dependency graph representation is built by combining syntactic and semantic analysis. A sample of the 20 Newsgroups dataset was used in this study. The text documents undergo pre-processing and syntactic parsing to identify sentence structure. The semantics of words are then modeled using a dependency graph. The resulting dependency graph is used in cluster analysis, which was performed with the K-means technique. The dependency-graph-based clustering results were compared with those of popular text representation methods, namely TF-IDF and ontology-based text representation. The results show that the dependency graph outperforms both TF-IDF and ontology-based text representation. The findings indicate that the proposed text representation method leads to more accurate document clustering results.
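The limitation of the bag-of-words baseline described above can be illustrated with a minimal sketch. It assumes whitespace tokenization and the plain weighting tf × log(N/df); the function name `tfidf_vectors` and the sample sentences are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    # Assumption: lowercase + whitespace split stands in for real pre-processing.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    "dog bites man",
    "man bites dog",      # same words, opposite meaning
    "stock prices fall",
]
vecs = tfidf_vectors(docs)
# Word order is discarded, so the first two documents get the same vector.
same_representation = vecs[0] == vecs[1]
```

Because the first two sentences share the same word counts, their TF-IDF vectors are identical even though their meanings differ. A representation that records relations between words can tell such documents apart.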
In this paper, we introduce a method for recognizing printed Arabic text. Recognition of printed text is very important in information technology applications. Arabic belongs to a group of languages with connected characters, along with Urdu, Kurdish, Persian, and old Turkish ("Ottoman"). Connected letters are difficult to recognize because each letter takes several forms: one shape at the beginning of a word, another in the middle, and another at the end. In languages whose characters are not connected, by contrast, a letter has a single shape wherever it appears in the word, and ready-made programs for such text have been in use for a long time.
The article considers a creolized text as a means of modern communication and describes its key verbal and visual components. It shows the relationship between the concepts of polycode text and creolized text, names the universal basic features of an image, and distinguishes kinds of creolized texts. It demonstrates that an effective means of attracting the addressee's attention is the use of expressive font features, which are divided into two groups: topographics (mechanisms for varying the areal syntagmatics of a text) and supragraphics (changes of typeface).
The care and attention given to structure in the 1960s replaced the sign, and if structure was once the pampered lady of research and studies, the sign has since become a pampered lady too. Yet the relationship between structure and sign was not one of rupture but one of integration: its themes are those of structural analysis, and these are intellectual themes that cannot be bypassed in contemporary research, especially since semiotics emerged from the linguistic inflection.
We have tried to distinguish between text and discourse, which is a daunting task: whenever the difference between them seems clear, we come back to wonder whether the text is the same as discourse, and is
Sensitive and important data have increased rapidly in recent decades, owing to the tremendous upgrading of networking infrastructure and communications. Securing this data becomes more necessary as its volume grows, and different cipher techniques and methods are used to ensure the goals of security: integrity, confidentiality, and availability. This paper presents a proposed hybrid text cryptography method that encrypts sensitive data using different encryption algorithms: Caesar, Vigenère, affine, and multiplicative. The hybrid method aims to make the encryption process more secure and effective. It depends on a circular queue. Using circ
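The four classical ciphers named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a 26-letter A-Z alphabet, non-letters passed through unchanged, and hypothetical function names. How the paper combines the ciphers through its circular queue is cut off in the abstract, so the sketch keeps them separate.

```python
from math import gcd

ALPHABET_SIZE = 26  # assumption: uppercase Latin letters only


def _map_letters(text, transform):
    """Apply transform(letter_index, letter_position) to each A-Z letter."""
    out = []
    pos = 0  # counts letters only, so a key stream skips spaces and punctuation
    for ch in text.upper():
        if "A" <= ch <= "Z":
            idx = transform(ord(ch) - ord("A"), pos) % ALPHABET_SIZE
            out.append(chr(idx + ord("A")))
            pos += 1
        else:
            out.append(ch)
    return "".join(out)


def caesar(text, shift):
    # E(x) = (x + shift) mod 26; a negative shift decrypts.
    return _map_letters(text, lambda x, _: x + shift)


def vigenere(text, key):
    # E(x_i) = (x_i + k_(i mod len(key))) mod 26
    shifts = [ord(k) - ord("A") for k in key.upper()]
    return _map_letters(text, lambda x, i: x + shifts[i % len(shifts)])


def multiplicative(text, a):
    # E(x) = (a * x) mod 26; a must be coprime with 26 to be invertible.
    if gcd(a, ALPHABET_SIZE) != 1:
        raise ValueError("multiplier must be coprime with 26")
    return _map_letters(text, lambda x, _: a * x)


def affine(text, a, b):
    # E(x) = (a * x + b) mod 26; same coprimality requirement on a.
    if gcd(a, ALPHABET_SIZE) != 1:
        raise ValueError("multiplier must be coprime with 26")
    return _map_letters(text, lambda x, _: a * x + b)
```

For example, `caesar("HELLO", 3)` gives `"KHOOR"`, and `caesar("KHOOR", -3)` restores the plaintext.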
In this paper, a method for hiding cipher text in an image file is introduced. The proposed method hides the cipher text message in the frequency domain of the image. The method consists of two phases: an embedding phase and an extraction phase. In the embedding phase, the image is transformed from the spatial domain to the frequency domain using the discrete wavelet decomposition technique (Haar). The text message is encrypted using the RSA algorithm; then the Least Significant Bit (LSB) algorithm is used to hide the secret message in the high-frequency components. The proposed method was tested on different images and succeeded in hiding information according to the Peak Signal-to-Noise Ratio (PSNR) measure of the original ima
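The Haar decomposition step mentioned in this abstract can be illustrated with a one-level, one-dimensional sketch. It assumes the orthonormal form of the transform and uses made-up pixel values and hypothetical function names. For an image, the same step is usually applied along rows and then along columns. The abstract does not give the embedding details, so none are shown.

```python
import math


def haar_forward(signal):
    """One-level 1-D Haar transform: returns (low, high) coefficient lists."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Low band: scaled pairwise sums (smooth content).
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # High band: scaled pairwise differences (edges and fine detail).
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Rebuild the signal from its low and high coefficients."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(low, high):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


row = [52, 54, 200, 60, 61, 61, 10, 12]  # made-up pixel intensities
low, high = haar_forward(row)
restored = haar_inverse(low, high)
```

The high band is zero where neighbouring pixels are equal and large where they differ sharply, which is why it is described as carrying the high-frequency content.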
This article investigates the presentation of the Iraq wars in literature and media. The first section examines the case of returnees from the war: their experience, their trauma, and the final presentation of that experience. The article also investigates how trauma and fear are depicted to create an optimized image and state of fear that could in turn show Iraqi society as a traumatized society. Critics such as Suzie Grogan believe that the concept of trauma can expand to affect whole societies, rather than a single individual, after exposure to the trauma of involvement in wars and other major conflicts. This is reflected in Iraq, a country that has been subjected to six comprehensive conflicts in its recent history, i.e. in less than half a century; th
The research, entitled (The Outputs of the Written Text in Conceptual Art), is a comparative analytical study of the concept of conceptual art trends (land art - body art - art - language).
The study consisted of four chapters. The first chapter dealt with the theoretical framework, presenting the research problem, which raised the following question: What is the role of the written text in the transformations of the conceptual arts?
The first chapter also included the importance of the research and the research objectives, which seek to conduct comparative research on the written text within the trends of conceptual art as a moving phenomenon in art, and to reveal the variable written text in the
Lighting is a very important element in color processing. Many imaging systems (digital cameras) contain a lighting unit, and the light from these units is not strong but is useful when the ambient light is low. Under different lighting intensities, image quality does not remain good enough, and the image may become dark or slightly exposed to light, which reduces the detail in the image, since contrast or lightness cannot be modified to compensate for the decrease without losing light and dark details. In this research we therefore studied the variation of colored texts written on a board under non-uniform (low) lighting conditions and at different distances. As the diversity of these texts written on the board a
In this paper, a hybrid system was designed for securing transmitted or stored text messages (Arabic and English) by embedding the message in a color image as a cover file, using the LSB (Least Significant Bit) algorithm in a dispersed way and employing the Hill encryption algorithm to encrypt the message before it is hidden. A 3x3 key was used for encryption, with its inverse for decryption. The system achieves good PSNR values (75-86), which vary according to the length of the message and the image resolution.
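A minimal sketch of LSB embedding and the PSNR measure referred to in this abstract is given below. It is not the paper's system: bits are written to consecutive samples rather than in a dispersed way, the Hill encryption step is omitted, the cover is a synthetic list of 8-bit values, and all function names are hypothetical.

```python
import math


def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bit of successive samples."""
    # Unpack each byte into 8 bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back from the LSBs of the first 8 * n_bytes samples."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for pixel in pixels[8 * b: 8 * b + 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


def psnr(original, modified, peak=255):
    """Peak Signal-to-Noise Ratio in dB for two equal-length sample lists."""
    mse = sum((o - m) ** 2 for o, m in zip(original, modified)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


cover = [(37 * i + 11) % 256 for i in range(4096)]  # synthetic 8-bit samples
secret = b"hello"  # assumption: already-encrypted bytes would go here
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
quality_db = psnr(cover, stego)
```

Each sample changes by at most 1, so the mean squared error stays small and the PSNR stays high. A longer message or a smaller cover raises the error and lowers the PSNR, in line with the dependence on message length and image resolution noted in the abstract.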