Spelling correction is considered a challenging task for resource-scarce languages. Arabic is one such language: it lacks a large spelling correction dataset, so datasets injected with artificial errors are used to overcome this problem. In this paper, we train a Text-to-Text Transfer Transformer (T5) model on artificial errors to correct Arabic soft spelling mistakes. Our T5 model corrects 97.8% of the artificial errors injected into the test set. It also achieves a character error rate (CER) of 0.77% on a set that contains real soft spelling mistakes. We achieved these results using a 4-layer T5 model trained with a 90% error injection rate and a maximum sequence length of 300 characters.
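For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the total character-level edit distance between the corrected outputs and the references, divided by the total number of reference characters. The abstract does not state the exact evaluation procedure, so the function names `levenshtein` and `cer`, and the corpus-level averaging, are assumptions made for illustration.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn hyp into ref (standard dynamic-programming formulation)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # character missing from hyp
                            curr[j - 1] + 1,      # extra character in hyp
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters.
    This averaging choice is an assumption, not taken from the paper."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Illustrative soft spelling mistake: a hamza dropped from the alef in one word.
references = ["ذهبت إلى المدرسة"]
hypotheses = ["ذهبت الى المدرسة"]
print(f"CER = {cer(references, hypotheses):.4f}")
```

Multiplying the returned value by 100 gives a percentage comparable in form to the 0.77% figure reported above.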
Handwritten text recognition remains a major challenge for researchers. Several approaches to this challenge have been attempted in recent years, mostly concentrating on English pre-printed or handwritten characters. Consequently, research on Arabic handwritten text recognition is needed. Arabic handwriting presents unique technical difficulties because it is cursive, is written from right to left, and its letters change shape depending on whether they appear in initial, medial, final, or isolated position in a word. In this study, an Arabic text recognition system is designed and developed to recognize images of Arabic text and characters. The proposed model gets a single l...
This research is intended to highlight the uses of political content on foreign Arabic-speaking websites, such as “CNN” and “Euro News”. The research problem stems from the main question: What is the nature of the websites' use of the political content provided through them? From it follows a set of sub-questions that define the aspects of the research, which aims to achieve a set of objectives, including identifying the topics covered by the political content provided through the sample sites during the time period for analysis. The study uses descriptive research, based on the researcher's discovery of the phenomenon, describing it accurately, and defining the relations between its components.
The research conducted the des...
Traditionally, style is defined as the expressive, emotive, or aesthetic emphasis added linguistically to the discourse while its meaning remains the same. In the current study, however, style is defined as the linguistic choice that language users can make for specific purposes.
This study thus aims to analyze Arabic and English political speeches to find out whether there are differences of style between English and Arabic and whether the choices language users make can reveal any traits of their psychological state.
To fulfill the above aims, the study hypothesizes that English and Arabic speeches can be analyzed stylistically and that there are stylistic difference...
Sentiment analysis refers to the task of identifying the polarity (positive or negative) of a particular text that expresses an opinion. The use of the Arabic language has expanded dramatically in the last decade, especially with the emergence of social websites (e.g., Twitter and Facebook). Several studies have addressed sentiment analysis for Arabic using various techniques. According to the literature, the most effective techniques are machine learning techniques, owing to their ability to build a trained model. Yet there are still issues facing Arabic sentiment analysis using machine learning techniques. Such issues are related to employing robust features that have the ability to discrimina...
Loanwords are words transferred from one language to another that become an essential part of the borrowing language. Loanwords pass from the source language to the recipient language for many reasons. Detecting them is a complicated task because there are no standard specifications for transferring words between languages, and hence accuracy is low. This work tries to improve the accuracy of loanword detection, using Turkish and Arabic as a case study. The proposed system finds all possible loanwords using any set of characters, whether alphabetically or randomly arranged. It then processes the distortion in pronunciation and solves the problem of the missing lette...
Deep learning convolutional neural networks have been widely used to recognize or classify voice. Various techniques have been used together with convolutional neural networks to prepare voice data before the training process in developing the classification model. However, not all models can produce good classification accuracy, as there are many types of voice or speech. Arabic alphabet pronunciation is one such type, and accurate pronunciation is required when learning to read the Qur’an. Thus, processing the pronunciation and training on the processed data require a specific approach. To overcome this issue, a method based on padding and a deep learning convolutional neural network is proposed to...
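The padding step named in the Arabic alphabet pronunciation abstract above can be illustrated with a small sketch. The abstract is truncated and does not say how or where padding is applied, so everything here is an assumption: trailing zero-padding, clips represented as plain lists of samples, and the hypothetical helper names `pad_to_length` and `pad_batch`. The idea shown is only that variable-length recordings are brought to a common length so they can be stacked into one fixed-shape input for a convolutional network.

```python
def pad_to_length(samples, target_len, pad_value=0.0):
    """Right-pad a 1-D sequence of audio samples to target_len.
    Sequences longer than target_len are truncated (an assumption)."""
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    clipped = list(samples[:target_len])
    return clipped + [pad_value] * (target_len - len(clipped))


def pad_batch(clips, pad_value=0.0):
    """Pad every clip to the length of the longest clip in the batch."""
    target_len = max(len(clip) for clip in clips)
    return [pad_to_length(clip, target_len, pad_value) for clip in clips]


# Three hypothetical recordings of different durations.
clips = [[0.1, 0.2, 0.3], [0.5], [0.4, 0.6]]
batch = pad_batch(clips)
print([len(clip) for clip in batch])
```

After padding, every clip in `batch` has the same length, which is the property a fixed-input network requires.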