Arabic text categorization for pattern recognitions is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writer. The categorization is accomplished stage-wise. Firstly, these document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from sectioned portions. Thirdly, F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measures and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generation step are obtained from multiple runs of individual clustering methods for each distance measures. The best results are achieved when intensity, lines slope and their
A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant task in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover the recently explored approaches to TC problem-solving utilizing FS, such as optimization techniques.
... Show MoreThe present study aims to investigate the various request constructions used in Classical Arabic and Modern Arabic language by identifying the differences in their usage in these two different genres. Also, the study attempts to trace the cases of felicitous and infelicitous requests in the Arabic language. Methodologically, the current study employs a web-based corpus tool (Sketch Engine) to analyze different corpora: the first one is Classical Arabic, represented by King Saud University Corpus of Classical Arabic, while the second is The Arabic Web Corpus “arTenTen” representing Modern Arabic. To do so, the study relies on felicity conditions to qualitatively interpret the quantitative data, i.e., following a mixed mode method
... Show MoreIn the field of data security, the critical challenge of preserving sensitive information during its transmission through public channels takes centre stage. Steganography, a method employed to conceal data within various carrier objects such as text, can be proposed to address these security challenges. Text, owing to its extensive usage and constrained bandwidth, stands out as an optimal medium for this purpose. Despite the richness of the Arabic language in its linguistic features, only a small number of studies have explored Arabic text steganography. Arabic text, characterized by its distinctive script and linguistic features, has gained notable attention as a promising domain for steganographic ventures. Arabic text steganography harn
... Show MoreAn approach is depended in the recent years to distinguish any author or writer from other by analyzing his writings or essays. This is done by analyzing the syllables of writings of an author. The syllable is composed of two letters; therefore the words of the writing are fragmented to syllables and extract the most frequency syllables to become trait of that author. The research work depend on analyzed the frequency syllables in two cases, the first, when there is a space between the words, the second, when these spaces are ignored. The results is obtained from a program which scan the syllables in the text file, the performance is best in the first case since the sequence of the selected syllables is higher than the same syllables in
... Show MoreGender classification is a critical task in computer vision. This task holds substantial importance in various domains, including surveillance, marketing, and human-computer interaction. In this work, the face gender classification model proposed consists of three main phases: the first phase involves applying the Viola-Jones algorithm to detect facial images, which includes four steps: 1) Haar-like features, 2) Integral Image, 3) Adaboost Learning, and 4) Cascade Classifier. In the second phase, four pre-processing operations are employed, namely cropping, resizing, converting the image from(RGB) Color Space to (LAB) color space, and enhancing the images using (HE, CLAHE). The final phase involves utilizing Transfer lea
... Show MoreCompression for color image is now necessary for transmission and storage in the data bases since the color gives a pleasing nature and natural for any object, so three composite techniques based color image compression is implemented to achieve image with high compression, no loss in original image, better performance and good image quality. These techniques are composite stationary wavelet technique (S), composite wavelet technique (W) and composite multi-wavelet technique (M). For the high energy sub-band of the 3 rd level of each composite transform in each composite technique, the compression parameters are calculated. The best composite transform among the 27 types is the three levels of multi-wavelet transform (MMM) in M technique wh
... Show MoreCompression for color image is now necessary for transmission and storage in the data bases since the color gives a pleasing nature and natural for any object, so three composite techniques based color image compression is implemented to achieve image with high compression, no loss in original image, better performance and good image quality. These techniques are composite stationary wavelet technique (S), composite wavelet technique (W) and composite multi-wavelet technique (M). For the high energy sub-band of the 3rd level of each composite transform in each composite technique, the compression parameters are calculated. The best composite transform among the 27 types is the three levels of multi-wavelet
... Show More