Arabic text categorization for pattern recognitions is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writer. The categorization is accomplished stage-wise. Firstly, these document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from sectioned portions. Thirdly, F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measures and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generation step are obtained from multiple runs of individual clustering methods for each distance measures. The best results are achieved when intensity, lines slope and their
A substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant task in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover the recently explored approaches to TC problem-solving utilizing FS, such as optimization techniques.
... Show MoreIntrusion detection systems (IDS) are useful tools that help security administrators in the developing task to secure the network and alert in any possible harmful event. IDS can be classified either as misuse or anomaly, depending on the detection methodology. Where Misuse IDS can recognize the known attack based on their signatures, the main disadvantage of these systems is that they cannot detect new attacks. At the same time, the anomaly IDS depends on normal behaviour, where the main advantage of this system is its ability to discover new attacks. On the other hand, the main drawback of anomaly IDS is high false alarm rate results. Therefore, a hybrid IDS is a combination of misuse and anomaly and acts as a solution to overcome the dis
... Show MoreIn the field of data security, the critical challenge of preserving sensitive information during its transmission through public channels takes centre stage. Steganography, a method employed to conceal data within various carrier objects such as text, can be proposed to address these security challenges. Text, owing to its extensive usage and constrained bandwidth, stands out as an optimal medium for this purpose. Despite the richness of the Arabic language in its linguistic features, only a small number of studies have explored Arabic text steganography. Arabic text, characterized by its distinctive script and linguistic features, has gained notable attention as a promising domain for steganographic ventures. Arabic text steganography harn
... Show MoreThe present study aims to investigate the various request constructions used in Classical Arabic and Modern Arabic language by identifying the differences in their usage in these two different genres. Also, the study attempts to trace the cases of felicitous and infelicitous requests in the Arabic language. Methodologically, the current study employs a web-based corpus tool (Sketch Engine) to analyze different corpora: the first one is Classical Arabic, represented by King Saud University Corpus of Classical Arabic, while the second is The Arabic Web Corpus “arTenTen” representing Modern Arabic. To do so, the study relies on felicity conditions to qualitatively interpret the quantitative data, i.e., following a mixed mode method
... Show MoreGender classification is a critical task in computer vision. This task holds substantial importance in various domains, including surveillance, marketing, and human-computer interaction. In this work, the face gender classification model proposed consists of three main phases: the first phase involves applying the Viola-Jones algorithm to detect facial images, which includes four steps: 1) Haar-like features, 2) Integral Image, 3) Adaboost Learning, and 4) Cascade Classifier. In the second phase, four pre-processing operations are employed, namely cropping, resizing, converting the image from(RGB) Color Space to (LAB) color space, and enhancing the images using (HE, CLAHE). The final phase involves utilizing Transfer lea
... Show MoreAn approach is depended in the recent years to distinguish any author or writer from other by analyzing his writings or essays. This is done by analyzing the syllables of writings of an author. The syllable is composed of two letters; therefore the words of the writing are fragmented to syllables and extract the most frequency syllables to become trait of that author. The research work depend on analyzed the frequency syllables in two cases, the first, when there is a space between the words, the second, when these spaces are ignored. The results is obtained from a program which scan the syllables in the text file, the performance is best in the first case since the sequence of the selected syllables is higher than the same syllables in
... Show More