Arabic text categorization for pattern recognitions is challenging. We propose for the first time a novel holistic method based on clustering for classifying Arabic writer. The categorization is accomplished stage-wise. Firstly, these document images are sectioned into lines, words, and characters. Secondly, their structural and statistical features are obtained from sectioned portions. Thirdly, F-Measure is used to evaluate the performance of the extracted features and their combination in different linkage methods for each distance measures and different numbers of groups. Finally, experiments are conducted on the standard KHATT dataset of Arabic handwritten text comprised of varying samples from 1000 writers. The results in the generation step are obtained from multiple runs of individual clustering methods for each distance measures. The best results are achieved when intensity, lines slope and their
Machine learning (ML) is a key component within the broader field of artificial intelligence (AI) that employs statistical methods to empower computers with the ability to learn and make decisions autonomously, without the need for explicit programming. It is founded on the concept that computers can acquire knowledge from data, identify patterns, and draw conclusions with minimal human intervention. The main categories of ML include supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning. Supervised learning involves training models using labelled datasets and comprises two primary forms: classification and regression. Regression is used for continuous output, while classification is employed
... Show MoreThe fetal heart rate (FHR) signal processing based on Artificial Neural Networks (ANN),Fuzzy Logic (FL) and frequency domain Discrete Wavelet Transform(DWT) were analysis in order to perform automatic analysis using personal computers. Cardiotocography (CTG) is a primary biophysical method of fetal monitoring. The assessment of the printed CTG traces was based on the visual analysis of patterns that describing the variability of fetal heart rate signal. Fetal heart rate data of pregnant women with pregnancy between 38 and 40 weeks of gestation were studied. The first stage in the system was to convert the cardiotocograghy (CTG) tracing in to digital series so that the system can be analyzed ,while the second stage ,the FHR time series was t
... Show MoreNowadays, people's expression on the Internet is no longer limited to text, especially with the rise of the short video boom, leading to the emergence of a large number of modal data such as text, pictures, audio, and video. Compared to single mode data ,the multi-modal data always contains massive information. The mining process of multi-modal information can help computers to better understand human emotional characteristics. However, because the multi-modal data show obvious dynamic time series features, it is necessary to solve the dynamic correlation problem within a single mode and between different modes in the same application scene during the fusion process. To solve this problem, in this paper, a feature extraction framework of
... Show MoreAdvances in digital technology and the World Wide Web has led to the increase of digital documents that are used for various purposes such as publishing and digital library. This phenomenon raises awareness for the requirement of effective techniques that can help during the search and retrieval of text. One of the most needed tasks is clustering, which categorizes documents automatically into meaningful groups. Clustering is an important task in data mining and machine learning. The accuracy of clustering depends tightly on the selection of the text representation method. Traditional methods of text representation model documents as bags of words using term-frequency index document frequency (TFIDF). This method ignores the relationship an
... Show MoreText categorization refers to the process of grouping text or documents into classes or categories according to their content. Text categorization process consists of three phases which are: preprocessing, feature extraction and classification. In comparison to the English language, just few studies have been done to categorize and classify the Arabic language. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and a comparison for researchers in the last five years based on the dataset, year, algorithms and the accuracy th
... Show MoreThe huge amount of documents in the internet led to the rapid need of text classification (TC). TC is used to organize these text documents. In this research paper, a new model is based on Extreme Machine learning (EML) is used. The proposed model consists of many phases including: preprocessing, feature extraction, Multiple Linear Regression (MLR) and ELM. The basic idea of the proposed model is built upon the calculation of feature weights by using MLR. These feature weights with the extracted features introduced as an input to the ELM that produced weighted Extreme Learning Machine (WELM). The results showed a great competence of the proposed WELM compared to the ELM.
A substantial matter to confidential messages' interchange through the internet is transmission of information safely. For example, digital products' consumers and producers are keen for knowing those products are genuine and must be distinguished from worthless products. Encryption's science can be defined as the technique to embed the data in an images file, audio or videos in a style which should be met the safety requirements. Steganography is a portion of data concealment science that aiming to be reached a coveted security scale in the interchange of private not clear commercial and military data. This research offers a novel technique for steganography based on hiding data inside the clusters that resulted from fuzzy clustering. T
... Show MoreWireless sensor applications are susceptible to energy constraints. Most of the energy is consumed in communication between wireless nodes. Clustering and data aggregation are the two widely used strategies for reducing energy usage and increasing the lifetime of wireless sensor networks. In target tracking applications, large amount of redundant data is produced regularly. Hence, deployment of effective data aggregation schemes is vital to eliminate data redundancy. This work aims to conduct a comparative study of various research approaches that employ clustering techniques for efficiently aggregating data in target tracking applications as selection of an appropriate clustering algorithm may reflect positive results in the data aggregati
... Show More