Text categorization refers to the process of grouping texts or documents into classes or categories according to their content. The process consists of three phases: preprocessing, feature extraction, and classification. Compared with English, only a few studies have addressed the categorization and classification of Arabic text. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of research from the last five years, organized by dataset, year, algorithms, and reported accuracy. Deep Learning (DL) and Machine Learning (ML) models were used to enhance text classification for the Arabic language. Remarks for future work are given in conclusion.
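The three phases named above (preprocessing, feature extraction, classification) can be illustrated with a minimal sketch. It is not taken from any of the surveyed studies: the toy sentences, the labels `sports` and `economy`, the diacritic-stripping step, and the choice of a bag-of-words naive Bayes classifier are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: preprocessing only strips tashkeel marks and tatweel.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")


def preprocess(text):
    """Phase 1: remove diacritics, then split into runs of Arabic letters."""
    return re.findall(r"[\u0621-\u064A]+", DIACRITICS.sub("", text))


def extract_features(tokens):
    """Phase 2: bag-of-words term counts."""
    return Counter(tokens)


class NaiveBayes:
    """Phase 3: multinomial naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter(labels)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(extract_features(preprocess(doc)))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        feats = extract_features(preprocess(doc))
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each observed word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word, n in feats.items():
                score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set (two sports and two economy sentences).
train_docs = [
    "فاز الفريق في المباراة",
    "سجل اللاعب هدفا في المباراة",
    "ارتفع سعر النفط في السوق",
    "انخفض سعر الدولار في البنك",
]
train_labels = ["sports", "sports", "economy", "economy"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("اللاعب فاز في المباراة"))
print(model.predict("سعر النفط"))
```

A realistic Arabic pipeline would normally add stemming or lemmatization and stop-word removal in the first phase; those steps are left out here to keep the sketch short.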
ST Alawi, NA Mustafa, Al-Mustansiriyah Journal of Science, 2013
Administrative procedures in various organizations produce numerous crucial records and data. These
records and data are also used in other processes like customer relationship management and accounting
operations. It is incredibly challenging to use and extract valuable and meaningful information from these data
and records because they are frequently enormous and continuously growing in size and complexity. Data
mining is the act of sorting through large data sets to find patterns and relationships that might aid in the data
analysis process of resolving business issues. Using data mining techniques, enterprises can forecast future
trends and make better business decisions. The Apriori algorithm has bee
Many authors have investigated the problem of the early visibility of the new crescent moon after conjunction and have proposed many criteria addressing this issue in the literature. This article presents a proposed criterion for early crescent moon sighting based on the performance of a deep-learning pattern-recognition artificial neural network (ANN). Moon-sighting datasets were collected from various sources and used to train the ANN. The new criterion relies on the crescent width and the arc of vision from the edge of the crescent's bright limb. The result of the criterion is a control value indicating the moon's visibility condition, which separates the datasets into four regions: invisible, telescope only, probably visible, and certai
A set of hydrotreating experiments is carried out on vacuum gas oil in a trickle bed reactor to study hydrodesulfurization and hydrodenitrogenation based on two model compounds, carbazole (a non-basic nitrogen compound) and acridine (a basic nitrogen compound), which are added at 0–200 ppm to the tested oil, and dibenzothiophene is used as a sulfur model compound at 3,000 ppm over commercial CoMo/Al2O3 and prepared PtMo/Al2O3. The impregnation method is used to prepare (0.5% Pt) PtMo/Al2O3. The basic sites are found to be very small, and the two catalysts exhibit good metal support interaction. In the absence of nitrogen compounds over the tested catalysts in the trickle bed reactor at temperatures of 523 to 573 K, liquid hourly space v
Anomaly detection is still a difficult task. To address this problem, we propose strengthening the DBSCAN algorithm by converting all data to the graph concept frame (CFG). As is well known, DBSCAN groups data points of the same kind into clusters, while points that fall outside every cluster are treated as noise or anomalies. DBSCAN can therefore detect abnormal points that lie farther than a set threshold from any cluster (extreme values). However, not all anomalies are of this kind, that is, unusual or far from a specific group; there is also a type of data that does not occur repeatedly but is still considered abnormal relative to the known group. The analysis showed DBSCAN using the
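The baseline behaviour described above, where DBSCAN labels points outside every dense region as noise, can be sketched as follows. This is a plain textbook-style DBSCAN and does not include the graph conversion the abstract proposes; the sample points and the values of `eps` and `min_pts` are assumptions chosen for illustration.

```python
import math
from collections import deque

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index, or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not yet visited"

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


# Hypothetical data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
anomalies = [p for p, label in zip(points, labels) if label == NOISE]
print(labels)
print(anomalies)
```

Only the isolated point is flagged here, which illustrates the limitation the abstract raises: a rare but nearby observation would be absorbed into a cluster rather than reported as an anomaly.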
This text is one of the confiscated cuneiform texts held by the Iraq Museum. It bears the museum number (235869) and measures (12.7 x 6 x 2.5 cm). It records incoming quantities of barley. The text is dated to the Ur III period (2112-2004 BC) and belongs to the third year of the reign of King Ibbi-Sin (2028-2004 BC). The main figure in this text is (Ba-a-ga, the fattener of livestock) from the city of Iri-Sagrig; it is compared with the published cuneiform texts belonging to his archive, which number (196) texts and cover his activities
In this paper, we investigate the automatic recognition of emotion in text. We perform experiments with a new method of classification based on the PPM character-based text compression scheme. These experiments involve both coarse-grained classification (whether a text is emotional or not) and also fine-grained classification such as recognising Ekman’s six basic emotions (Anger, Disgust, Fear, Happiness, Sadness, Surprise). Experimental results with three datasets show that the new method significantly outperforms the traditional word-based text classification methods. The results show that the PPM compression based classification method is able to distinguish between emotional and nonemotional text with high accuracy, between texts invo
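The general idea behind compression-based classification is to train one character model per class and assign a text to the class whose model encodes it in the fewest bits. The sketch below illustrates that idea only. It does not implement PPM: a fixed order-2 character model with add-one smoothing stands in for it, and the training sentences, the labels, and the constants `ORDER` and `ALPHABET_SIZE` are assumptions.

```python
import math
from collections import Counter, defaultdict

ORDER = 2            # context length in characters (assumption)
ALPHABET_SIZE = 256  # assumed symbol budget for add-one smoothing


class CharModel:
    """Fixed-order character model; a simplified stand-in for PPM."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, text):
        padded = " " * ORDER + text
        for i in range(ORDER, len(padded)):
            self.counts[padded[i - ORDER:i]][padded[i]] += 1

    def code_length(self, text):
        """Bits needed to encode text under this model."""
        padded = " " * ORDER + text
        bits = 0.0
        for i in range(ORDER, len(padded)):
            ctx = self.counts.get(padded[i - ORDER:i], Counter())
            p = (ctx[padded[i]] + 1) / (sum(ctx.values()) + ALPHABET_SIZE)
            bits -= math.log2(p)
        return bits


def classify(text, models):
    """Pick the class whose model gives the shortest code length."""
    return min(models, key=lambda label: models[label].code_length(text))


# Hypothetical toy training data for the coarse-grained task.
training = {
    "emotional": ["i am so happy and excited today",
                  "this makes me very angry and sad"],
    "non_emotional": ["the meeting starts at ten in room four",
                      "the report has twelve pages and two tables"],
}
models = {}
for label, texts in training.items():
    models[label] = CharModel()
    for t in texts:
        models[label].train(t)

print(classify("i feel so sad and angry", models))
print(classify("the table is in room two", models))
```

PPM differs from this sketch in that it blends several context orders through an escape mechanism instead of using a single fixed order, so real results would depend on that scheme and on far larger training texts.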
Autism is a lifelong developmental deficit that affects how people perceive the world and interact with each other. An estimated one in more than 100 people has autism. Autism affects almost four times as many boys as girls. The commonly used tools for analyzing autism datasets are fMRI, EEG, and, more recently, eye tracking. A preliminary study of patients' eye-tracking trajectories showed that a rudimentary statistical analysis (principal component analysis) provides interesting results on the statistical parameters studied, such as the time spent in a region of interest. Another study, applying tools from Euclidean and non-Euclidean geometry to patients' eye trajectories, also showed interesting results. In this