Text documents are unstructured and high-dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. Thus, this paper proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high-dimensional text classification. This technique can measure a feature’s importance in a high-dimensional text document. In addition, it aims to increase the efficiency of feature selection and hence obtain promising text classification accuracy. TF-IDF acts as a filter approach that measures the importance of features in the text documents at the first stage. SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets at the second stage. This research runs sets of experiments using text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features. After that, the pre-processed features are divided into training and testing datasets. Next, feature selection is implemented on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is then applied for feature ranking as the next feature selection step. Only top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique is able to achieve 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high-dimensional text documents.
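The first (filter) stage described above can be illustrated with a minimal sketch. It assumes the documents are already tokenised, uses the plain unsmoothed form tf × log(N/df), and ranks terms by their summed score; the abstract does not state which TF-IDF variant or ranking rule the authors used, so the function names `tfidf_scores` and `top_k_terms` and these choices are illustrative only. The second stage (SVM-RFE) is not covered here.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenised document.

    Assumption: tf = count / document length, idf = log(N / df), no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_k_terms(scores, k):
    """Filter stage: keep the k terms with the highest summed TF-IDF score."""
    totals = defaultdict(float)
    for doc_scores in scores:
        for term, value in doc_scores.items():
            totals[term] += value
    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(totals, key=lambda term: (-totals[term], term))
    return ranked[:k]


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
    print(top_k_terms(tfidf_scores(docs), 2))
```

Terms that occur in every document receive a score of zero under this variant, which is why such a filter tends to discard very common words before the more expensive recursive elimination step runs.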
Abstract
Digital repositories are considered one of the integrated collaborative educational environments that help every researcher interested in developing education and the educational process. The learning resources provided by the repositories are suitable for every researcher, so digital information can be stored and exchanged through the participation and cooperation of researchers, teachers, students, curricula experts, and other interested parties, who share their experiences by constantly updating that information in order to improve their performance in education. This reveals the importance of the role of educational digital institutions by providing and
Formation evaluation is a critical process in the petroleum industry that involves assessing the petrophysical properties and hydrocarbon potential of subsurface rock formations. This study focuses on evaluating the Mauddud Formation in the Bai Hassan oil field by analyzing data obtained from well logs and core samples. Four wells were chosen for this study (BH-102, BH-16, BH-86, and BH-93). The main objectives were to identify the lithology of the Mauddud Formation and to estimate key petrophysical properties such as shale volume, porosity, water saturation, and permeability. The Mauddud Formation primarily consists of limestone and dolomite, with some anhydrite present. It is classified as a clean for
This study aims to evaluate the Mishrif Formation in three wells within Noor Oilfield: No-1, No-2 and No-5. The study includes calculations of shale volume, porosity, and water saturation using the Archie method, measurement of the bulk volume of water (BVW) using a Buckles plot, and measurement of the movable and residual hydrocarbons. These calculations were carried out using Interactive Petrophysics (IP) version 3.5 software, with Petrel 2009 software used for structural map construction and correlation purposes. It was found that the Mishrif Formation in Noor Oilfield is not at irreducible water saturation, though it has good reservoir characteristics and hydrocarbon production, especially at the upper pa
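As a rough illustration of the Archie water saturation and BVW calculations mentioned above, the following sketch uses the standard Archie form Sw = ((a · Rw) / (φ^m · Rt))^(1/n) and BVW = φ · Sw. The default constants a = 1, m = 2, n = 2 and the clipping of Sw at 1.0 are assumptions made for the example; the abstract does not give the parameters used in the study, and the function names are hypothetical.

```python
def archie_sw(rw, rt, phi, a=1.0, m=2.0, n=2.0):
    """Water saturation (fraction) from Archie's equation.

    rw  : formation water resistivity (ohm-m)
    rt  : true formation resistivity (ohm-m)
    phi : porosity (fraction, 0 < phi <= 1)
    a, m, n : tortuosity factor, cementation and saturation exponents
              (defaults are assumed textbook values, not from the study)
    """
    if rw <= 0 or rt <= 0 or not 0 < phi <= 1:
        raise ValueError("rw and rt must be positive and 0 < phi <= 1")
    sw = ((a * rw) / (phi ** m * rt)) ** (1.0 / n)
    # Saturation cannot physically exceed 100 %, so clip the result.
    return min(sw, 1.0)


def bulk_volume_water(phi, sw):
    """Bulk volume of water: the fraction of rock volume occupied by water."""
    return phi * sw


if __name__ == "__main__":
    # Invented input values, for illustration only.
    sw = archie_sw(rw=0.04, rt=10.0, phi=0.2)
    print(sw, bulk_volume_water(0.2, sw))
```

On a Buckles plot, porosity is plotted against water saturation; points that follow a line of constant BVW are usually read as being at irreducible water saturation, while scattered points suggest otherwise.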
The effect of different cutting fluids on the surface roughness of a brass alloy workpiece during turning was investigated in this research. Tests were performed at different cutting speeds, while the other cutting parameters (feed rate and depth of cut) were held constant. The surface roughness of the machined parts was measured with an electronic surface roughness tester. The results show that the standard coolant gives the best surface roughness values for a fixed cutting speed, followed by sunflower oil, which has approximately the same effect, while an air stream used as a coolant gave unsatisfactory surface roughness results.
On the other hand, the best values of surface roughness were recorded for max
A feature extraction algorithm for handwritten Arabic words is proposed. The method applies a 4-level discrete wavelet transform (DWT) to a binary image, then slides a window over the wavelet space and computes the standard deviation for each window. The extracted features were classified with multiple Support Vector Machine (SVM) classifiers. The method was simulated on a proposed data set collected from different writers. The experimental results of the simulation show a 94.44% recognition rate.
The aim of this work is to evaluate the one-electron expectation values from the radial electronic density function D(r1) for different wave functions for the 2S state of the Be atom. The wave functions used were published in 1960, 1974, and 1993, respectively. Using the Hartree-Fock wave function as a Slater determinant, the partitioning technique was applied to analyze the open-shell system of the Be (1s²2s²) state. The Be atom was analyzed as six electron-pair wave functions, two of these for intra-shells (K, L) and the rest for inter-shells (KL). The results were obtained numerically using computer programs (Mathcad).
Objective: To assess the predictive value of Doppler imaging of the uterine artery in identifying early abnormal intrauterine pregnancy as compared to a normal intrauterine pregnancy. Subjects and methods: One hundred and twenty pregnant women at 6-12 weeks of gestation with a singleton pregnancy were included in this population-based case-control study. Thirty women with a missed miscarriage, 30 with hydatidiform mole, 30 with a blighted ovum, and 30 as a control group without risk factors underwent Doppler interrogation of the uterine arteries. Resistive index (RI), pulsatility index (PI), and the systolic/diastolic ratio (S/D) were measured for both sides. The t-test, or ANOVA test when appropriate, was
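For readers unfamiliar with the three Doppler indices named above, the following sketch computes them from the conventional definitions: RI = (S − D) / S, PI = (S − D) / mean velocity, and S/D = S / D, where S is the peak systolic velocity and D the end-diastolic velocity. The function name, argument names, and sample values are hypothetical; the abstract does not describe how the study's measurements were derived.

```python
import math


def doppler_indices(psv, edv, mean_velocity):
    """Return RI, PI and S/D for one vessel.

    psv           : peak systolic velocity (S)
    edv           : end-diastolic velocity (D)
    mean_velocity : time-averaged mean velocity over the cardiac cycle
    All three must use the same unit (for example cm/s).
    """
    if psv <= 0 or mean_velocity <= 0 or edv < 0:
        raise ValueError("psv and mean_velocity must be positive, edv non-negative")
    return {
        "RI": (psv - edv) / psv,
        "PI": (psv - edv) / mean_velocity,
        # With absent end-diastolic flow the ratio is undefined; report infinity.
        "S/D": psv / edv if edv > 0 else math.inf,
    }


if __name__ == "__main__":
    # Invented velocities, for illustration only.
    print(doppler_indices(psv=60.0, edv=20.0, mean_velocity=30.0))
```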