Text documents are unstructured and high-dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. This paper therefore proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high-dimensional text classification. This technique can measure a feature's importance in a high-dimensional text document. It also aims to increase the efficiency of feature selection and thereby obtain promising text classification accuracy. TF-IDF acts as a filter approach that measures the importance of features in the text documents at the first stage. SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets at the second stage. This research runs sets of experiments using text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features. The pre-processed features are then divided into training and testing datasets. Next, feature selection is performed on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is then applied for feature ranking as the next feature selection step. Only the top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique achieves 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high-dimensional text documents.
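The first (filter) stage described above can be sketched as follows. This is a minimal illustration only, assuming the common weighting tf × ln(N/df) and ranking each term by its maximum score across documents; the abstract does not state the exact TF-IDF variant or ranking rule. The function names and the toy posts are hypothetical, and the second stage (SVM-RFE) is not implemented here.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenized document.

    Assumed weighting: (count / doc length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

def top_k_terms(docs, k):
    """Keep the k terms with the highest maximum TF-IDF score (assumed rule)."""
    best = {}
    for doc_scores in tfidf_scores(docs):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(best, key=lambda t: (-best[t], t))[:k]

# Toy tokenized posts (hypothetical data, not from the paper).
docs = [
    ["great", "phone", "great", "battery"],
    ["bad", "phone", "screen"],
    ["great", "screen", "phone"],
]
selected = top_k_terms(docs, k=3)
```

In this toy corpus the term "phone" occurs in every post, so its inverse document frequency is zero and the filter drops it. A full pipeline would pass the surviving terms to a recursive elimination step driven by a linear SVM.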
This paper aims to evaluate the reliability of a steel beam, represented by the probability of failure and the reliability index. The Monte Carlo Simulation Method (MCSM) and the First Order Reliability Method (FORM) are used for this purpose. These methods need two samples for each behavior to be studied: the first for resistance (carrying capacity R) and the second for load effect (Q), which are the parameters of a limit state function. The Monte Carlo method has been adopted to generate these samples depending on the randomness and uncertainties in the variables. The variables considered are beam cross-section dimensions, material property, beam length, yield stress, and applied loads. Matlab software has be
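To illustrate how a probability of failure and a reliability index follow from samples of R and Q, here is a minimal sketch in Python (not the Matlab code referred to above). It assumes the limit state function g = R − Q with independent, normally distributed R and Q; the means and standard deviations are hypothetical values, not taken from the paper. For this linear normal case the reliability index has the closed form β = (μ_R − μ_Q) / √(σ_R² + σ_Q²), which a first-order method reproduces exactly, so it serves as a check on the Monte Carlo estimate.

```python
import math
import random
from statistics import NormalDist

def monte_carlo_pf(mu_r, sigma_r, mu_q, sigma_q, n_samples, seed=0):
    """Estimate P(g < 0) for g = R - Q by direct sampling."""
    rng = random.Random(seed)  # fixed seed for a repeatable estimate
    failures = 0
    for _ in range(n_samples):
        r = rng.gauss(mu_r, sigma_r)  # resistance sample
        q = rng.gauss(mu_q, sigma_q)  # load-effect sample
        if r - q < 0:                 # limit state violated
            failures += 1
    return failures / n_samples

def closed_form_beta(mu_r, sigma_r, mu_q, sigma_q):
    """Reliability index for linear g = R - Q with independent normal R, Q."""
    return (mu_r - mu_q) / math.sqrt(sigma_r ** 2 + sigma_q ** 2)

# Hypothetical statistics (illustrative units, not from the paper).
mu_r, sigma_r = 300.0, 30.0
mu_q, sigma_q = 200.0, 40.0

beta = closed_form_beta(mu_r, sigma_r, mu_q, sigma_q)
pf_exact = NormalDist().cdf(-beta)            # Pf = Phi(-beta)
pf_mc = monte_carlo_pf(mu_r, sigma_r, mu_q, sigma_q, n_samples=100_000)
beta_mc = -NormalDist().inv_cdf(pf_mc)        # index implied by the MC estimate
```

With non-normal variables or a nonlinear limit state, the closed form no longer applies and the two methods would generally give slightly different results.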
Digital change detection is the process of determining the changes associated with land use and land cover properties with reference to geo-registered multi-temporal remote sensing data. In this research, change detection techniques have been employed to detect the changes in the marshes in the south of Iraq for two periods, the first from 1973 to 1984 and the other from 1973 to 2014. Three satellite images were captured by Landsat in different periods. Preprocessing such as geo-registration, rectification, and mosaicking has been done to prepare the satellite images for the monitoring process. Supervised classification techniques such as maximum likelihood classification have been used to classify the studied area, change detection aft
In this work, satellite images of Razaza Lake and the surrounding
district in Karbala province are classified for the years 1990, 1999, and
2014 using two software packages (MATLAB 7.12 and ERDAS
Imagine 2014). Proposed unsupervised and supervised classification
methods have been implemented in MATLAB; these are
mean value and Singular Value Decomposition, respectively.
Unsupervised (K-Means) and supervised (Maximum Likelihood
Classifier) methods are applied using ERDAS Imagine in order to obtain
the most accurate results, to compare the results of each method,
and to calculate the changes that took place in 1999 and 2014
compared with 1990. The results from classification indicated that
Translating poetry with musical rhythm is counted among the intellectual masterpieces that adorn scholarly translation output. There is nothing to cause the translator discomfort or despair if he pursues this path with effort until he reaps the fruits of that translation.
The researcher's method for translating verses of Persian poetry is regarded as a new one that relies on the harmony of the sounds of the words with one another, so that the translation has sound and harmony, to a degree of music
A Landsat ETM+ satellite image has been analyzed to detect the different depths of regions within the Tigris River in order to identify the regions in Baghdad, Iraq, that need sediment removal. The scene consisted of six bands (without the thermal band) and was captured in March 2001. The variation in depth is determined by applying the ratioing technique to bands 3 and 5. The GIS 9.1 program is used to apply the ratioing technique and determine the results.
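The band ratio step described above amounts to a pixel-wise division of one band by another. The sketch below is a minimal illustration in plain Python, assuming the ratio is taken as band 3 divided by band 5 (the abstract does not state the order) and using small hypothetical pixel grids. How ratio values map to water depth is not given in the abstract and is not modeled here.

```python
import math

def band_ratio(band_a, band_b):
    """Pixel-wise ratio band_a / band_b for two equally sized 2-D grids.

    Pixels where band_b is zero are returned as NaN instead of raising.
    """
    return [
        [a / b if b != 0 else math.nan for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]

# Hypothetical digital numbers for a 2x2 window (not real Landsat data).
band3 = [[40, 60],
         [30, 90]]
band5 = [[20, 20],
         [30, 0]]

ratio = band_ratio(band3, band5)
```

The NaN marker keeps invalid pixels visible, so that they can be masked before any later thresholding or classification.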
Linear discriminant analysis and logistic regression are the most widely used multivariate statistical methods for the analysis of data with categorical outcome variables. Both are appropriate for developing linear classification models. Linear discriminant analysis assumes that the explanatory variables follow a multivariate normal distribution, while logistic regression makes no assumptions about the distribution of the explanatory data. Hence, logistic regression is assumed to be the more flexible and more robust method when these assumptions are violated.
In this paper we focus on the comparison of three forms for classifying data that belong
The aim of the research is to determine the availability of the requirements for applying the indicators of the school performance system in the public schools of the Mahayel Asir educational directorate through the school planning indicator, the safety and security indicator, the active learning indicator, and the student guidance indicator, and to determine whether there are statistically significant differences between the responses of the research community according to the variables (scientific qualification, years of work as a principal, training courses). A questionnaire was used as the tool for data collection from the research community, which consists of all the public schools' principals (n=180) in the Mahayel Asir educational directorate