Text documents are unstructured and high dimensional, so effective feature selection is required to pick the most significant features from a sparse feature space. This paper therefore proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured, high-dimensional text classification. The technique measures feature importance in high-dimensional text documents and aims to make feature selection more efficient, and thereby to obtain promising text classification accuracy. In the first stage, TF-IDF acts as a filter that measures the importance of each feature in the text documents. In the second stage, SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets. The experiments use text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features, and the pre-processed features are then divided into training and testing datasets. Next, feature selection is performed on the training dataset by calculating the TF-IDF score of each feature, followed by SVM-RFE for feature ranking. Only the top-ranked features are selected for text classification with the SVM classifier. The experiments show that the proposed technique achieves 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high-dimensional text documents.
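The two-stage selection described in this abstract can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration and not the authors' implementation: the toy documents and labels, the function names, the choice of summed TF-IDF as the filter score, the `log(N/df)` form of IDF, and the plain subgradient-descent linear SVM standing in for a real SVM solver.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF of term j in document i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([
            (counts[t] / len(toks)) * math.log(n_docs / df[t]) if toks else 0.0
            for t in vocab
        ])
    return vocab, rows


def filter_by_tfidf(vocab, rows, keep):
    """Stage 1 (filter): keep the `keep` terms with the highest summed TF-IDF."""
    totals = [sum(row[j] for row in rows) for j in range(len(vocab))]
    top = sorted(range(len(vocab)), key=lambda j: (-totals[j], vocab[j]))[:keep]
    top.sort()  # preserve the original column order
    return [vocab[j] for j in top], [[row[j] for j in top] for row in rows]


def train_linear_svm(rows, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit (w, b) by subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 or -1. This is a toy stand-in for a real SVM solver.
    """
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            for j in range(len(w)):
                grad = reg * w[j] - (y * x[j] if margin < 1 else 0.0)
                w[j] -= lr * grad
            if margin < 1:
                b += lr * y
    return w, b


def svm_rfe(features, rows, labels, n_select):
    """Stage 2 (backward elimination): retrain, then drop the feature with the
    smallest squared weight, until only `n_select` features remain."""
    features = list(features)
    rows = [list(r) for r in rows]
    while len(features) > n_select:
        w, _ = train_linear_svm(rows, labels)
        worst = min(range(len(features)), key=lambda j: (w[j] ** 2, features[j]))
        del features[worst]
        for r in rows:
            del r[worst]
    return features, rows


# Hypothetical toy training set (not the Twitter benchmark from the paper).
docs = [
    "great phone love the battery",
    "love this great camera",
    "terrible battery hate the phone",
    "hate this terrible screen",
]
labels = [1, 1, -1, -1]

vocab, rows = tfidf_matrix(docs)
kept, kept_rows = filter_by_tfidf(vocab, rows, keep=8)
selected, selected_rows = svm_rfe(kept, kept_rows, labels, n_select=4)
w, b = train_linear_svm(selected_rows, labels)  # final classifier on top-ranked features
print(selected)
```

Removing one feature per iteration keeps the ranking faithful to the recursive scheme but retrains the model once per removed feature; on a realistic vocabulary one would more likely drop features in batches and use an established SVM library.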
At present, smooth movement on the roads is something every user needs. Many roads, especially in urban areas, have been geometrically improved because the number of vehicles increases over time.
In this research, the Highway Capacity Software (HCS 2000) is adopted to determine the effectiveness of a roundabout in terms of its capacity, delay, and level of service.
The results of the analysis indicate that the Ahmed Urabi roundabout operates at level of service F, with an average control delay of 300 seconds per vehicle during peak hours.
The through movements in the Alkarrada-Aljadiriya direction (major direction) represent the heaviest traff
Steganography (hiding text in images) is still considered an important issue by researchers. Steganography methods vary in their hiding styles, from simple techniques to complex ones that are resistant to potential attacks. The current research does not consider the problem of attacks on the host's secret text; instead, an improved, highly confidential method of hiding text within an image is proposed and implemented, accompanied by a strong password method, so as to ensure that no change is made to the pixel values of the host image after the text is hidden. The phrase “highly confidential” denotes the low suspicion that hiding has been performed in the cover image. The experimental results show that the covere
E-mail is an efficient and reliable data exchange service. Spam messages are undesired e-mails that are sent randomly in bulk, usually for commercial purposes. Obfuscated image spamming is one of the new tricks used to bypass text-based and Optical Character Recognition (OCR)-based spam filters. Image spam detection based on visual image features has the advantage of efficiency, reducing the computational cost and improving performance. In this paper, an image spam detection scheme is presented. Suitable image processing techniques were used to capture the image features that can differentiate spam images from non-spam ones. Weighted k-nearest neighbor, which is a simple yet powerful machine learning algorithm, was used as a clas
The Mishrif Formation, which consists of limestone, marl, and shale layers, is one of the most important geological formations in Iraq: it is one of the country's main oil-producing reservoirs and contains a significant portion of Iraq's oil reserves. The formation has been extensively explored and developed by the Iraqi government and international oil companies, and many oil fields have been developed within it. Accurate evaluation of the Mishrif Formation is key to the successful exploitation of this field. However, its geological complexity poses significant challenges for oil production and requires advanced techniques to evaluate its petrophysical properties accurately.
This study used advanced well-logging analysi
With the wide use of private information exchange in various communication applications, securing that information has become an urgent issue. In this research, a new approach to encrypting text messages based on genetic algorithm operators is proposed. The proposed approach follows a new algorithm that generates 8-bit chromosomes to encrypt plain text after randomly selecting a crossover point. One bit of the resulting child code is then flipped using the mutation operation. Two simulations are conducted to evaluate the performance of the proposed approach, covering encryption/decryption execution time and throughput. The simulation results demonstrate the robustness of the proposed approach, which performs better on all evaluation metrics with res
Many bridges worldwide have failed because of severe scouring. The safety of an existing bridge (after construction) therefore depends mainly on continuous monitoring of local scour at the substructure, whereas the safety of a bridge before construction depends mainly on accounting for estimated local scour at the substructure. Local scour at bridge piers is usually estimated using the available formulae, almost all of which were derived from laboratory data. It is therefore essential to test the performance of proposed local scour formulae using field data. In this study, the performance of selected bridge scour estimation formulae was validated and sta
This paper presents a method to classify colored textural images of skin tissues. Since medical images are highly heterogeneous, developing a reliable skin-cancer detection process is difficult, and a monofractal dimension is not sufficient to classify images of this nature. Multifractal-based feature vectors are suggested here as an alternative and more effective tool. At the same time, multiple color channels are used to obtain more descriptive features. Two multifractal-based sets of features are suggested here. The first set measures the local roughness property, while the second set measures the local contrast property. A combination of all the extracted features from the three color models gives the highest classification accuracy with 99.4