Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. This paper therefore proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured, high-dimensional text classification. The technique measures feature importance in high-dimensional text documents and aims to make feature selection more efficient, and thereby to obtain promising text classification accuracy. In the first stage, TF-IDF acts as a filter that measures the importance of each feature in the text documents. In the second stage, SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subset. The experiments use a text dataset retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features, and the pre-processed features are then divided into training and testing datasets. Next, feature selection is performed on the training dataset by calculating the TF-IDF score of each feature, after which SVM-RFE ranks the features. Only the top-ranked features are selected for text classification with the SVM classifier. In the experiments, the proposed technique achieved 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured, high-dimensional text documents.
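The two-stage selection described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus, the function names, the filter criterion (summed TF-IDF per term), the hyperparameters, and the subgradient-descent trainer standing in for a real SVM solver are all hypothetical.

```python
import math
from collections import Counter


def tfidf_rows(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF of vocab[j] in docs[i]."""
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    df = Counter(t for doc in tokens for t in set(doc))
    # Plain IDF = log(N / df): a term present in every document scores 0.
    idf = {t: math.log(len(docs) / df[t]) for t in vocab}
    rows = []
    for doc in tokens:
        tf = Counter(doc)
        rows.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vocab, rows


def tfidf_filter(vocab, rows, k):
    """Stage 1 (assumed criterion): keep the k terms with the largest summed TF-IDF."""
    score = [sum(r[j] for r in rows) for j in range(len(vocab))]
    keep = sorted(sorted(range(len(vocab)), key=lambda j: -score[j])[:k])
    return [vocab[j] for j in keep], [[r[j] for j in keep] for r in rows]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Hinge-loss linear SVM fitted by plain subgradient descent (toy stand-in)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1
            for j, xj in enumerate(xi):
                w[j] -= lr * (lam * w[j] - (yi * xj if violated else 0.0))
            if violated:
                b += lr * yi
    return w, b


def svm_rfe(names, X, y, n_keep):
    """Stage 2: repeatedly retrain and drop the feature with the smallest w_j**2."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        w, _ = train_linear_svm([[row[j] for j in active] for row in X], y)
        del active[min(range(len(active)), key=lambda k: w[k] ** 2)]
    return [names[j] for j in active]


# Invented four-document corpus; labels are +1 (positive) and -1 (negative).
docs = [
    "the good great movie",
    "the great fun show",
    "the bad awful movie",
    "the awful boring show",
]
labels = [1, 1, -1, -1]

vocab, rows = tfidf_rows(docs)
kept, X = tfidf_filter(vocab, rows, k=8)       # filter stage
selected = svm_rfe(kept, X, labels, n_keep=4)  # recursive elimination stage
print(kept)
print(selected)
```

In this toy corpus the word "the" occurs in every document, so its TF-IDF is zero and the filter stage discards it before any SVM is trained. The elimination stage then removes one feature per retraining round, which is the simplest RFE schedule; removing several features per round is a common speed-up on large vocabularies.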
Economic units operating in the environment of the Iraqi industrial sector face many pressures in seeking to measure and evaluate their performance because of the variables of today's corporate environment. This calls for a methodology that can evaluate performance more holistically, rather than being limited to traditional measures that are no longer enough to keep pace with rapid changes in today's corporate environment. Performance measures therefore need to be derived from the strategy of the unit and be commensurate with the specificity of the environment in Iraq. The research attempts to discuss Ttormwhrat and performance measurement systems suited to business strategies and directions of change
Cryptography is one way to secure the transfer of data from sender to receiver. To raise the level of data security further, DNA was introduced to cryptography. DNA can readily be used to store and transfer data, and it has become an effective means for such purposes and for implementing computation. A new cryptography system is proposed, consisting of two phases: the encryption phase and the decryption phase. The encryption phase includes six steps, starting by converting the plaintext to its equivalent ASCII values and converting those to binary values. After that, the binary values are converted to DNA characters and then converted to their equivalent complementary DN
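The encoding steps named above (plaintext to ASCII, ASCII to binary, binary to DNA characters, DNA to its complement) can be sketched as follows. The abstract is cut off before the remaining steps, so only these are shown. The 2-bit-per-base table is an assumption, since the source does not state which mapping it uses, and the function names are hypothetical.

```python
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}  # assumed 2-bit coding
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}  # Watson-Crick base pairing


def text_to_dna(plaintext):
    """Plaintext -> ASCII codes -> 8-bit binary -> DNA bases (2 bits per base)."""
    bits = "".join(format(code, "08b") for code in plaintext.encode("ascii"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def complement(strand):
    """Replace every base with its complementary base."""
    return "".join(COMPLEMENT[base] for base in strand)


def dna_to_text(strand):
    """Inverse of text_to_dna: DNA bases -> binary -> ASCII codes -> text."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)).decode("ascii")


strand = text_to_dna("Hi")            # 'H' = 01001000, 'i' = 01101001
encoded = complement(strand)
print(strand)                          # CAGACGGC
print(encoded)                         # GTCTGCCG
print(dna_to_text(complement(encoded)))  # Hi
```

Complementing is its own inverse, so the decryption side of this sketch simply complements again before mapping bases back to bits.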
Estimating the semantic similarity between short texts plays an increasingly prominent role in many fields related to text mining and natural language processing applications, especially with the large increase in the volume of textual data that is produced daily. Traditional approaches for calculating the degree of similarity between two texts, based on the words they share, do not perform well with short texts because two similar texts may be written in different terms by employing synonyms. As a result, short texts should be semantically compared. In this paper, a semantic similarity measurement method between texts is presented which combines knowledge-based and corpus-based semantic information to build a semantic network that repre
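The weakness of word-overlap measures on short texts can be seen in a small sketch. Jaccard similarity over word sets is used here as one representative shared-word measure. That choice, the function name, and the example sentences are assumptions for illustration only, not the baseline used in the paper.

```python
def jaccard(text_a, text_b):
    """Shared-word similarity: |A intersect B| / |A union B| over lowercase word sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


# Two sentences with close meaning but almost no shared vocabulary.
s1 = "the physician treated the patient"
s2 = "the doctor cured the sick person"
print(jaccard(s1, s2))  # 0.125: only "the" is shared
print(jaccard(s1, s1))  # 1.0
```

The two sentences mean nearly the same thing, yet they share only the word "the", so the overlap score stays low. This is the gap that knowledge-based and corpus-based semantic information is meant to close.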
The research deals with the concept of stigma as one of the important phenomena that cast a shadow over the nature of the individual, his being, and his personality through the inferior view with which he is confronted in society. (Sartors) indicates in this regard that stigma may lead to negative discrimination, which in turn leads to many disadvantages in terms of obtaining care, poor health, service, and frequent setbacks that can damage self-esteem. The first roots of this phenomenon go back to Greek civilization, when the Greeks used to burn and cut off some parts of the body and then announce to the nation that the bearer of this sign was a criminal. In addition to the Arab peoples living from setbacks that contributed to the exacerbation
The character is one of the elements of storytelling: it is the center of the plot and the basis around which the narration revolves. The narration portrays the characters as they act; the novelist presents a character through its interaction with events and the extent of their negative and positive impact on it. It should be noted that everyone has two or more personalities, each appearing in a different position or situation. For instance, a man can be a father, a lover, an employee, a son, or anyone else; in another position he might be a master, and in another a loser begging for the mercy of his humiliator, and sometimes he can show weakness to the one he loves, or show strength to his enemie
A biosensor is defined as a device that transforms the interactions between bioreceptors and analytes into a logical signal proportional to the reactants' concentration. Biosensors have various applications, aimed primarily at detecting diseases, medicines, food safety, and the proportion of toxins in water, along with other applications that ensure the safety and health of the organism. The main challenge for biosensors lies in the difficulty of obtaining sensors with the accuracy, specific sensitivity, and repeatability needed to give reliable results each time a patient uses them. The rapid diversification in biosensors is due to the accuracy of the techniques and materials used in the manufacturing process and the interrelationshi
Knowledge of the mineralogical composition of a petroleum reservoir's formation is crucial for the petrophysical evaluation of the reservoir. The Mishrif Formation, which is prevalent in the Middle East, is renowned for its mineralogical complexity. Multi-mineral inversion, which combines multiple logs and inverts for multiple minerals at once, can make it easier to determine which minerals are present in the Mishrif Formation. This method could improve mineral identification and give more information about the minerals that make up the formation. In this study, an error model is used to find a link between the tool measurements and the petrophysical parameters. An error minimization procedure is subsequently applied to determine
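For orientation, a commonly used linear form of such an error-minimization problem is written out here. It is a generic textbook formulation and an assumption on our part; the abstract is truncated and does not state the study's exact error model. All symbols are illustrative: $m_i$ is the reading of log $i$, $R_{ij}$ is the response of log $i$ to pure component $j$, $v_j$ is the volume fraction of component $j$, and $\sigma_i$ is the uncertainty assigned to log $i$.

```latex
\min_{v_1,\dots,v_k}\;
  \sum_{i=1}^{n} \left( \frac{m_i - \sum_{j=1}^{k} R_{ij}\, v_j}{\sigma_i} \right)^{2}
\quad \text{subject to} \quad
  \sum_{j=1}^{k} v_j = 1, \qquad 0 \le v_j \le 1 .
```

Each term compares a measured log value with the value predicted from the candidate volume fractions, weighted by that log's uncertainty, so noisier tools contribute less to the fit.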
Earth’s climate is changing rapidly due to increases in human demands and rapid economic growth. These changes will affect the entire biosphere, mostly in negative ways. Predicting future changes will put us in a better position to minimize their catastrophic effects and to understand beforehand how humans can cope with them. In this research, global climate observations from 1961 to 1990 were used to predict the future climate change scenario for 2010 to 2039. The data were processed with Idrisi Andes software, and the final Köppen-Geiger map was created with ArcGIS software. Based on the Köppen climate classification, it was found that areas of Equator, Arid Steppes, and Snow will decrease by 3.9%, 2.96%, an
This paper proposes a new approach, Clustering Ultrasound images using the Hybrid Filter (CUHF), to determine the gender of the fetus in the early stages. A possible advantage of CUHF is that it can achieve a better result when fuzzy c-means (FCM) returns incorrect clusters. The proposed approach is conducted in two steps. First, a preprocessing step decreases the noise present in ultrasound images by applying the following filters: Local Binary Pattern (LBP); median; median and discrete wavelet transform (DWT); median, DWT, and LBP; and median and Laplacian (ML). Second, FCM is applied to cluster the images resulting from the first step. Among those filters, median and Laplacian recorded better accuracy. Our experimental evaluation on re
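As an illustration of the denoising step only, a median filter can be sketched in plain Python. The 3x3 window, the border handling (the window is clipped at the image edge), and the function name are assumptions. The paper's hybrid filters and the FCM clustering step are not reproduced here.

```python
from statistics import median_low


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood.

    `image` is a list of equal-length rows of grey levels. At the borders the
    window is clipped to the image, which is one of several possible conventions.
    """
    height, width = len(image), len(image[0])
    r = size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - r), min(height, y + r + 1))
                for xx in range(max(0, x - r), min(width, x + r + 1))
            ]
            # median_low always returns an actual pixel value from the window.
            out[y][x] = median_low(window)
    return out


# A flat patch with one bright outlier pixel, as an impulse-noise example.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
print(median_filter(noisy))  # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
```

The single outlier pixel is removed because it never reaches the middle of any sorted window, which is why median filtering is a common first pass on impulse-like noise before clustering.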