Doubts arise about the originality of a document when noticing a change in its writing style. This evidence to plagiarism has made the intrinsic approach for detecting plagiarism uncover the plagiarized passages through the analysis of the writing style for the suspicious document where a reference corpus to compare with is absent. The proposed work aims at discovering the deviations in document writing style through applying several steps: Firstly, the entire document is segmented into disjointed segments wherein each corresponds to a paragraph in the original document. For the entire document and for each segment, center vectors comprising average weight of their word are constructed. Second, the degree of closeness is calculated through applying Cosine similarity to measure for each segment, the deviation of its center vector from the center vector of the entire document. Additionally, word n-gram length will be investigated to show its effect on the proposed system performance wherein, center vectors are computed considering word n-grams for different values of n (n= 1, 2, and 3). Performance evaluation of the proposed method was accomplished through the use of Precision, Recall, F-measure, Granularity, and Plagdet as evaluation measures. Moreover, PAN-PC-09 and PAN-PC-11 were used for detecting intrinsic plagiarism as evaluation corpora. It is shown that the proposed approach has achieved results that are comparable to the state-of-the-art methods. Positive impact was observed through discovering deviations in document writing style by computing weight vectors dissimilarity rather than calculating the difference between the word n-grams that exist in segments and their corresponding word n-grams in the suspicious document. Furthermore, when considering the length of word n-gram, better results were recorded for system performance when word bi-grams was used compared to word uni-grams and word tri-grams.
The speaker identification is one of the fundamental problems in speech processing and voice modeling. The speaker identification applications include authentication in critical security systems and the accuracy of the selection. Large-scale voice recognition applications are a major challenge. Quick search in the speaker database requires fast, modern techniques and relies on artificial intelligence to achieve the desired results from the system. Many efforts are made to achieve this through the establishment of variable-based systems and the development of new methodologies for speaker identification. Speaker identification is the process of recognizing who is speaking using the characteristics extracted from the speech's waves like pi
... Show MoreThe study aimed to design a test of pre-writing skills for public kindergartens in Baghdad city. The test consisted of (25) items applied on a sample of (150) kindergarteners to identify these skills as well as to identify the significant difference between male and female children and if there is a difference between pre-school children and kindergarteners. The results showed the presence of pre-writing skills with a high degree in kindergarten children. The differences were clear in these skills between male and female children and those in pre-school than those in kindergartens. The researcher suggested a number of recommendations and proposals.
Tourism plays an important role in Malaysia’s economic development as it can boost business opportunity in its surrounding economic. By apply data mining on tourism data for predicting the area of business opportunity is a good choice. Data mining is the process that takes data as input and produces outputs knowledge. Due to the population of travelling in Asia country has increased in these few years. Many entrepreneurs start their owns business but there are some problems such as wrongly invest in the business fields and bad services quality which affected their business income. The objective of this paper is to use data mining technology to meet the business needs and customer needs of tourism enterprises and find the most effective
... Show More
The great scientific progress has led to widespread Information as information accumulates in large databases is important in trying to revise and compile this vast amount of data and, where its purpose to extract hidden information or classified data under their relations with each other in order to take advantage of them for technical purposes.
And work with data mining (DM) is appropriate in this area because of the importance of research in the (K-Means) algorithm for clustering data in fact applied with effect can be observed in variables by changing the sample size (n) and the number of clusters (K)
... Show MoreActivities associated with mining of uranium have generated significant quantities of waste materials containing uranium and other toxic metals. A qualitative and quantitative study was performed to assess the situation of nuclear pollution resulting from waste of drilling and exploration left on the surface layer of soil surrounding the abandoned uranium mine hole located in the southern of Najaf province in Iraq state. To measure the specific activity, twenty five surface soil samples were collected, prepared and analyzed by using gamma- ray spectrometer based on high counting efficiency NaI(Tl) scintillation detector. The results showed that the specific activities in Bq/kg are 37.31 to 1112.47 with mean of 268.16, 0.28 to 18.57 with
... Show MoreAbstract:
The aim of the current study lies in showing and demonstrating the effect of the independent variable of organizational integrity with its dimensions represented by (organizational optimism, organizational trust, organizational integrity, organizational sympathy) on the dependent variable of strategic drifts through its dimensions represented by (organizational culture, leadership, innovation, strategic planning) .The analytical descriptive approach has been adopted and the questionnaire has been used as a tool to obtain data and information from the sample of (57) people who are at the administrative levels (seni
... Show MoreAssociation rules mining (ARM) is a fundamental and widely used data mining technique to achieve useful information about data. The traditional ARM algorithms are degrading computation efficiency by mining too many association rules which are not appropriate for a given user. Recent research in (ARM) is investigating the use of metaheuristic algorithms which are looking for only a subset of high-quality rules. In this paper, a modified discrete cuckoo search algorithm for association rules mining DCS-ARM is proposed for this purpose. The effectiveness of our algorithm is tested against a set of well-known transactional databases. Results indicate that the proposed algorithm outperforms the existing metaheuristic methods.
APDBN Rashid, Southern African Linguistics and Applied Language Studies, 2023
The need for an efficient method to find the furthermost appropriate document corresponding to a particular search query has become crucial due to the exponential development in the number of papers that are now readily available to us on the web. The vector space model (VSM) a perfect model used in “information retrieval”, represents these words as a vector in space and gives them weights via a popular weighting method known as term frequency inverse document frequency (TF-IDF). In this research, work has been proposed to retrieve the most relevant document focused on representing documents and queries as vectors comprising average term term frequency inverse sentence frequency (TF-ISF) weights instead of representing them as v
... Show More