This research analyzes and simulates real biochemical test data to uncover the relationships among the tests and how each test affects the others. The data were acquired from a private Iraqi biochemical laboratory. These data have many dimensions, a high rate of null values, and a large number of patients. Several experiments were applied to the data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques: Linear Discriminant Analysis (LDA), Classification and Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB), and Support Vector Machine (SVM). Of the six supervised algorithms, CART gave clear results with high accuracy. It is worth noting that the preprocessing steps required considerable effort, since the raw dataset had a null-value ratio of 94.8%, which dropped to 0% after preprocessing. To apply the CART algorithm, several selected tests were treated as classes; the tests were chosen according to the accuracy obtained with them. Consequently, physicians are able to trace and connect test results with one another, which extends the impact on patients’ health.
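The abstract does not say how the null-value ratio was brought from 94.8% to 0%. The following is only a minimal sketch of one common approach, assuming that very sparse test columns are dropped and the remaining gaps are filled with the column median. The function names, the `max_null` threshold, and the toy records are all hypothetical and are not taken from the paper.

```python
from statistics import median

def null_ratio(rows):
    """Fraction of cells that are None across all rows."""
    total = sum(len(r) for r in rows)
    missing = sum(1 for r in rows for v in r if v is None)
    return missing / total if total else 0.0

def drop_sparse_columns(rows, max_null=0.5):
    """Keep only columns whose share of None values is at most max_null."""
    n_cols = len(rows[0])
    keep = [c for c in range(n_cols)
            if sum(r[c] is None for r in rows) / len(rows) <= max_null]
    return [[r[c] for c in keep] for r in rows], keep

def impute_median(rows):
    """Replace each remaining None with the median of its column."""
    n_cols = len(rows[0])
    medians = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        medians.append(median(observed) if observed else 0.0)
    return [[medians[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]

# Hypothetical toy records: one row per patient, one column per test.
records = [
    [5.1, None, 140.0, None],
    [4.8, None, None, None],
    [None, 0.9, 138.0, None],
    [5.4, None, 142.0, None],
]

before = null_ratio(records)                      # share of missing cells
reduced, kept = drop_sparse_columns(records, 0.5)  # drop mostly-empty tests
clean = impute_median(reduced)                    # fill the remaining gaps
after = null_ratio(clean)
```

After such a step every remaining column is complete, so any one test column could be treated as the class label for a supervised learner such as CART, as the abstract describes.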
Tourism plays an important role in Malaysia’s economic development, as it can boost business opportunities in the surrounding economy. Applying data mining to tourism data to predict areas of business opportunity is a good choice. Data mining is a process that takes data as input and produces knowledge as output. The number of people travelling in Asian countries has increased in the past few years. Many entrepreneurs start their own businesses, but problems such as investing in the wrong business fields and poor service quality have affected their business income. The objective of this paper is to use data mining technology to meet the business needs and customer needs of tourism enterprises and find the most effective
Great scientific progress has led to the widespread accumulation of information in large databases, which makes it important to review and organize this vast amount of data in order to extract hidden information, or to classify data according to the relationships among them, so that it can be used for technical purposes.
Working with data mining (DM) is appropriate in this area, given the importance of research on the K-Means algorithm for clustering data, which is applied here so that the effect on the variables can be observed by changing the sample size (n) and the number of clusters (K)
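The kind of experiment described above, varying the sample size n and the number of clusters K, can be sketched as follows. This is an illustrative plain-Python implementation of the standard Lloyd iteration for k-means on synthetic data, not the authors' code. The helper names `kmeans` and `make_sample`, the three synthetic group centers, and the chosen values of n and K are all assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: returns (centers, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia

def make_sample(n, seed=1):
    """Hypothetical synthetic data: n 2-D points around three group centers."""
    rng = random.Random(seed)
    groups = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    return [tuple(rng.gauss(m, 1.0) for m in groups[i % 3]) for i in range(n)]

# Observe how the within-cluster sum of squares changes with n and K.
results = {}
for n in (30, 300):
    data = make_sample(n)
    for k in (1, 2, 3, 4):
        _, inertia = kmeans(data, k)
        results[(n, k)] = inertia
```

Comparing the entries of `results` for a fixed n shows how the clustering quality responds to K, while comparing across n for a fixed K shows the effect of sample size.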
Introduction
Financial and banking organizations mostly deal directly with customers, which requires them to collect huge amounts of data about these customers. Together with the data they receive daily, this leaves them facing large piles of data that require tremendous effort to handle well and to use in a way that serves the organization.
Handling such data manually, without using modern techniques, distances the organization from
Everyone is connected to social media (Facebook, Twitter, LinkedIn, Instagram, etc.), which generates a large quantity of data that traditional applications are inadequate to process. Social media are regarded as an important platform for sharing information, opinions, and knowledge among many subscribers. These media also give rise to many Big Data issues, such as data collection, storage, moving, updating, reviewing, posting, scanning, visualization, and data protection. To deal with all these problems, there is a need for an adequate system that not only prepares the details but also provides meaningful analysis to take advantage of difficult situations relevant to business, proper decision, health, social media, sc
Business organizations have faced many challenges in recent times, the most important of which is information technology, because it is widespread and easy to use. Its use has led to an unprecedented increase in the amount of data that business organizations deal with. The amount of data available through the internet is a problem for which many parties seek solutions. Why is it available in such a huge amount and in such a random way? Many forecasts indicated that in 2017 the number of devices connected to the internet would be an estimated three times the population of the Earth, and that in 2015 more than one and a half billion gigabytes of data were transferred every minute globally. Thus, so-called data mining emerged as a
Red, green, and blue (RGB) color codes extracted from a lab-fabricated colorimeter device were used to build a proposed classifier with the objective of classifying the colors of objects into defined categories of fundamental colors. Primary, secondary, and tertiary colors, namely red, green, orange, yellow, pink, purple, blue, brown, grey, white, and black, were employed in machine learning (ML) by applying an artificial neural network (ANN) algorithm using Python. The classifier, which was based on the ANN algorithm, required a definition of these eleven colors in the form of RGB codes in order to acquire the capability of classification. The software's capacity to forecast the color of the code that belongs to an ob
Machine learning methods, one of the most important and promising branches of artificial intelligence, are of great importance in all sciences, such as engineering and medicine, and have recently become widely involved in the statistical sciences and their various branches, including survival analysis. They can be considered a new branch for estimating survival, alongside the parametric, nonparametric, and semi-parametric methods that are widely used to estimate survival in statistical research. In this paper, the estimation of survival based on medical images of patients with breast cancer who receive their treatment in Iraqi hospitals is discussed. Three algorithms for feature extraction are explained: The first principal compone
Objective This research investigates real breast cancer data for Iraqi women; these data were acquired manually from several Iraqi hospitals for the early detection of breast cancer. Data mining techniques are used to discover hidden knowledge, unexpected patterns, and new rules from the dataset, which involves a large number of attributes. Methods Data mining techniques handle redundant or simply irrelevant attributes to discover interesting patterns. The dataset is processed via the Weka (Waikato Environment for Knowledge Analysis) platform. The OneR technique is used as a machine learning classifier to evaluate the worth of each attribute according to the class value. Results The evaluation is performed using
Data mining is one of the most popular analysis methods in medical research. It involves finding previously unknown patterns and correlations in datasets. Data mining encompasses various areas of biomedical research, including data collection, clinical decision support, illness or safety monitoring, public health, and inquiry research. Health analytics frequently uses computational methods for data mining, such as clustering, classification, and regression. Studies of large numbers of diverse heterogeneous documents, including biological and electronic information, have provided extensive material for medical and health studies.
This paper applies data mining techniques to a real Iraqi biochemical dataset to discover hidden patterns within the relationships among tests. It is worth noting that the preprocessing steps require considerable effort to handle this type of data, since the raw dataset has a null-value ratio of 94.8%, which becomes 0% after these steps. To apply the Classification and Regression Tree (CART) algorithm, several tests were treated as classes because the dataset was unlabeled. This enabled the discovery of patterns in test relationships, which consequently extends the impact on patients’ health, since it will assist in determining test values by performing only relevant