Big data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such as decision tree and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide su the optimal split value.
Linear discriminant analysis and logistic regression are the most widely used in multivariate statistical methods for analysis of data with categorical outcome variables .Both of them are appropriate for the development of linear classification models .linear discriminant analysis has been that the data of explanatory variables must be distributed multivariate normal distribution. While logistic regression no assumptions on the distribution of the explanatory data. Hence ,It is assumed that logistic regression is the more flexible and more robust method in case of violations of these assumptions.
In this paper we have been focus for the comparison between three forms for classification data belongs
... Show MoreObjective: Breast cancer is regarded as a deadly disease in women causing lots of mortalities. Early diagnosis of breast cancer with appropriate tumor biomarkers may facilitate early treatment of the disease, thus reducing the mortality rate. The purpose of the current study is to improve early diagnosis of breast by proposing a two-stage classification of breast tumor biomarkers fora sample of Iraqi women.
Methods: In this study, a two-stage classification system is proposed and tested with four machine learning classifiers. In the first stage, breast features (demographic, blood and salivary-based attributes) are classified into normal or abnormal cases, while in the second stage the abnormal breast cases are
... Show MoreThe interests toward developing accurate automatic face emotion recognition methodologies are growing vastly, and it is still one of an ever growing research field in the region of computer vision, artificial intelligent and automation. However, there is a challenge to build an automated system which equals human ability to recognize facial emotion because of the lack of an effective facial feature descriptor and the difficulty of choosing proper classification method. In this paper, a geometric based feature vector has been proposed. For the classification purpose, three different types of classification methods are tested: statistical, artificial neural network (NN) and Support Vector Machine (SVM). A modified K-Means clustering algorithm
... Show MoreA substantial portion of today’s multimedia data exists in the form of unstructured text. However, the unstructured nature of text poses a significant task in meeting users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing. However, accurately categorizing texts becomes challenging due to the increasing presence of non-informative features within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover the recently explored approaches to TC problem-solving utilizing FS, such as optimization techniques.
... Show MoreEarly detection of brain tumors is critical for enhancing treatment options and extending patient survival. Magnetic resonance imaging (MRI) scanning gives more detailed information, such as greater contrast and clarity than any other scanning method. Manually dividing brain tumors from many MRI images collected in clinical practice for cancer diagnosis is a tough and time-consuming task. Tumors and MRI scans of the brain can be discovered using algorithms and machine learning technologies, making the process easier for doctors because MRI images can appear healthy when the person may have a tumor or be malignant. Recently, deep learning techniques based on deep convolutional neural networks have been used to analyze med
... Show MoreThe purpose of this work is to study the classification and construction of (k,3)-arcs in the projective plane PG(2,7). We found that there are two (5,3)-arcs, four (6,3)-arcs, six (7,3)arcs, six (8,3)-arcs, seven (9,3)-arcs, six (10,3)-arcs and six (11,3)-arcs. All of these arcs are incomplete. The number of distinct (12,3)-arcs are six, two of them are complete. There are four distinct (13,3)-arcs, two of them are complete and one (14,3)-arc which is incomplete. There exists one complete (15,3)-arc.