Big data analysis has important applications in many areas such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarizations of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms, such as decision trees and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provides the optimal split value.
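As background for the entropy discretization step, the sketch below shows the textbook exhaustive search for the split value that minimizes weighted class entropy; it operates on raw values rather than the paper's multi-resolution summarization structure, and the function names are ours.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_entropy_split(values, labels):
    """Return (split value, weighted entropy) minimizing the class entropy
    of the two partitions induced by thresholding `values` at the split."""
    pairs = sorted(zip(values, labels))
    best, best_split = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        split = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        w = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if w < best:
            best, best_split = w, split
    return best_split, best

# Example: a cleanly separable two-class attribute splits at 6.5 with zero entropy.
print(best_entropy_split([1, 2, 3, 10, 11, 12], list("aaabbb")))  # (6.5, 0.0)
```

A summarization-based variant would evaluate the same criterion over bin statistics instead of individual records, which is what makes the streaming setting tractable.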
In this paper, one of the machine scheduling problems is studied: scheduling a number of products (n jobs) on a single machine with a multi-criteria objective function. The criteria are the completion time, the tardiness, the earliness, and the late work, formulated as $\sum_{i=1}^{n}(C_i + T_i + E_i + V_i)$, where $C_i$, $T_i$, $E_i$, and $V_i$ denote the completion time, tardiness, earliness, and late work of job $i$. The branch and bound (BAB) method is used as the main method for solving the problem, where four upper bounds and one lower bound are proposed and a number of dominance rules are considered to reduce the number of branches in the search tree. The genetic algorithm (GA) and particle swarm optimization (PSO) are used to obtain two of the upper bounds. The computational results are calculated by coding (programming) …
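To make the reconstructed objective concrete, here is a minimal evaluation of one fixed job sequence (an illustrative sketch, not the paper's code; the BAB, GA, and PSO methods minimize this quantity over all permutations of the jobs):

```python
def sequence_cost(jobs):
    """Sum of C_i + T_i + E_i + V_i over a fixed job sequence.
    jobs: list of (processing_time, due_date) tuples, in processing order."""
    t, total = 0, 0
    for p, d in jobs:
        t += p                         # completion time C_i
        tardiness = max(0, t - d)      # T_i
        earliness = max(0, d - t)      # E_i
        late_work = min(p, tardiness)  # V_i: units processed after the due date
        total += t + tardiness + earliness + late_work
    return total

print(sequence_cost([(2, 3), (4, 5), (1, 10)]))  # 21
```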
Extractive multi-document text summarization – summarization that aims to remove redundant information in a document collection while preserving its salient sentences – has recently enjoyed a large interest in automatic models. This paper proposes an extractive multi-document text summarization model based on a genetic algorithm (GA). First, the problem is modeled as a discrete optimization problem and a specific fitness function is designed to effectively cope with the proposed model. Then, a binary-encoded representation together with a heuristic mutation operator and a local repair operator are proposed to characterize the adopted GA. Experiments are applied to ten topics from the Document Understanding Conference DUC2002 dataset…
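The binary encoding described above can be illustrated with a simple fitness of the coverage-minus-redundancy family (a hedged sketch; the paper's actual fitness function and repair operator differ in detail, and all names here are ours):

```python
def summary_fitness(mask, sim_to_centroid, sim_pairwise, lengths, max_len):
    """mask: 0/1 chromosome, one bit per candidate sentence.
    Rewards similarity of selected sentences to the document centroid and
    penalizes pairwise redundancy among them; over-length summaries are
    infeasible (a local repair operator would instead drop bits)."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if sum(lengths[i] for i in chosen) > max_len:
        return float("-inf")
    coverage = sum(sim_to_centroid[i] for i in chosen)
    redundancy = sum(sim_pairwise[i][j] for i in chosen for j in chosen if i < j)
    return coverage - redundancy
```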
Financial fraud remains an ever-increasing problem in the financial industry, with numerous consequences. The detection of fraudulent online transactions via credit cards has always been done using data mining (DM) techniques. However, fraud detection on credit card transactions (CCTs), which is itself a DM problem, has become a serious challenge for two major reasons: (i) the frequent changes in the patterns of normal and fraudulent online activity, and (ii) the skewed nature of credit card fraud datasets. The detection of fraudulent CCTs mainly depends on the data sampling approach. This paper proposes a combined SVM-MPSO-MMPSO technique for credit card fraud detection. The dataset of CCTs which co…
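The class-imbalance issue noted above is the crux of any CCT detector. As a baseline sketch only, with the paper's MPSO/MMPSO sampling stages replaced by scikit-learn's built-in class weighting on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Synthetic stand-in for a skewed CCT dataset: ~3% positive (fraud) class.
X, y = make_classification(n_samples=2000, weights=[0.97, 0.03], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" upweights the rare fraud class inside the SVM loss.
clf = SVC(kernel="rbf", class_weight="balanced").fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```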
Nowadays, the development of internet communication, which provides many facilities to the user, leads in turn to growing unauthorized access. As a result, an intrusion detection system (IDS) becomes necessary to provide a high level of security for the huge amount of information transferred in the network and to protect it from threats. One of the main challenges for an IDS is the high dimensionality of the feature space and how to select the features relevant to distinguishing normal network traffic from attack traffic. In this paper, a multi-objective evolutionary algorithm with decomposition (MOEA/D) and MOEA/D with the injection of a proposed local search operator are adopted to solve the multi-objective optimization (MOO) problem, followed by Naïve Bayes…
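In this setting, each candidate feature subset is typically scored on two conflicting objectives. The sketch below shows one plausible pair, error rate versus subset size, evaluated with a Naïve Bayes classifier (an assumption for illustration; the paper's exact objectives may differ):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def moo_objectives(mask, X, y):
    """Two minimization objectives for a binary feature-selection mask:
    (cross-validated error rate, fraction of features retained)."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 1.0, 0.0          # empty subset: worst error, zero features
    acc = cross_val_score(GaussianNB(), X[:, idx], y, cv=3).mean()
    return 1.0 - acc, idx.size / X.shape[1]
```

MOEA/D would decompose this bi-objective problem into scalar subproblems and evolve a population of masks toward the Pareto front.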
In this paper, the Hermite interpolation method is used for solving linear and non-linear second-order singular multi-point boundary value problems with a nonlocal condition. The approximate solution is found in the form of a rapidly convergent polynomial. We discuss the behavior of the solution in the neighborhood of the singularity point, where the method appears to perform satisfactorily for singular problems. Examples are given to demonstrate the applicability and efficiency of the method.
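The defining property being exploited is that Hermite interpolation matches both function values and derivatives at the nodes. A minimal illustration of that property, using SciPy's piecewise cubic Hermite interpolant rather than the paper's polynomial BVP solver:

```python
import numpy as np
from scipy.interpolate import CubicHermiteSpline

x = np.array([0.0, 0.5, 1.0])
y = np.sin(np.pi * x)            # function values at the nodes
dy = np.pi * np.cos(np.pi * x)   # first derivatives at the nodes

h = CubicHermiteSpline(x, y, dy)      # interpolant matches both y and dy
print(h(0.25), np.sin(np.pi * 0.25))  # close agreement off the nodes
print(h.derivative()(0.5), np.pi * np.cos(np.pi * 0.5))  # derivative matches at a node
```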
The theory of Multi-Criteria Decision Making (MCDM) was introduced in the second half of the twentieth century and aids the decision maker in resolving problems in which interacting criteria are involved and need to be evaluated. In this paper, we apply MCDM to the problem of choosing the best drug for rheumatoid arthritis disease. Then, we solve the MCDM problem via the λ-Sugeno measure and the Choquet integral to provide realistic values in the process of selecting the most appropriate drug. The approach confirms the proper interpretation of multi-criteria decision making in drug ranking for rheumatoid arthritis.
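For reference, the discrete Choquet integral aggregates criterion scores against a fuzzy measure g defined on coalitions of criteria. The sketch below hard-codes a small illustrative measure; in the paper, g would be a λ-Sugeno measure fitted to the criteria (scores and measure values here are invented for the example):

```python
def choquet(scores, g):
    """Discrete Choquet integral of `scores` (criterion -> value in [0, 1])
    with respect to fuzzy measure `g` (frozenset of criteria -> weight)."""
    order = sorted(scores, key=scores.get)      # criteria by ascending score
    total, prev = 0.0, 0.0
    for k, c in enumerate(order):
        coalition = frozenset(order[k:])        # criteria scoring >= scores[c]
        total += (scores[c] - prev) * g[coalition]
        prev = scores[c]
    return total

scores = {"efficacy": 0.8, "safety": 0.5}
g = {frozenset({"efficacy"}): 0.7,
     frozenset({"safety"}): 0.4,
     frozenset({"efficacy", "safety"}): 1.0}
print(choquet(scores, g))   # 0.5*1.0 + 0.3*0.7 = 0.71
```

Because the weight of the full coalition need not equal the sum of the singleton weights, the integral captures interaction between criteria that a weighted average cannot.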
This paper deals with the principal component analysis (PCA) method for dimensionality reduction, in the case of linear combinations, in digital image processing and analysis. PCA is a statistical technique that compresses a multivariate data set consisting of inter-correlated variables into a data set consisting of uncorrelated linear combinations of those variables, while ensuring the least possible loss of useful information. The method was applied to a group of satellite images of an area in the province of Basra that represents the mouth of the Tigris and Euphrates rivers into the Shatt al-Arab.
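The core computation behind that description is an eigen-decomposition of the covariance matrix of the centered variables; in the satellite-image case, the variables would be the spectral bands. A minimal sketch, not the paper's processing pipeline:

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (samples x variables) onto the top-k
    principal components, i.e. the uncorrelated linear combinations
    capturing the most variance."""
    Xc = X - X.mean(axis=0)                    # center each variable
    cov = np.cov(Xc, rowvar=False)             # covariance of the variables
    vals, vecs = np.linalg.eigh(cov)           # eigh: ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # top-k eigenvectors
    return Xc @ top
```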
In the real world, almost all networks evolve over time. For example, in networks of friendships and acquaintances, people continually create and delete friendship connections over time, thereby adding and removing friends, and some people become part of new social networks or leave their networks, changing the nodes in the network. Recently, tracking communities undergoing topological change has drawn significant attention, and many algorithms have been proposed to model the problem. In general, evolutionary clustering can be defined as clustering data over time while accounting for two concepts: snapshot quality and temporal smoothness. Snapshot quality means that the clusters should be as precise as possible during…
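A common way to formalize this trade-off in the evolutionary clustering literature (one standard formulation, not necessarily the exact one adopted here) is a single cost that blends the two concepts:

```latex
\mathrm{cost}(C_t) \;=\; \alpha\,\mathrm{SQ}(C_t, X_t) \;+\; (1-\alpha)\,\mathrm{TS}(C_t, C_{t-1}),
\qquad \alpha \in [0, 1],
```

where SQ measures how well the clustering $C_t$ fits the current snapshot $X_t$, TS penalizes deviation from the previous clustering $C_{t-1}$, and $\alpha$ tunes accuracy against smoothness.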