Big data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such as decision tree and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide su the optimal split value.
Big data analysis is essential for modern applications in areas such as healthcare, assistive technology, intelligent transportation, environment and climate monitoring. Traditional algorithms in data mining and machine learning do not scale well with data size. Mining and learning from big data need time and memory efficient techniques, albeit the cost of possible loss in accuracy. We have developed a data aggregation structure to summarize data with large number of instances and data generated from multiple data sources. Data are aggregated at multiple resolutions and resolution provides a trade-off between efficiency and accuracy. The structure is built once, updated incrementally, and serves as a common data input for multiple mining an
... Show MoreSupport vector machine (SVM) is a popular supervised learning algorithm based on margin maximization. It has a high training cost and does not scale well to a large number of data points. We propose a multiresolution algorithm MRH-SVM that trains SVM on a hierarchical data aggregation structure, which also serves as a common data input to other learning algorithms. The proposed algorithm learns SVM models using high-level data aggregates and only visits data aggregates at more detailed levels where support vectors reside. In addition to performance improvements, the algorithm has advantages such as the ability to handle data streams and datasets with imbalanced classes. Experimental results show significant performance improvements in compa
... Show MoreIn this paper, integrated quantum neural network (QNN), which is a class of feedforward
neural networks (FFNN’s), is performed through emerging quantum computing (QC) with artificial neural network(ANN) classifier. It is used in data classification technique, and here iris flower data is used as a classification signals. For this purpose independent component analysis (ICA) is used as a feature extraction technique after normalization of these signals, the architecture of (QNN’s) has inherently built in fuzzy, hidden units of these networks (QNN’s) to develop quantized representations of sample information provided by the training data set in various graded levels of certainty. Experimental results presented here show that
... Show MoreData centric techniques, like data aggregation via modified algorithm based on fuzzy clustering algorithm with voronoi diagram which is called modified Voronoi Fuzzy Clustering Algorithm (VFCA) is presented in this paper. In the modified algorithm, the sensed area divided into number of voronoi cells by applying voronoi diagram, these cells are clustered by a fuzzy C-means method (FCM) to reduce the transmission distance. Then an appropriate cluster head (CH) for each cluster is elected. Three parameters are used for this election process, the energy, distance between CH and its neighbor sensors and packet loss values. Furthermore, data aggregation is employed in each CH to reduce the amount of data transmission which le
... Show MoreData generated from modern applications and the internet in healthcare is extensive and rapidly expanding. Therefore, one of the significant success factors for any application is understanding and extracting meaningful information using digital analytics tools. These tools will positively impact the application's performance and handle the challenges that can be faced to create highly consistent, logical, and information-rich summaries. This paper contains three main objectives: First, it provides several analytics methodologies that help to analyze datasets and extract useful information from them as preprocessing steps in any classification model to determine the dataset characteristics. Also, this paper provides a comparative st
... Show MoreDeep learning has recently received a lot of attention as a feasible solution to a variety of artificial intelligence difficulties. Convolutional neural networks (CNNs) outperform other deep learning architectures in the application of object identification and recognition when compared to other machine learning methods. Speech recognition, pattern analysis, and image identification, all benefit from deep neural networks. When performing image operations on noisy images, such as fog removal or low light enhancement, image processing methods such as filtering or image enhancement are required. The study shows the effect of using Multi-scale deep learning Context Aggregation Network CAN on Bilateral Filtering Approximation (BFA) for d
... Show MoreSecure data communication across networks is always threatened with intrusion and abuse. Network Intrusion Detection System (IDS) is a valuable tool for in-depth defense of computer networks. Most research and applications in the field of intrusion detection systems was built based on analysing the several datasets that contain the attacks types using the classification of batch learning machine. The present study presents the intrusion detection system based on Data Stream Classification. Several data stream algorithms were applied on CICIDS2017 datasets which contain several new types of attacks. The results were evaluated to choose the best algorithm that satisfies high accuracy and low computation time.
Registration techniques are still considered challenging tasks to remote sensing users, especially after enormous increase in the volume of remotely sensed data being acquired by an ever-growing number of earth observation sensors. This surge in use mandates the development of accurate and robust registration procedures that can handle these data with varying geometric and radiometric properties. This paper aims to develop the traditional registration scenarios to reduce discrepancies between registered datasets in two dimensions (2D) space for remote sensing images. This is achieved by designing a computer program written in Visual Basic language following two main stages: The first stage is a traditional registration p
... Show MoreRegistration techniques are still considered challenging tasks to remote sensing users, especially after enormous increase in the volume of remotely sensed data being acquired by an ever-growing number of earth observation sensors. This surge in use mandates the development of accurate and robust registration procedures that can handle these data with varying geometric and radiometric properties. This paper aims to develop the traditional registration scenarios to reduce discrepancies between registered datasets in two dimensions (2D) space for remote sensing images. This is achieved by designing a computer program written in Visual Basic language following two main stages: The first stage is a traditional registration process by de
... Show More