Support vector machine (SVM) is a popular supervised learning algorithm based on margin maximization. It has a high training cost and does not scale well to a large number of data points. We propose a multiresolution algorithm MRH-SVM that trains SVM on a hierarchical data aggregation structure, which also serves as a common data input to other learning algorithms. The proposed algorithm learns SVM models using high-level data aggregates and only visits data aggregates at more detailed levels where support vectors reside. In addition to performance improvements, the algorithm has advantages such as the ability to handle data streams and datasets with imbalanced classes. Experimental results show significant performance improvements in comparison with existing SVM algorithms.
The financial markets are one of the sectors whose data is characterized by continuous movement in most of the times and it is constantly changing, so it is difficult to predict its trends , and this leads to the need of methods , means and techniques for making decisions, and that pushes investors and analysts in the financial markets to use various and different methods in order to reach at predicting the movement of the direction of the financial markets. In order to reach the goal of making decisions in different investments, where the algorithm of the support vector machine and the CART regression tree algorithm are used to classify the stock data in order to determine
... Show MoreEach project management system aims to complete the project within its identified objectives: budget, time, and quality. It is achieving the project within the defined deadline that required careful scheduling, that be attained early. Due to the nature of unique repetitive construction projects, time contingency and project uncertainty are necessary for accurate scheduling. It should be integrated and flexible to accommodate the changes without adversely affecting the construction project’s total completion time. Repetitive planning and scheduling methods are more effective and essential. However, they need continuous development because of the evolution of execution methods, essent
Big data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such a
... Show MoreSupport Vector Machines (SVMs) are supervised learning models used to examine data sets in order to classify or predict dependent variables. SVM is typically used for classification by determining the best hyperplane between two classes. However, working with huge datasets can lead to a number of problems, including time-consuming and inefficient solutions. This research updates the SVM by employing a stochastic gradient descent method. The new approach, the extended stochastic gradient descent SVM (ESGD-SVM), was tested on two simulation datasets. The proposed method was compared with other classification approaches such as logistic regression, naive model, K Nearest Neighbors and Random Forest. The results show that the ESGD-SVM has a
... Show MoreSupport vector machines (SVMs) are supervised learning models that analyze data for classification or regression. For classification, SVM is widely used by selecting an optimal hyperplane that separates two classes. SVM has very good accuracy and extremally robust comparing with some other classification methods such as logistics linear regression, random forest, k-nearest neighbor and naïve model. However, working with large datasets can cause many problems such as time-consuming and inefficient results. In this paper, the SVM has been modified by using a stochastic Gradient descent process. The modified method, stochastic gradient descent SVM (SGD-SVM), checked by using two simulation datasets. Since the classification of different ca
... Show MoreData generated from modern applications and the internet in healthcare is extensive and rapidly expanding. Therefore, one of the significant success factors for any application is understanding and extracting meaningful information using digital analytics tools. These tools will positively impact the application's performance and handle the challenges that can be faced to create highly consistent, logical, and information-rich summaries. This paper contains three main objectives: First, it provides several analytics methodologies that help to analyze datasets and extract useful information from them as preprocessing steps in any classification model to determine the dataset characteristics. Also, this paper provides a comparative st
... Show More