Big data analysis has important applications in many areas such as sensor networks and connected healthcare. High volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provides a manageable data structure to hold a scalable summarization of data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such as decision tree and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide su the optimal split value.
The support vector machine, also known as SVM, is a type of supervised learning model that can be used for classification or regression depending on the datasets. SVM is used to classify data points by determining the best hyperplane between two or more groups. Working with enormous datasets, on the other hand, might result in a variety of issues, including inefficient accuracy and time-consuming. SVM was updated in this research by applying some non-linear kernel transformations, which are: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized in an algorithm using kernel tricks. The proposed method was examined using three simulation datasets with different sample
... Show MoreThis study investigates the impact of spatial resolution enhancement on supervised classification accuracy using Landsat 9 satellite imagery, achieved through pan-sharpening techniques leveraging Sentinel-2 data. Various methods were employed to synthesize a panchromatic (PAN) band from Sentinel-2 data, including dimension reduction algorithms and weighted averages based on correlation coefficients and standard deviation. Three pan-sharpening algorithms (Gram-Schmidt, Principal Components Analysis, Nearest Neighbour Diffusion) were employed, and their efficacy was assessed using seven fidelity criteria. Classification tasks were performed utilizing Support Vector Machine and Maximum Likelihood algorithms. Results reveal that specifi
... Show MoreAbstract
The objective of image fusion is to merge multiple sources of images together in such a way that the final representation contains higher amount of useful information than any input one.. In this paper, a weighted average fusion method is proposed. It depends on using weights that are extracted from source images using counterlet transform. The extraction method is done by making the approximated transformed coefficients equal to zero, then taking the inverse counterlet transform to get the details of the images to be fused. The performance of the proposed algorithm has been verified on several grey scale and color test images, and compared with some present methods.
... Show MoreSolar photovoltaic (PV) system has emerged as one of the most promising technology to generate clean energy. In this work, the performance of monocrystalline silicon photovoltaic module is studied through observing the effect of necessary parameters: solar irradiation and ambient temperature. The single diode model with series resistors is selected to find the characterization of current-voltage (I-V) and power-voltage (P-V) curves by determining the values of five parameters ( ). This model shows a high accuracy in modeling the solar PV module under various weather conditions. The modeling is simulated via using MATLAB/Simulink software. The performance of the selected solar PV module is tested experimentally for differ
... Show MoreFeature selection (FS) constitutes a series of processes used to decide which relevant features/attributes to include and which irrelevant features to exclude for predictive modeling. It is a crucial task that aids machine learning classifiers in reducing error rates, computation time, overfitting, and improving classification accuracy. It has demonstrated its efficacy in myriads of domains, ranging from its use for text classification (TC), text mining, and image recognition. While there are many traditional FS methods, recent research efforts have been devoted to applying metaheuristic algorithms as FS techniques for the TC task. However, there are few literature reviews concerning TC. Therefore, a comprehensive overview was systematicall
... Show MoreThis research includes structure interpretation of the Yamama Formation (Lower Cretaceous) and the Naokelekan Formation (Jurassic) using 2D seismic reflection data of the Tuba oil field region, Basrah, southern Iraq. The two reflectors (Yamama and Naokelekan) were defined and picked as peak and tough depending on the 2D seismic reflection interpretation process, based on the synthetic seismogram and well log data. In order to obtain structural settings, these horizons were followed over all the regions. Two-way travel-time maps, depth maps, and velocity maps have been produced for top Yamama and top Naokelekan formations. The study concluded that certain longitudinal enclosures reflect anticlines in the east and west of the study ar
... Show MoreThere is a great operational risk to control the day-to-day management in water treatment plants, so water companies are looking for solutions to predict how the treatment processes may be improved due to the increased pressure to remain competitive. This study focused on the mathematical modeling of water treatment processes with the primary motivation to provide tools that can be used to predict the performance of the treatment to enable better control of uncertainty and risk. This research included choosing the most important variables affecting quality standards using the correlation test. According to this test, it was found that the important parameters of raw water: Total Hardn