Big data analysis has important applications in many areas, such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain the summarization of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms, such as decision trees and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide the optimal split value.
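For orientation, below is a minimal sketch of classic entropy-based discretization, which finds the split value minimizing the weighted entropy of the two resulting partitions. It illustrates the technique named in the abstract, not the authors' multi-resolution summarization structure; all names and the sample data are illustrative.

import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_entropy_split(values, labels):
    """Return the split value minimizing the weighted entropy of the two partitions."""
    pairs = sorted(zip(values, labels))
    best_split, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_score = score
            best_split = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_split

print(best_entropy_split([1.0, 2.0, 3.0, 8.0, 9.0], ["a", "a", "a", "b", "b"]))  # 5.5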
One wide-ranging category of open source data is that offered by geospatial information web sites. Despite the advantages of such open source data, including ease of access and freedom from cost, its quality is a potential issue. This article tests the horizontal positional accuracy and possible integration of four web-derived geospatial datasets: OpenStreetMap (OSM), Google Maps, Google Earth, and Wikimapia. The evaluation was achieved by comparing the tested information with reference field survey data for fifty road intersections in Baghdad, Iraq. The results indicate that free geospatial data can be used to enhance authoritative maps, especially small-scale maps.
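A common way to quantify horizontal positional accuracy against surveyed reference points is the root-mean-square error of the planar offsets. The sketch below shows that computation under assumed metric coordinates; the function name and sample values are hypothetical, not from the study.

import math

def horizontal_rmse(web_points, reference_points):
    """RMSE of horizontal offsets between web-derived and field-surveyed
    coordinates, both given as (easting, northing) in metres."""
    assert len(web_points) == len(reference_points)
    sq = [(wx - rx) ** 2 + (wy - ry) ** 2
          for (wx, wy), (rx, ry) in zip(web_points, reference_points)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical sample: three intersections, metric map coordinates
web = [(4425.1, 3691.8), (4510.4, 3720.0), (4603.9, 3755.2)]
ref = [(4423.8, 3690.5), (4512.0, 3719.1), (4602.7, 3756.0)]
print(f"Horizontal RMSE: {horizontal_rmse(web, ref):.2f} m")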
Among metaheuristic algorithms, population-based algorithms are explorative search algorithms, superior to local search algorithms in exploring the search space for globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search-space neighborhood toward more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA suffers from premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On the …
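For context, the standard FA movement rule (Yang's formulation) attracts a firefly toward any brighter one, with attractiveness decaying with distance. The sketch below shows one such step; parameter values are illustrative defaults, and this is the generic rule rather than the paper's clustering variant.

import math
import random

def firefly_step(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2):
    """Move firefly i toward a brighter firefly j (standard FA update):
    x_i <- x_i + beta0 * exp(-gamma * r^2) * (x_j - x_i) + alpha * eps."""
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))   # squared distance r^2
    beta = beta0 * math.exp(-gamma * r2)               # attractiveness decays with distance
    return [a + beta * (b - a) + alpha * (random.random() - 0.5)
            for a, b in zip(x_i, x_j)]

# One illustrative step in 2-D
print(firefly_step([0.0, 0.0], [1.0, 1.0]))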
This research studies fuzzy sets, one of the most modern concepts applied in various practical and theoretical fields of life. It addresses the fuzzy random variable, whose values are not real numbers but fuzzy numbers, since it expresses vague or uncertain phenomena whose measurements are not definite. Fuzzy data were presented for a two-way analysis of variance of fuzzy random variables; this method depends on a number of assumptions, and when these assumptions are not satisfied, its use is precluded.
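A fuzzy number is commonly represented in triangular form (l, m, u), with membership peaking at m. The sketch below is a minimal illustration of that standard representation, assumed for exposition only; the abstract does not specify which form the study uses.

from dataclasses import dataclass

@dataclass
class TriangularFuzzyNumber:
    """A fuzzy number (l, m, u): membership rises linearly from l to a peak
    of 1 at m, then falls linearly to 0 at u."""
    l: float
    m: float
    u: float

    def membership(self, x: float) -> float:
        if self.l < x <= self.m:
            return (x - self.l) / (self.m - self.l)
        if self.m < x < self.u:
            return (self.u - x) / (self.u - self.m)
        return 1.0 if x == self.m else 0.0

# "Approximately 5" as a fuzzy observation
approx5 = TriangularFuzzyNumber(4.0, 5.0, 6.0)
print(approx5.membership(4.5))  # 0.5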
The research compared two methods for estimating the four parameters of the compound exponential Weibull-Poisson distribution: the maximum likelihood method and the Downhill Simplex algorithm. Two data cases were considered: the first assumed the original (uncontaminated) data, while the second assumed data contamination. Simulation experiments were conducted for different sample sizes, initial parameter values, and levels of contamination. The Downhill Simplex algorithm was found to be the best method for estimating the parameters, the probability function, and the reliability function of the compound distribution for both natural and contaminated data.
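To show the mechanics of derivative-free likelihood maximization, the sketch below fits a distribution by Downhill Simplex (Nelder-Mead, as implemented in scipy.optimize). Since the four-parameter compound pdf is not reproduced here, a standard two-parameter Weibull stands in; sample sizes and starting values are illustrative.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def neg_log_likelihood(params, data):
    """Negative log-likelihood; a two-parameter Weibull stands in for the
    four-parameter compound pdf, which is not reproduced here."""
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf  # keep the simplex inside the valid region
    return -np.sum(weibull_min.logpdf(data, c=shape, scale=scale))

rng = np.random.default_rng(0)
data = weibull_min.rvs(c=1.5, scale=2.0, size=500, random_state=rng)

# Downhill Simplex (Nelder-Mead) needs no derivatives of the likelihood
result = minimize(neg_log_likelihood, x0=[1.0, 1.0], args=(data,),
                  method="Nelder-Mead")
print(result.x)  # estimates close to (1.5, 2.0)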
Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have only small or inadequate datasets for training DL frameworks. Manual labeling is usually needed to provide labeled data, which typically involves human annotators with broad background knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework must usually be fed a significant amount of labeled data to learn representations automatically; a larger amount of data generally yields a better DL model, although performance is also application dependent. This issue is the main barrier for …
Frictional heat is generated when the clutch starts to engage. As a result, the surface temperature increases rapidly due to the difference in speed between the driving and driven parts. The influence of the thickness of the frictional facing on the distribution of contact pressure in multi-disc clutches has been investigated using a numerical approach (the finite element method). The contact problem was analyzed for a multiple-disc dry clutch (piston, clutch discs, separators, and pressure plate). The results present the distribution of the contact pressure on all the friction-disc surfaces in the friction clutch system. Axisymmetric finite element models have been developed to …
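For reference, clutch thermal analyses commonly relate the generated heat flux to the contact pressure and sliding speed via the standard relation below; the symbols are generic assumptions and not necessarily this study's exact formulation.

% Standard heat-flux relation for a slipping friction pair
\[
  q(r,t) \;=\; \mu \, p(r,t) \, \omega_{\mathrm{rel}}(t) \, r ,
  \qquad
  \omega_{\mathrm{rel}}(t) = \omega_{\mathrm{driving}}(t) - \omega_{\mathrm{driven}}(t)
\]
% q         : frictional heat flux at radius r (W/m^2)
% mu        : coefficient of friction
% p         : contact pressure (the quantity mapped by the FE contact analysis)
% omega_rel : relative sliding speed between driving and driven parts (rad/s)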
Artificial roughness on the absorber plate of a solar air heater (SAH) is a popular technique for increasing its effective efficiency. This study investigated the effect of the geometrical parameters of discrete multi-arc ribs (DMAR), installed below the SAH absorber plate, on the effective efficiency. The effects of the major roughness factors, namely the number of gaps (Ng = 1-4), rib pitch (p/e = 4-16), rib height (e/D = 0.018-0.045), gap width (wg/e = 0.5-2), angle of attack (α = 30°-75°), and Reynolds number (Re = 2000-20000), on the performance of the SAH are studied. The performance of the SAH is evaluated using a top-down iterative technique. The results show that as Re rises, the effective efficiency of the SAH with DMAR first ascends to a specified value of …
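Effective efficiency in roughened-SAH studies is commonly defined by discounting the pumping power spent against rib-induced friction, as below (after Cortes and Piacentini); the exact form used in this work may differ.

% Effective efficiency as commonly defined in roughened-SAH studies
\[
  \eta_{\mathrm{eff}} \;=\; \frac{Q_u \;-\; P_m / C}{I \, A_p}
\]
% Q_u : useful heat gain of the air (W)
% P_m : mechanical pumping power spent overcoming friction losses (W)
% C   : conversion factor from mechanical to equivalent thermal energy (~0.2)
% I   : solar irradiance on the plate (W/m^2);  A_p : absorber-plate area (m^2)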