Big data analysis is essential for modern applications in areas such as healthcare, assistive technology, intelligent transportation, and environment and climate monitoring. Traditional data mining and machine learning algorithms do not scale well with data size. Mining and learning from big data need time- and memory-efficient techniques, albeit at the cost of a possible loss in accuracy. We have developed a data aggregation structure to summarize data with a large number of instances and data generated from multiple sources. Data are aggregated at multiple resolutions, and the resolution provides a trade-off between efficiency and accuracy. The structure is built once, updated incrementally, and serves as a common data input for multiple mining and learning algorithms. Data mining algorithms are modified to accept the aggregated data as input. Hierarchical data aggregation serves as a paradigm under which novel …
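The idea of aggregating data at multiple resolutions can be illustrated with a minimal Python sketch. It is not the authors' structure: the function names (`aggregate`, `build_pyramid`, `mean_from_summaries`), the choice of count, sum, and sum of squares as summaries, and the fixed block sizes are all assumptions made for illustration.

```python
import numpy as np

def aggregate(values, block):
    """Summarize consecutive blocks of `block` instances.

    Each summary is (count, sum, sum of squares); this choice of
    statistics is an assumption, not taken from the abstract.
    """
    values = np.asarray(values, dtype=float)
    n_blocks = -(-len(values) // block)  # ceiling division
    summaries = []
    for i in range(n_blocks):
        chunk = values[i * block:(i + 1) * block]
        summaries.append((len(chunk), chunk.sum(), (chunk ** 2).sum()))
    return summaries

def build_pyramid(values, block=2, levels=3):
    """Return summaries from the finest to the coarsest resolution."""
    return [aggregate(values, block ** (k + 1)) for k in range(levels)]

def mean_from_summaries(summaries):
    """Recover the global mean from the summaries of any one level."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    return total / n

values = [1, 2, 3, 4, 5, 6, 7, 8]
pyramid = build_pyramid(values, block=2, levels=3)
```

In this sketch a coarser level stores fewer summaries and is cheaper to process, but it discards within-block detail; statistics such as the global mean remain recoverable from any level.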
In many oil-recovery systems, relative permeabilities (kr) are essential flow factors that affect fluid dispersion and output from petroleum resources. Traditionally, obtaining these crucial reservoir properties requires taking rock samples from the reservoir and performing suitable laboratory studies. Although kr is a function of fluid saturation, it is now well established that pore shape and distribution, absolute permeability, wettability, interfacial tension (IFT), and saturation history all influence kr values. These rock/fluid characteristics vary greatly from one reservoir region to the next, and it would be impossible to make kr measurements in all of them. The unsteady-state approach was used to calculate the relat
The development of information systems (IS) in recent years has contributed to various methods of gathering information to evaluate IS performance. The most common approach used to collect information is the survey. This method, however, suffers from one major drawback: decision makers spend considerable time transferring data from survey sheets to analytical programs. This paper therefore proposes a method called 'survey algorithm based on R programming language', or SABR, for transforming data from survey sheets within the R environment by treating the arrangement of data as a relational format. R and the relational data format provide an excellent opportunity to manage and analyse the accumulated data. Moreover, a survey syste
The development of Web 2.0 has improved people's ability to share their opinions. These opinions serve as an important piece of knowledge for other reviewers. To figure out what the opinions are about, an automatic analysis system is needed. Aspect-based sentiment analysis is an important research topic concerned with extracting reviewers' opinions about a particular attribute, namely the opinion target (aspect). In aspect-based tasks, identifying implicit aspects, that is, aspects implied but not stated in a review, is the most challenging task. This paper strives to identify implicit aspects using a hierarchical algorithm combined with common-sense knowledge by means of dimensionality reduction.
This research studied a simple agricultural experiment in terms of the effect of uncontrolled noise arising from several causes, including the effect of environmental conditions on the observations of agricultural experiments. The discrete wavelet transform was used, specifically the Coiflets transform of wavelength 1 to 2 and the Daubechies transform of wavelength 2 to 3, at two levels of transform, (J-4) and (J-5), applying the hard, soft, and non-negative threshold rules. The wavelet transformation methods were compared using real data from an experiment with 26 observations. The application was carried out through a program written in MATLAB. The researcher concluded that
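The three threshold rules named above can be sketched in Python as follows. This is an illustration, not the study's MATLAB program: the function names are hypothetical, the threshold `t` is taken as given, and the "non-negative" rule is assumed here to mean the non-negative garrote rule.

```python
import numpy as np

def hard_threshold(w, t):
    """Keep coefficients whose magnitude exceeds t; zero the rest."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) > t, w, 0.0)

def soft_threshold(w, t):
    """Zero small coefficients and shrink the others towards zero by t."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def garrote_threshold(w, t):
    """Non-negative garrote rule (assumed reading of 'non-negative'):
    w - t**2 / w where |w| > t, and zero otherwise."""
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    keep = np.abs(w) > t
    out[keep] = w[keep] - t ** 2 / w[keep]
    return out

# Hypothetical detail coefficients, not data from the experiment.
coeffs = np.array([-3.0, -0.5, 0.2, 2.0])
hard = hard_threshold(coeffs, 1.0)
soft = soft_threshold(coeffs, 1.0)
garrote = garrote_threshold(coeffs, 1.0)
```

In a denoising workflow these rules would be applied to the detail coefficients of the wavelet decomposition before the signal is reconstructed.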
Mixed-effects conditional logistic regression is evidently more effective in the study of qualitative differences in longitudinal pollution data as well as their implications for heterogeneous subgroups. This study seeks to show that conditional logistic regression is a robust evaluation method for environmental studies, through the analysis of environmental pollution as a function of oil production and environmental factors. Consequently, it has been established theoretically that the primary objective of model selection in this research is to identify the candidate model that is optimal for the conditional design. The candidate model should achieve generalizability, goodness-of-fit, parsimony and establish equilibrium between bias and variab
The smart city concept has attracted considerable research attention in recent years within diverse application domains, such as crime suspect identification, border security, transportation, and aerospace. Specific focus has been on increased automation using data-driven approaches, while leveraging remote sensing and real-time streaming of heterogeneous data from various sources, including unmanned aerial vehicles, surveillance cameras, and low-earth-orbit satellites. One of the core challenges in exploiting such high temporal data streams, specifically videos, is the trade-off between the quality of video streaming and limited transmission bandwidth. An optimal compromise is needed between video quality and subsequently, rec
The use of remote sensing techniques for managing and monitoring environmental areas is increasing owing to improvements in the sensors carried by Earth observation satellites. The resolution merge process combines a high-resolution single-band image with a low-resolution multiband image to produce one image that is high in both spatial and spectral resolution. In this work, different merging methods were tested to evaluate their capability to enhance the extraction of different environmental areas: principal component analysis (PCA), Brovey, modified intensity-hue-saturation (IHS), and high-pass filter methods were tested and subjected to visual and statistical comparison. Both visu
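Of the merging methods listed, the Brovey transform is the simplest to state: each multispectral band is scaled by the ratio of the panchromatic band to the sum of the multispectral bands. The Python sketch below illustrates that standard formula only. It assumes the multispectral image has already been resampled to the panchromatic grid; the function name `brovey_merge` and the small `eps` guard against division by zero are assumptions, not details from the study.

```python
import numpy as np

def brovey_merge(ms, pan, eps=1e-12):
    """Brovey merge of a multispectral stack with a panchromatic band.

    ms  : array of shape (bands, H, W), already resampled to the pan grid
    pan : array of shape (H, W)
    eps : small constant (an assumption) to avoid division by zero
    """
    ms = np.asarray(ms, dtype=float)
    pan = np.asarray(pan, dtype=float)
    total = ms.sum(axis=0)
    return ms * pan / (total + eps)

# Synthetic 3-band, 2x2 example; not data from the study.
ms = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 3.0)])
pan = np.full((2, 2), 12.0)
fused = brovey_merge(ms, pan)
```

By construction the fused bands keep the band ratios of the multispectral input while their sum follows the panchromatic intensity.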
The intensity distribution of comet ISON C/2013 is studied by taking its histogram. This distribution reveals four distinct regions related to the background, tail, coma, and nucleus. One-dimensional temperature distribution fitting is achieved using two mathematical equations related to the coordinates of the center of the comet. The quiver plot of the gradient of the comet image shows very clearly that the arrows point towards the maximum intensity of the comet.
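The behaviour of the gradient arrows can be checked on a synthetic image. The Python sketch below uses a Gaussian blob as a stand-in for the comet, which is purely an assumption for illustration; the function names and the blob parameters are hypothetical and no comet data are involved.

```python
import numpy as np

def synthetic_blob(size=21, center=10, width=5.0):
    """Gaussian intensity blob used as a stand-in image (an assumption)."""
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - center) ** 2 + (y - center) ** 2
    return np.exp(-r2 / (2.0 * width ** 2))

def intensity_gradient(image):
    """Return (gx, gy): intensity gradient along columns (x) and rows (y)."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    return gx, gy

image = synthetic_blob()
gx, gy = intensity_gradient(image)
counts, edges = np.histogram(image, bins=16)  # intensity histogram
```

The arrays `gx` and `gy` could be passed to `matplotlib.pyplot.quiver` to draw the arrows. On this synthetic image the gradient on every side of the peak points towards it, which matches the behaviour described for the comet.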