Big data of different types, such as texts and images, are rapidly generated from the internet and other applications. Dealing with this data using traditional methods is not practical since it is available in various sizes, types, and processing speed requirements. Therefore, data analytics has become an important tool because only meaningful information is analyzed and extracted, which makes it essential for big data applications to analyze and extract useful information. This paper presents several innovative methods that use data analytics techniques to improve the analysis process and data management. Furthermore, this paper discusses how the revolution of data analytics based on artificial intelligence algorithms might provide improvements for many applications. In addition, critical challenges and research issues were provided based on published paper limitations to help researchers distinguish between various analytics techniques to develop highly consistent, logical, and information-rich analyses based on valuable features. Furthermore, the findings of this paper may be used to identify the best methods in each sector used in these publications, assist future researchers in their studies for more systematic and comprehensive analysis and identify areas for developing a unique or hybrid technique for data analysis.
An analytical approach based on field data was used to determine the strength capacity of large diameter bored type piles. Also the deformations and settlements were evaluated for both vertical and lateral loadings. The analytical predictions are compared to field data obtained from a proto-type test pile used at Tharthar –Tigris canal Bridge. They were found to be with acceptable agreement of 12% deviation.
Following ASTM standards D1143M-07e1,2010, a test schedule of five loading cycles were proposed for vertical loads and series of cyclic loads to simulate horizontal loading .The load test results and analytical data of 1.95
... Show MorePurpose – The Cloud computing (CC) and its services have enabled the information centers of organizations to adapt their informatic and technological infrastructure and making it more appropriate to develop flexible information systems in the light of responding to the informational and knowledge needs of their users. In this context, cloud-data governance has become more complex and dynamic, requiring an in-depth understanding of the data management strategy at these centers in terms of: organizational structure and regulations, people, technology, process, roles and responsibilities. Therefore, our paper discusses these dimensions as challenges that facing information centers in according to their data governance and the impa
... Show MoreA two time step stochastic multi-variables multi-sites hydrological data forecasting model was developed and verified using a case study. The philosophy of this model is to use the cross-variables correlations, cross-sites correlations and the two steps time lag correlations simultaneously, for estimating the parameters of the model which then are modified using the mutation process of the genetic algorithm optimization model. The objective function that to be minimized is the Akiake test value. The case study is of four variables and three sites. The variables are the monthly air temperature, humidity, precipitation, and evaporation; the sites are Sulaimania, Chwarta, and Penjwin, which are located north Iraq. The model performance was
... Show MoreLongitudinal data is becoming increasingly common, especially in the medical and economic fields, and various methods have been analyzed and developed to analyze this type of data.
In this research, the focus was on compiling and analyzing this data, as cluster analysis plays an important role in identifying and grouping co-expressed subfiles over time and employing them on the nonparametric smoothing cubic B-spline model, which is characterized by providing continuous first and second derivatives, resulting in a smoother curve with fewer abrupt changes in slope. It is also more flexible and can pick up on more complex patterns and fluctuations in the data.
The longitudinal balanced data profile was compiled into subgroup
... Show MoreMalicious software (malware) performs a malicious function that compromising a computer system’s security. Many methods have been developed to improve the security of the computer system resources, among them the use of firewall, encryption, and Intrusion Detection System (IDS). IDS can detect newly unrecognized attack attempt and raising an early alarm to inform the system about this suspicious intrusion attempt. This paper proposed a hybrid IDS for detection intrusion, especially malware, with considering network packet and host features. The hybrid IDS designed using Data Mining (DM) classification methods that for its ability to detect new, previously unseen intrusions accurately and automatically. It uses both anomaly and misuse dete
... Show MoreIt has increasingly been recognised that the future developments in geospatial data handling will centre on geospatial data on the web: Volunteered Geographic Information (VGI). The evaluation of VGI data quality, including positional and shape similarity, has become a recurrent subject in the scientific literature in the last ten years. The OpenStreetMap (OSM) project is the most popular one of the leading platforms of VGI datasets. It is an online geospatial database to produce and supply free editable geospatial datasets for a worldwide. The goal of this paper is to present a comprehensive overview of the quality assurance of OSM data. In addition, the credibility of open source geospatial data is discussed, highlight
... Show MoreGround-based active optical sensors (GBAOS) have been successfully used in agriculture to predict crop yield potential (YP) early in the season and to improvise N rates for optimal crop yield. However, the models were found weak or inconsistent due to environmental variation especially rainfall. The objectives of the study were to evaluate if GBAOS could predict YP across multiple locations, soil types, cultivation systems, and rainfall differences. This study was carried from 2011 to 2013 on corn (Zea mays L.) in North Dakota, and in 2017 in potatoes in Maine. Six N rates were used on 50 sites in North Dakota and 12 N rates on two sites, one dryland and one irrigated, in Maine. Two active GBAOS used for this study were GreenSeeker and Holl
... Show MoreSpatial data observed on a group of areal units is common in scientific applications. The usual hierarchical approach for modeling this kind of dataset is to introduce a spatial random effect with an autoregressive prior. However, the usual Markov chain Monte Carlo scheme for this hierarchical framework requires the spatial effects to be sampled from their full conditional posteriors one-by-one resulting in poor mixing. More importantly, it makes the model computationally inefficient for datasets with large number of units. In this article, we propose a Bayesian approach that uses the spectral structure of the adjacency to construct a low-rank expansion for modeling spatial dependence. We propose a pair of computationally efficient estimati
... Show MoreThe issue of penalized regression model has received considerable critical attention to variable selection. It plays an essential role in dealing with high dimensional data. Arctangent denoted by the Atan penalty has been used in both estimation and variable selection as an efficient method recently. However, the Atan penalty is very sensitive to outliers in response to variables or heavy-tailed error distribution. While the least absolute deviation is a good method to get robustness in regression estimation. The specific objective of this research is to propose a robust Atan estimator from combining these two ideas at once. Simulation experiments and real data applications show that the p
... Show More