Big data of different types, such as text and images, is generated rapidly by the internet and other applications. Handling this data with traditional methods is impractical because it varies widely in size, type, and processing-speed requirements. Data analytics has therefore become an essential tool for big data applications, since it extracts and analyzes only the meaningful information. This paper presents several innovative methods that use data analytics techniques to improve the analysis process and data management. It also discusses how the shift toward data analytics based on artificial intelligence algorithms may improve many applications. In addition, critical challenges and research issues are identified from the limitations of the published papers, to help researchers distinguish among analytics techniques and develop highly consistent, logical, and information-rich analyses based on valuable features. The findings of this paper may also be used to identify the best methods in each sector covered by these publications, assist future researchers in carrying out more systematic and comprehensive analyses, and identify areas for developing a unique or hybrid data-analysis technique.
Big data analysis has important applications in many areas, such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarizations of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms.
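As a concrete illustration, the sketch below implements entropy-based discretization of a single numeric attribute: it scans the candidate cut points and keeps the one that minimizes the class-weighted entropy of the two resulting intervals. This is a minimal version over raw values, assuming NumPy; the paper applies the idea to its multi-resolution summarization structure, which is not reproduced here.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_entropy_split(values, labels):
    """Return the cut point on a numeric attribute that minimizes the
    class-weighted entropy of the two intervals it produces."""
    order = np.argsort(values)
    values, labels = values[order], labels[order]
    best_cut, best_h = None, np.inf
    for i in range(1, len(values)):
        if values[i] == values[i - 1]:
            continue  # cut points only fall between distinct values
        cut = (values[i] + values[i - 1]) / 2
        left, right = labels[:i], labels[i:]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if h < best_h:
            best_cut, best_h = cut, h
    return best_cut, best_h
```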
In this study, we compare the LASSO and SCAD methods, two penalty methods for handling partial quantile regression models. The Nadaraya-Watson kernel was used to estimate the non-parametric part, and the rule-of-thumb method was used to estimate the smoothing bandwidth (h). The penalty methods proved efficient in estimating the regression coefficients, but according to the mean squared error (MSE) criterion, the SCAD method was the best after the missing data were estimated using the mean-imputation method.
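For the non-parametric part, a Nadaraya-Watson estimator with a rule-of-thumb bandwidth can be sketched as follows. This minimal illustration assumes a Gaussian kernel and Silverman's rule h = 1.06·σ·n^(−1/5); the exact kernel and bandwidth formula used in the study may differ.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval):
    """Nadaraya-Watson kernel regression with a Gaussian kernel and a
    rule-of-thumb (Silverman) bandwidth h = 1.06 * sigma * n**(-1/5)."""
    n = len(x_train)
    h = 1.06 * np.std(x_train) * n ** (-1 / 5)
    u = (x_eval[:, None] - x_train[None, :]) / h   # pairwise scaled distances
    w = np.exp(-0.5 * u ** 2)                      # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)           # weighted local average
```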
Today, cloud computing plays a very prominent role in our day-to-day lives. The cloud computing paradigm makes it possible to provide resources on demand, and it has changed the way organizations manage resources thanks to its robustness, low cost, and pervasive nature. Data security is usually realized using methods such as encryption. However, data privacy is another important challenge that should be considered when transporting, storing, and analyzing data in the public cloud. In this paper, a new method is proposed to track malicious users who use their private keys to decrypt data in a system and share it with others, causing system information leakage. Security policies are also considered in the proposed method.
Reliable data transfer and energy efficiency are essential considerations for network performance in resource-constrained underwater environments. One efficient approach for data routing in underwater wireless sensor networks (UWSNs) is clustering, in which data packets are transferred from sensor nodes to a cluster head (CH). The data packets are then forwarded to a sink node in a single-hop or multi-hop manner, which can deplete the energy of the CH faster than that of other nodes. While several mechanisms have been proposed for cluster formation and CH selection to ensure efficient delivery of data packets, less attention has been given to massive data collection.
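To make the clustering idea concrete, here is a toy sketch of cluster formation and CH selection: nodes are grouped by position, and each cluster elects as head the member with the best trade-off between residual energy and distance to the sink. The scoring rule is illustrative only and is not the mechanism proposed in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_cluster_heads(positions, energy, n_clusters, sink):
    """Group nodes by position, then pick per-cluster heads that
    balance residual energy against distance to the sink node."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(positions)
    heads = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        dist = np.linalg.norm(positions[members] - sink, axis=1)
        score = energy[members] / (1.0 + dist)  # favour energy-rich, sink-near nodes
        heads.append(members[np.argmax(score)])
    return labels, heads
```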
Modern civilization increasingly relies on sustainable and eco-friendly data centers as the core hubs of intelligent computing. However, these data centers, while vital, face heightened vulnerability to hacking because they serve as convergence points for numerous network connection nodes. Recognizing and addressing this vulnerability, particularly within green data centers, is a pressing concern. This paper proposes a novel approach to mitigate this threat by leveraging swarm intelligence techniques to detect prospective and hidden compromised devices within the data center environment. The core objective is to ensure sustainable intelligent computing through a colony strategy. The research primarily focuses on the colony-based detection of compromised devices.
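As a rough sketch of what a colony-style detector could look like, the toy code below lets "ants" repeatedly visit devices in proportion to accumulated pheromone and reinforce pheromone on devices with high anomaly scores, so persistently suspicious devices stand out. This is only an illustrative ant-colony-flavored scheme over assumed inputs, not the specific swarm algorithm the paper proposes.

```python
import numpy as np

def colony_flag_suspects(anomaly, n_ants=50, n_iters=100,
                         evaporation=0.1, threshold=2.0, seed=0):
    """Flag devices whose pheromone level stays high after repeated
    ant visits biased toward previously reinforced (anomalous) devices."""
    rng = np.random.default_rng(seed)
    pheromone = np.ones(len(anomaly))
    for _ in range(n_iters):
        visits = rng.choice(len(anomaly), size=n_ants,
                            p=pheromone / pheromone.sum())
        pheromone *= (1 - evaporation)        # pheromone evaporates each round
        for v in visits:
            pheromone[v] += anomaly[v]        # reinforce anomalous devices
    return np.where(pheromone > threshold)[0]
```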
In data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two well-known algorithms used in data mining classification are the Backpropagation Neural Network (BNN) and Naïve Bayes (NB). This paper investigates the performance of these two classification methods on the Car Evaluation dataset. A model was built for each algorithm and the results were compared. Our experimental results indicated that the BNN classifier yields higher accuracy than the NB classifier but is less efficient, because it is time-consuming to train and difficult to analyze due to its black-box nature.
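A minimal reproduction of this comparison with scikit-learn might look like the following; it assumes the OpenML copy of the Car Evaluation dataset (named "car"), and the network size and train/test split are illustrative choices, not the paper's settings.

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder
from sklearn.naive_bayes import CategoricalNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Car Evaluation has six categorical attributes; encode them as integers.
data = fetch_openml("car", version=1, as_frame=True)  # assumed OpenML name
X = OrdinalEncoder().fit_transform(data.data)
y = LabelEncoder().fit_transform(data.target)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

nb = CategoricalNB().fit(X_tr, y_tr)                  # Naive Bayes
bnn = MLPClassifier(hidden_layer_sizes=(20,),
                    max_iter=2000).fit(X_tr, y_tr)    # backpropagation network
print("NB accuracy :", accuracy_score(y_te, nb.predict(X_te)))
print("BNN accuracy:", accuracy_score(y_te, bnn.predict(X_te)))
```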
Generally, direct measurement of the soil compression index (Cc) is expensive and time-consuming. To save time and effort, indirect methods of obtaining Cc can be an inexpensive option. These indirect methods are usually based on correlations with descriptive variables that are easier to measure, such as liquid limit, soil density, and natural water content. This study used the ANFIS and regression methods to obtain Cc indirectly. To achieve this aim, 177 undisturbed samples were collected from cohesive soil in the Sulaymaniyah Governorate in Iraq. The results indicated that the ANFIS models outperformed the regression method in estimating Cc, with R² values of 0.66 and 0.48 for the ANFIS and regression models, respectively.
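The regression side of such a comparison can be sketched as below: fit a linear model of Cc on the descriptive variables and report R². The data here are synthetic (loosely following the classic Terzaghi-Peck trend Cc ≈ 0.009(LL − 10)), since the study's 177 samples are not available.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
LL = rng.uniform(25, 70, 177)      # liquid limit (%)
wn = rng.uniform(15, 45, 177)      # natural water content (%)
rho = rng.uniform(1.6, 2.1, 177)   # bulk density (g/cm^3)
# Synthetic Cc following a Terzaghi-Peck-like trend plus noise.
Cc = 0.009 * (LL - 10) + 0.002 * wn - 0.05 * rho + rng.normal(0, 0.02, 177)

X = np.column_stack([LL, wn, rho])
model = LinearRegression().fit(X, Cc)
print("R^2 =", r2_score(Cc, model.predict(X)))
```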
This paper compares the direct and indirect georeferencing techniques in photogrammetry based on a simulation model. A flight plan is designed that consists of three strips with nine overlapping images per strip, captured by a Canon 500D digital camera with a resolution of 15 megapixels.
The triangulation computations are carried out using ERDAS LPS software, and the direct measurements are taken directly on the simulated model as a substitute for GPS/INS measurements in a real case. Two computational tests have been implemented to evaluate the positional accuracy of the whole model, and the Root Mean Square Error (RMSE) is computed over 30 check points for each test.
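The RMSE over check points reduces to a short computation; a minimal sketch (assuming NumPy arrays of adjusted and reference coordinates) is:

```python
import numpy as np

def rmse(measured, reference):
    """Root Mean Square Error over check points, per coordinate axis."""
    diff = np.asarray(measured) - np.asarray(reference)
    return np.sqrt(np.mean(diff ** 2, axis=0))

# e.g. 30 check points with (X, Y, Z) coordinates from the block adjustment,
# compared against their simulated ground-truth positions:
# rmse_xyz = rmse(adjusted_coords, truth_coords)
```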
Average per capita GDP is an important economic indicator. Economists use it to gauge the progress or decline of a country's economy, as well as to rank countries and compare them with one another. Average per capita GDP was modeled first using time-series analysis (the Box-Jenkins method) and second using linear and non-linear regression; these are among the most important and most commonly used statistical forecasting methods because they are flexible and accurate in practice. The two methods are compared using specific statistical criteria to determine which yields the better model for predicting average per capita GDP.
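A Box-Jenkins style fit can be sketched with statsmodels as follows; the series and the ARIMA(1, 1, 1) order are placeholders, since in practice the order comes from the identify/estimate/diagnose cycle (ACF/PACF plots, information criteria, residual checks).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical annual per-capita GDP series (illustrative values only).
gdp = np.array([4.1, 4.3, 4.2, 4.6, 4.9, 5.0, 5.4, 5.3, 5.8, 6.1])

model = ARIMA(gdp, order=(1, 1, 1)).fit()  # assumed order for illustration
print(model.forecast(steps=3))             # three-step-ahead forecast
```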