In recent years, the performance of Spatial Data Infrastructures for governments and companies is a task that has gained ample attention. Different categories of geospatial data such as digital maps, coordinates, web maps, aerial and satellite images, etc., are required to realize the geospatial data components of Spatial Data Infrastructures. In general, there are two distinct types of geospatial data sources exist over the Internet: formal and informal data sources. Despite the growth of informal geospatial data sources, the integration between different free sources is not being achieved effectively. The adoption of this task can be considered the main advantage of this research. This article addresses the research question of how the integration of free geospatial data can be beneficial within domains such as Spatial Data Infrastructures. This was carried out by suggesting a common methodology that uses road networks information such as lengths, centeroids, start and end points, number of nodes and directions to integrate free and open source geospatial datasets. The methodology has been proposed for a particular case study: the use of geospatial data from OpenStreetMap and Google Earth datasets as examples of free data sources. The results revealed possible matching between the roads of OpenStreetMap and Google Earth datasets to serve the development of Spatial Data Infrastructures.
Purpose – The Cloud computing (CC) and its services have enabled the information centers of organizations to adapt their informatic and technological infrastructure and making it more appropriate to develop flexible information systems in the light of responding to the informational and knowledge needs of their users. In this context, cloud-data governance has become more complex and dynamic, requiring an in-depth understanding of the data management strategy at these centers in terms of: organizational structure and regulations, people, technology, process, roles and responsibilities. Therefore, our paper discusses these dimensions as challenges that facing information centers in according to their data governance and the impa
... Show MoreA two time step stochastic multi-variables multi-sites hydrological data forecasting model was developed and verified using a case study. The philosophy of this model is to use the cross-variables correlations, cross-sites correlations and the two steps time lag correlations simultaneously, for estimating the parameters of the model which then are modified using the mutation process of the genetic algorithm optimization model. The objective function that to be minimized is the Akiake test value. The case study is of four variables and three sites. The variables are the monthly air temperature, humidity, precipitation, and evaporation; the sites are Sulaimania, Chwarta, and Penjwin, which are located north Iraq. The model performance was
... Show MoreLongitudinal data is becoming increasingly common, especially in the medical and economic fields, and various methods have been analyzed and developed to analyze this type of data.
In this research, the focus was on compiling and analyzing this data, as cluster analysis plays an important role in identifying and grouping co-expressed subfiles over time and employing them on the nonparametric smoothing cubic B-spline model, which is characterized by providing continuous first and second derivatives, resulting in a smoother curve with fewer abrupt changes in slope. It is also more flexible and can pick up on more complex patterns and fluctuations in the data.
The longitudinal balanced data profile was compiled into subgroup
... Show MoreIn this study, we made a comparison between LASSO & SCAD methods, which are two special methods for dealing with models in partial quantile regression. (Nadaraya & Watson Kernel) was used to estimate the non-parametric part ;in addition, the rule of thumb method was used to estimate the smoothing bandwidth (h). Penalty methods proved to be efficient in estimating the regression coefficients, but the SCAD method according to the mean squared error criterion (MSE) was the best after estimating the missing data using the mean imputation method
Anomaly detection is still a difficult task. To address this problem, we propose to strengthen DBSCAN algorithm for the data by converting all data to the graph concept frame (CFG). As is well known that the work DBSCAN method used to compile the data set belong to the same species in a while it will be considered in the external behavior of the cluster as a noise or anomalies. It can detect anomalies by DBSCAN algorithm can detect abnormal points that are far from certain set threshold (extremism). However, the abnormalities are not those cases, abnormal and unusual or far from a specific group, There is a type of data that is do not happen repeatedly, but are considered abnormal for the group of known. The analysis showed DBSCAN using the
... Show MoreIn data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two of the well known algorithms used in data mining classification are Backpropagation Neural Network (BNN) and Naïve Bayesian (NB). This paper investigates the performance of these two classification methods using the Car Evaluation dataset. Two models were built for both algorithms and the results were compared. Our experimental results indicated that the BNN classifier yield higher accuracy as compared to the NB classifier but it is less efficient because it is time-consuming and difficult to analyze due to its black-box implementation.
The majority of systems dealing with natural language processing (NLP) and artificial intelligence (AI) can assist in making automated and automatically-supported decisions. However, these systems may face challenges and difficulties or find it confusing to identify the required information (characterization) for eliciting a decision by extracting or summarizing relevant information from large text documents or colossal content. When obtaining these documents online, for instance from social networking or social media, these sites undergo a remarkable increase in the textual content. The main objective of the present study is to conduct a survey and show the latest developments about the implementation of text-mining techniqu
... Show MoreMachine learning has a significant advantage for many difficulties in the oil and gas industry, especially when it comes to resolving complex challenges in reservoir characterization. Permeability is one of the most difficult petrophysical parameters to predict using conventional logging techniques. Clarifications of the work flow methodology are presented alongside comprehensive models in this study. The purpose of this study is to provide a more robust technique for predicting permeability; previous studies on the Bazirgan field have attempted to do so, but their estimates have been vague, and the methods they give are obsolete and do not make any concessions to the real or rigid in order to solve the permeability computation. To
... Show MoreCryptography is the process of transforming message to avoid an unauthorized access of data. One of the main problems and an important part in cryptography with secret key algorithms is key. For higher level of secure communication key plays an important role. For increasing the level of security in any communication, both parties must have a copy of the secret key which, unfortunately, is not that easy to achieve. Triple Data Encryption Standard algorithm is weak due to its weak key generation, so that key must be reconfigured to make this algorithm more secure, effective, and strong. Encryption key enhances the Triple Data Encryption Standard algorithm securities. This paper proposed a combination of two efficient encryption algorithms
... Show MoreIn data transmission a change in single bit in the received data may lead to miss understanding or a disaster. Each bit in the sent information has high priority especially with information such as the address of the receiver. The importance of error detection with each single change is a key issue in data transmission field.
The ordinary single parity detection method can detect odd number of errors efficiently, but fails with even number of errors. Other detection methods such as two-dimensional and checksum showed better results and failed to cope with the increasing number of errors.
Two novel methods were suggested to detect the binary bit change errors when transmitting data in a noisy media.Those methods were: 2D-Checksum me