Estimating a standard regression model requires several assumptions to be satisfied, such as linearity. One problem arises when the regression curve must be partitioned into two (or more) segments joined at threshold point(s); this situation is regarded as a violation of the linearity assumption. The multiphase regression model has therefore received increasing attention as an alternative approach that describes changes in the behaviour of a phenomenon through threshold-point estimation. The maximum likelihood estimator (MLE) has been used for both model and threshold-point estimation. However, the MLE is not resistant to violations such as the presence of outliers or heavy-tailed error distributions. The main goal of this paper is to suggest a new hybrid estimator obtained by an ad hoc algorithm that relies on a data-driven strategy to overcome outliers. A secondary goal is to introduce a new use of an unweighted estimation method named "winsorization", which achieves robustness in regression estimation by clipping extreme values to reduce the effect of outliers. Another specific contribution of this paper is the suggested use of a kernel function as a new weight (to the best of the researchers' knowledge). Moreover, two weighted estimations are based on the robust weight functions named "Cauchy" and "Talworth". Simulations were constructed with contamination levels (0%, 5%, and 10%) combined with sample sizes (n = 40, 100). A real-data application showed the superior performance of the suggested method compared with other methods using the RMSE and R² criteria.
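To make the winsorization idea concrete, here is a minimal sketch (not the authors' algorithm): extreme response values are clipped at assumed 5th/95th-percentile cutoffs before an ordinary least-squares fit, which blunts the influence of outliers:

```python
# Minimal winsorization sketch (illustrative; not the paper's exact algorithm).
# Extreme response values are clipped to the 5th/95th percentiles before an
# ordinary least-squares fit, reducing the leverage of outliers.
import numpy as np
from scipy.stats.mstats import winsorize

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
y[:5] += 30  # inject a few gross outliers

y_w = winsorize(y, limits=(0.05, 0.05))  # clip lowest/highest 5% (assumed cutoffs)

# Ordinary least squares on raw vs. winsorized responses.
X = np.column_stack([np.ones(n), x])
beta_raw, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_win, *_ = np.linalg.lstsq(X, np.asarray(y_w), rcond=None)
print("raw fit:       ", beta_raw)
print("winsorized fit:", beta_win)
```

The winsorized fit sits much closer to the true coefficients (2.0, 1.5) because the clipped responses no longer drag the intercept upward.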
In this research we numerically solved the Boltzmann transport equation in order to calculate transport parameters such as the drift velocity W, D/μ (the ratio of the diffusion coefficient to the mobility), and the momentum-transfer collision frequency νm, for the purpose of determining the magnetic drift velocity WM and the magnetic deflection coefficient ψ for low-energy electrons moving in an electric field E crossed with a magnetic field B (i.e., E×B) in nitrogen, argon, helium, and their gas mixtures, as a function of E/N (the ratio of electric field strength to the gas number density), E/P300 (the ratio of electric field strength to the gas pressure), and D/μ, covering different ranges of E/P300 at a temperature of 300 K. The results show …
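For orientation only, a minimal sketch of the crossed-field drift geometry under the constant momentum-transfer collision frequency approximation (an assumption here; the paper solves the full Boltzmann equation numerically): the drift component along E scales as νm/(νm² + ω²) and the E×B component as ω/(νm² + ω²), where ω = eB/m is the cyclotron frequency, so WM/W = ω/νm.

```python
# Sketch of crossed-field drift components under the constant collision-
# frequency approximation (an illustrative model, not the paper's full
# Boltzmann-equation solution).
E_CHARGE = 1.602176634e-19  # elementary charge, C
E_MASS = 9.1093837015e-31   # electron mass, kg

def drift_components(E, B, nu_m):
    """Drift velocity along E (W) and in the E x B direction (W_M)."""
    omega = E_CHARGE * B / E_MASS           # cyclotron frequency (rad/s)
    a = E_CHARGE * E / E_MASS               # acceleration by the field
    W = a * nu_m / (nu_m**2 + omega**2)     # component along E
    W_M = a * omega / (nu_m**2 + omega**2)  # magnetic drift component
    return W, W_M

# Example field strengths and collision frequency (assumed values).
W, W_M = drift_components(E=1.0e4, B=0.01, nu_m=1.0e9)
print(f"W = {W:.3e} m/s, W_M = {W_M:.3e} m/s, W_M/W = {W_M / W:.3f}")
```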
In this paper, we investigate the automatic recognition of emotion in text. We perform experiments with a new classification method based on the PPM character-based text-compression scheme. These experiments involve both coarse-grained classification (whether a text is emotional or not) and fine-grained classification, such as recognising Ekman's six basic emotions (anger, disgust, fear, happiness, sadness, surprise). Experimental results with three datasets show that the new method significantly outperforms traditional word-based text-classification methods. The results show that the PPM compression-based classification method is able to distinguish between emotional and non-emotional text with high accuracy, and between texts invo…
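As a rough illustration of compression-based classification, the sketch below uses zlib as a stand-in for PPM (the paper's actual scheme): a text is assigned to the class whose training corpus compresses it most cheaply. The toy corpus and the threshold-free argmin rule are assumptions for illustration.

```python
# Compression-based text classification sketch. zlib stands in for PPM here
# (an assumption for a self-contained example): a document is assigned to the
# class whose training text compresses it best, measured by the extra bytes
# needed to compress training + document together.
import zlib

def extra_bytes(training: str, doc: str) -> int:
    base = len(zlib.compress(training.encode("utf-8"), 9))
    combined = len(zlib.compress((training + " " + doc).encode("utf-8"), 9))
    return combined - base  # approximate cross-entropy of doc given the class

def classify(doc: str, class_texts: dict[str, str]) -> str:
    return min(class_texts, key=lambda c: extra_bytes(class_texts[c], doc))

corpus = {
    "emotional": "I am so happy and thrilled! This is wonderful, I love it.",
    "neutral": "The meeting is scheduled for 10 am in room 204 as planned.",
}
print(classify("What a fantastic, joyful day!", corpus))
```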
It has increasingly been recognised that future developments in geospatial data handling will centre on geospatial data on the web: Volunteered Geographic Information (VGI). The evaluation of VGI data quality, including positional and shape similarity, has become a recurrent subject in the scientific literature over the last ten years. The OpenStreetMap (OSM) project is the most popular of the leading VGI platforms. It is an online geospatial database that produces and supplies free, editable geospatial datasets for worldwide use. The goal of this paper is to present a comprehensive overview of the quality assurance of OSM data. In addition, the credibility of open-source geospatial data is discussed, highlight…
In data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two well-known algorithms used in data-mining classification are the Backpropagation Neural Network (BNN) and Naïve Bayes (NB). This paper investigates the performance of these two classification methods using the Car Evaluation dataset. Two models were built, one for each algorithm, and the results were compared. Our experimental results indicated that the BNN classifier yields higher accuracy than the NB classifier, but it is less efficient because it is time-consuming and difficult to analyse due to its black-box implementation.
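A minimal sketch of such a comparison, assuming scikit-learn's CategoricalNB and MLPClassifier as stand-ins for the paper's NB and BNN implementations, and a hypothetical local copy (car.data) of the UCI Car Evaluation dataset:

```python
# Sketch comparing a backpropagation neural network with Naive Bayes on the
# Car Evaluation dataset (UCI). Model settings and the local file path are
# assumptions for illustration, not the paper's exact configuration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder
from sklearn.naive_bayes import CategoricalNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

cols = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
df = pd.read_csv("car.data", names=cols)  # hypothetical local copy of UCI data

X = OrdinalEncoder().fit_transform(df[cols[:-1]])  # encode categorical features
y = df["class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

for name, model in [
    ("Naive Bayes", CategoricalNB()),
    ("BNN (MLP)", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)),
]:
    model.fit(X_tr, y_tr)
    print(name, "accuracy:", accuracy_score(y_te, model.predict(X_te)))
```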
Modern civilization increasingly relies on sustainable and eco-friendly data centers as the core hubs of intelligent computing. However, these data centers, while vital, also face heightened vulnerability to hacking due to their role as the convergence points of numerous network connection nodes. Recognizing and addressing this vulnerability, particularly within the confines of green data centers, is a pressing concern. This paper proposes a novel approach to mitigate this threat by leveraging swarm intelligence techniques to detect prospective and hidden compromised devices within the data center environment. The core objective is to ensure sustainable intelligent computing through a colony strategy. The research primarily focusses on the …
Loanwords are words transferred from one language to another, where they become an essential part of the borrowing language. Loanwords pass from a source language to a recipient language for many reasons. Detecting these loanwords is a complicated task because there are no standard specifications for how words transfer between languages, and hence detection accuracy is low. This work tries to enhance the accuracy of detecting loanwords between Turkish and Arabic as a case study. In this paper, the proposed system finds all possible loanwords using any set of characters, arranged either alphabetically or randomly. Then, it processes the distortion in pronunciation and solves the problem of the missing lette…
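One simple way to score loanword candidates, sketched below under assumptions not taken from the paper (a toy Arabic-to-Latin transliteration table and a similarity threshold of 0.6), is normalised edit similarity between the Turkish word and the transliterated Arabic word:

```python
# Sketch of loanword candidate matching via normalised edit similarity.
# The transliteration table and threshold are assumptions for illustration;
# the paper's actual distortion handling is more elaborate.
from difflib import SequenceMatcher

# Hypothetical mapping of a few Arabic letters to Latin transliterations.
AR2LAT = {"ك": "k", "ت": "t", "ا": "a", "ب": "b", "د": "d", "ف": "f",
          "ن": "n", "ر": "r", "م": "m", "س": "s", "ج": "c", "ي": "i"}

def transliterate(word: str) -> str:
    return "".join(AR2LAT.get(ch, ch) for ch in word)

def similarity(turkish: str, arabic: str) -> float:
    return SequenceMatcher(None, turkish.lower(), transliterate(arabic)).ratio()

# Turkish "kitap" (book) is borrowed from Arabic "كتاب" (kitāb).
score = similarity("kitap", "كتاب")
print(f"similarity = {score:.2f}", "-> loanword candidate" if score > 0.6 else "")
```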
Cloud computing is a mass platform for serving high-volume data from many devices and technologies. Cloud tenants demand fast, uninterrupted access to their data. Therefore, cloud providers are struggling to ensure that every piece of data is secure and always accessible. Hence, an appropriate replication strategy capable of selecting essential data is required in cloud replication environments as the solution. This paper proposes a Crucial File Selection Strategy (CFSS) to address poor response times in a cloud replication environment. A cloud simulator called CloudSim is used to conduct the necessary experiments, and results are presented to demonstrate the improvement in replication performance. The obtained an…
Data-Driven Requirements Engineering (DDRE) represents a vision for a shift from static, traditional methods of doing requirements engineering to dynamic, data-driven, user-centred methods. Given the data now available and the increasingly complex requirements of software whose functions must adapt to changing needs to gain the trust of its users, an approach embedded in a continuous software-engineering process is needed. This need drives the emergence of new challenges in the discipline of requirements engineering to meet the required changes. The problem addressed in this study is discrepancies in the data, which hamper the needs-elicitation process and ultimately leave the developed software unable to meet the need…
A data-centric technique, data aggregation via a modified algorithm based on fuzzy clustering with a Voronoi diagram, called the modified Voronoi Fuzzy Clustering Algorithm (VFCA), is presented in this paper. In the modified algorithm, the sensed area is divided into a number of Voronoi cells by applying a Voronoi diagram; these cells are clustered by the fuzzy C-means (FCM) method to reduce the transmission distance. Then an appropriate cluster head (CH) for each cluster is elected. Three parameters are used for this election: the energy, the distance between the CH and its neighbouring sensors, and the packet-loss value. Furthermore, data aggregation is employed in each CH to reduce the amount of data transmitted, which le…
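A minimal sketch of the two building blocks named above: a Voronoi partition of the sensed area and a plain fuzzy C-means loop. The fuzzifier m = 2, the four clusters, and the random node positions are assumptions for illustration, not the paper's configuration.

```python
# Sketch of a Voronoi partition of sensor positions plus fuzzy C-means
# clustering of the cell sites (illustrative; not the paper's full VFCA).
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(1)
sensors = rng.uniform(0, 100, size=(60, 2))  # sensor node positions (assumed)
vor = Voronoi(sensors)                       # Voronoi cell around each node
print(len(vor.point_region), "Voronoi cells built")

def fuzzy_c_means(X, c=4, m=2.0, iters=100):
    """Plain fuzzy C-means; returns cluster centres and membership matrix."""
    n = len(X)
    U = rng.dirichlet(np.ones(c), size=n)    # random initial memberships
    for _ in range(iters):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard FCM update
    return centres, U

centres, U = fuzzy_c_means(sensors)
print("cluster centres:\n", centres.round(1))
```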
Assessing the accuracy of classification algorithms is paramount, as it provides insights into their reliability and effectiveness in solving real-world problems. Accuracy examination is essential in any remote-sensing-based classification practice, given that classification maps consistently include misclassified pixels and classification misconceptions. In this study, two satellite images of Duhok province, Iraq, were captured at regular intervals, and the images were analysed using spatial-analysis tools to produce supervised classifications. Several processes, such as smoothing, were applied to enhance the categorisation. The classification results indicate that Duhok province is divided into four classes: vegetation cover, buildings, …
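A typical accuracy examination of this kind reduces to a confusion matrix over reference and predicted labels; the sketch below computes overall accuracy and Cohen's kappa. The class list echoes the abstract, but the sample labels are made up for illustration.

```python
# Sketch of a standard accuracy assessment for a classified map: confusion
# matrix, overall accuracy, and Cohen's kappa from reference vs. predicted
# labels. The sample labels below are illustrative, not the study's data.
import numpy as np

classes = ["vegetation", "buildings", "water", "bare_soil"]
reference = np.array([0, 0, 1, 2, 3, 1, 0, 2, 3, 3, 1, 0])
predicted = np.array([0, 1, 1, 2, 3, 1, 0, 2, 2, 3, 1, 0])

k = len(classes)
cm = np.zeros((k, k), dtype=int)
for r, p in zip(reference, predicted):
    cm[r, p] += 1                      # rows: reference, columns: predicted

n = cm.sum()
overall = np.trace(cm) / n             # overall accuracy
pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
kappa = (overall - pe) / (1 - pe)      # Cohen's kappa
print(f"overall accuracy = {overall:.2f}, kappa = {kappa:.2f}")
```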