In this paper, some commonly used hierarchical cluster techniques have been compared. A comparison was made between the agglomerative hierarchical clustering technique and the k-means technique, which includes the k-mean technique, the variant K-means technique, and the bisecting K-means, although the hierarchical cluster technique is considered to be one of the best clustering methods. It has a limited usage due to the time complexity. The results, which are calculated based on the analysis of the characteristics of the cluster algorithms and the nature of the data, showed that the bisecting K-means technique is the best compared to the rest of the other methods used.
Feature selection, a method of dimensionality reduction, is nothing but collecting a range of appropriate feature subsets from the total number of features. In this paper, a point by point explanation review about the feature selection in this segment preferred affairs and its appraisal techniques are discussed. I will initiate my conversation with a straightforward approach so that we consider taking care of features and preferred issues depending upon meta-heuristic strategy. These techniques help in obtaining the best highlight subsets. Thereafter, this paper discusses some system models that drive naturally from the environment are discussed and calculations are performed so that we can take care of the prefe
... Show MoreBig data analysis is essential for modern applications in areas such as healthcare, assistive technology, intelligent transportation, environment and climate monitoring. Traditional algorithms in data mining and machine learning do not scale well with data size. Mining and learning from big data need time and memory efficient techniques, albeit the cost of possible loss in accuracy. We have developed a data aggregation structure to summarize data with large number of instances and data generated from multiple data sources. Data are aggregated at multiple resolutions and resolution provides a trade-off between efficiency and accuracy. The structure is built once, updated incrementally, and serves as a common data input for multiple mining an
... Show MoreThe research dealt with a comparative study between some semi-parametric estimation methods to the Partial linear Single Index Model using simulation. There are two approaches to model estimation two-stage procedure and MADE to estimate this model. Simulations were used to study the finite sample performance of estimating methods based on different Single Index models, error variances, and different sample sizes , and the mean average squared errors were used as a comparison criterion between the methods were used. The results showed a preference for the two-stage procedure depending on all the cases that were used
Because of the experience of the mixture problem of high correlation and the existence of linear MultiCollinearity between the explanatory variables, because of the constraint of the unit and the interactions between them in the model, which increases the existence of links between the explanatory variables and this is illustrated by the variance inflation vector (VIF), L-Pseudo component to reduce the bond between the components of the mixture.
To estimate the parameters of the mixture model, we used in our research the use of methods that increase bias and reduce variance, such as the Ridge Regression Method and the Least Absolute Shrinkage and Selection Operator (LASSO) method a
... Show MoreThis manuscript presents several applications for solving special kinds of ordinary and partial differential equations using iteration methods such as Adomian decomposition method (ADM), Variation iterative method (VIM) and Taylor series method. These methods can be applied as well as to solve nonperturbed problems and 3rd order parabolic PDEs with variable coefficient. Moreover, we compare the results using ADM, VIM and Taylor series method. These methods are a commination of the two initial conditions.
In this paper, we have derived Bayesian estimation for the parameters and reliability function of Perks distribution based on two different loss functions, Lindley’s approximation has been used to obtain those values. It is assumed that the parameter behaves as a random variable have a Gumbell Type P prior with non-informative is used. And after the derivation of mathematical formulas of those estimations, the simulation method was used for comparison depending on mean square error (MSE) values and integrated mean absolute percentage error (IMAPE) values respectively. Among of conclusion that have been reached, it is observed that, the LE-NR estimate introduced the best perform for estimating the parameter λ.