Search for risk haplotype segments with GWAS data by use of finite mixture models

Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, instead of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach has been proposed in the literature. In the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype co-classification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes, which hampers the detection of rare but true haplotypes. To address this issue, we propose an alternative approach. In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes identified in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared to the existing multiple Z-test procedure, we find that the proposed procedure can increase the power of genome-wide association studies.
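
The abstract above labels a haplotype as a risk haplotype when it is unevenly distributed between cases and controls. One simple way to check such an imbalance for a single haplotype is a pooled two-proportion Z-test, sketched below. This is an illustrative assumption, not necessarily the multiple Z-test procedure the authors compare against; the function name and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(case_count, case_total, control_count, control_total):
    """Pooled two-proportion Z-test for one haplotype (illustrative sketch).

    case_count / control_count: copies of the haplotype observed in each group.
    case_total / control_total: chromosomes examined in each group
    (assumed here to be twice the number of individuals).
    Returns (z, two_sided_p_value).
    """
    p_case = case_count / case_total
    p_control = control_count / control_total
    pooled = (case_count + control_count) / (case_total + control_total)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / case_total + 1.0 / control_total))
    if se == 0.0:
        # Haplotype absent (or fixed) in both groups: no evidence either way.
        return 0.0, 1.0
    z = (p_case - p_control) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 30 copies among 1000 case chromosomes,
# 10 copies among 1000 control chromosomes.
z, p = two_proportion_z_test(30, 1000, 10, 1000)
```

When many haplotypes are screened this way, the p-values would need a multiple-testing correction, which is one reason sparse haplotype frequencies reduce power.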

Publication Date
Wed Mar 26 2025
Journal Name
Journal Of Systems Science And Mathematical Sciences

Haplotype association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, instead of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. It starts with inferring haplotypes from genotypes, followed by a haplotype co-classification and marginal screening for disease-associated haplotypes. Unfortunately, phasing uncertainty may have a strong effect on the haplotype co-classification and therefore on the accuracy of predicting risk haplotypes. Here, to address the issue, we propose an alternative approach: In Stage 1, we select potential risk genotypes inste

...
Publication Date
Fri Apr 01 2022
Journal Name
Baghdad Science Journal
Improved Firefly Algorithm with Variable Neighborhood Search for Data Clustering

Among metaheuristic algorithms, population-based algorithms are explorative search methods that are superior to local search algorithms in exploring the search space to find globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search space neighborhood to find better solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA suffers from premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On the

...
Publication Date
Wed Jun 29 2022
Journal Name
Journal Of Al-Rafidain University College For Sciences (Print ISSN: 1681-6870, Online ISSN: 2790-2293)
The Use Of Genetic Algorithm In Estimating The Parameter Of Finite Mixture Of Linear Regression

Parameter estimation in linear regression is usually based on the least squares method, which rests on several basic assumptions. The accuracy of the estimated model parameters therefore depends on the validity of these assumptions. The most successful technique has been the robust estimation method known as the MM-estimator, which has proved its efficiency for this purpose. However, the model becomes unrealistic when these assumptions, among them homogeneity of the variance and normality of the error distribution, do not hold. These assumptions are not achievable when studying a specific problem that may include complex data from more than one model. To

...
Publication Date
Thu Sep 30 2021
Journal Name
Journal Of Economics And Administrative Sciences
Comparison of Some Methods for Estimating Mixture of Linear Regression Models with Application

A mixture model is used to model data that come from more than one component. In recent years, it has become an effective tool for drawing inferences about the complex data that we might come across in real life. Moreover, it can serve as a powerful confirmatory tool for classifying observations based on similarities among them. In this paper, several mixture regression-based methods were applied under the assumption that the data come from a finite number of components. These methods were compared according to their results in estimating component parameters. Observation membership was also inferred and assessed for these methods. The results showed that the flexible mixture model outperformed the others

...
Publication Date
Fri Dec 30 2022
Journal Name
Journal Of Mathematics
Estimation of Parameters of Finite Mixture of Rayleigh Distribution by the Expectation-Maximization Algorithm

In the lifetime processes of some systems, most data do not belong to a single population; instead, they may represent several subpopulations. In such a case, a single known distribution cannot be used to model the data. Instead, a mixture of distributions is used to model the data and classify them into several subgroups. The mixture of Rayleigh distributions is well suited to lifetime processes. This paper aims to infer the model parameters by the expectation-maximization (EM) algorithm through the maximum likelihood function. The technique is applied to simulated data under several scenarios. The accuracy of estimation has been examined by the average mean square error (AMSE) and the average classification success rate (ACSR). T

...
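
To make the EM idea in the Rayleigh-mixture entry above concrete, here is a minimal sketch of EM for a finite mixture of Rayleigh distributions with density f(x; sigma) = (x / sigma^2) exp(-x^2 / (2 sigma^2)). It is not the paper's implementation: the function names, the sorted-chunk initialization, and the stopping rule are assumptions made for illustration.

```python
import math
import random


def sample_rayleigh(rng, sigma):
    """Draw one Rayleigh(sigma) value by inverting the CDF."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def rayleigh_pdf(x, sigma):
    return (x / sigma ** 2) * math.exp(-x * x / (2.0 * sigma ** 2))


def em_rayleigh_mixture(data, k=2, n_iter=200, tol=1e-8):
    """EM for a k-component Rayleigh mixture; data must be strictly positive.

    Returns (weights, sigmas), ordered by increasing sigma.
    """
    n = len(data)
    # Assumed initialization: split the sorted data into k equal chunks and
    # use each chunk's single-component Rayleigh MLE as a starting scale.
    xs = sorted(data)
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    sigmas = [math.sqrt(sum(x * x for x in c) / (2.0 * len(c))) for c in chunks]
    weights = [1.0 / k] * k
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: responsibilities r[i][j] proportional to weight_j * f(x_i; sigma_j).
        resp = []
        ll = 0.0
        for x in data:
            dens = [w * rayleigh_pdf(x, s) for w, s in zip(weights, sigmas)]
            total = sum(dens)
            ll += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: closed-form updates for the weights and scales.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj > 0.0:
                weights[j] = nj / n
                sq = sum(r[j] * x * x for r, x in zip(resp, data))
                sigmas[j] = math.sqrt(sq / (2.0 * nj))
        # Stop when the log-likelihood of the previous parameters stalls.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    order = sorted(range(k), key=lambda j: sigmas[j])
    return [weights[j] for j in order], [sigmas[j] for j in order]


# Usage on simulated data (hypothetical scenario: 60% sigma=1, 40% sigma=5).
rng = random.Random(0)
data = [sample_rayleigh(rng, 1.0) for _ in range(600)]
data += [sample_rayleigh(rng, 5.0) for _ in range(400)]
weights, sigmas = em_rayleigh_mixture(data, k=2)
```

The M-step has a closed form because the weighted Rayleigh log-likelihood is maximized at sigma_j^2 = sum_i r_ij x_i^2 / (2 sum_i r_ij), which is what keeps each iteration cheap.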
Publication Date
Mon Oct 21 2024
Journal Name
Iraqi Statisticians Journal
On Inference of Finite Mixture of Rayleigh Distribution by Gibbs Sampler and Metropolis-Hastings

Inferential methods for statistical distributions have attracted a high level of interest in recent years. However, in real life, data can follow more than one distribution, and mixture models must then be fitted to such data. One such model is the finite mixture of Rayleigh distributions, which is widely used in modelling lifetime data in many fields, such as medicine, agriculture and engineering. In this paper, we propose a new Bayesian framework by assuming conjugate priors for the square of the component parameters. We use this prior distribution in the classical Bayesian, Metropolis-Hastings (MH) and Gibbs sampler methods. The performance of these techniques was assessed on data generated from two and three-component mixt

...
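
The Gibbs sampler mentioned in the entry above can be sketched as follows for a finite Rayleigh mixture. The sketch assumes an inverse-gamma prior IG(a, b) on each squared scale theta = sigma^2, which is conjugate for the Rayleigh likelihood, and a flat Dirichlet prior on the weights. These priors, the hyperparameter values, and all function names are illustrative assumptions, not the paper's exact setup.

```python
import math
import random


def sample_rayleigh(rng, sigma):
    """Draw one Rayleigh(sigma) value by inverting the CDF."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def gibbs_rayleigh_mixture(data, k=2, n_iter=300, burn_in=100,
                           a=2.0, b=1.0, seed=0):
    """Gibbs sampler for a k-component Rayleigh mixture (illustrative sketch).

    Assumed priors: theta_j = sigma_j^2 ~ InverseGamma(a, b) and
    weights ~ Dirichlet(1, ..., 1). Data must be strictly positive.
    Returns posterior-mean (weights, sigmas), ordered by increasing sigma.
    """
    rng = random.Random(seed)
    n = len(data)
    xs = sorted(data)
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    thetas = [sum(x * x for x in c) / (2.0 * len(c)) for c in chunks]
    weights = [1.0 / k] * k
    sum_w = [0.0] * k
    sum_s = [0.0] * k
    kept = 0
    for it in range(n_iter):
        # 1) Sample each label z_i with probability proportional to
        #    weight_j * f(x_i; theta_j), tracking counts and sums of squares.
        counts = [0] * k
        sq = [0.0] * k
        for x in data:
            probs = [w * (x / t) * math.exp(-x * x / (2.0 * t))
                     for w, t in zip(weights, thetas)]
            z = rng.choices(range(k), weights=probs)[0]
            counts[z] += 1
            sq[z] += x * x
        # 2) Weights | labels ~ Dirichlet(1 + counts), via normalized gammas.
        g = [rng.gammavariate(1.0 + c, 1.0) for c in counts]
        total = sum(g)
        weights = [v / total for v in g]
        # 3) theta_j | labels ~ InverseGamma(a + n_j, b + S_j / 2),
        #    drawn as the reciprocal of a gamma variate with scale 1 / rate.
        thetas = [1.0 / rng.gammavariate(a + c, 1.0 / (b + s / 2.0))
                  for c, s in zip(counts, sq)]
        # Relabel by increasing theta to limit label switching.
        order = sorted(range(k), key=lambda j: thetas[j])
        weights = [weights[j] for j in order]
        thetas = [thetas[j] for j in order]
        if it >= burn_in:
            kept += 1
            for j in range(k):
                sum_w[j] += weights[j]
                sum_s[j] += math.sqrt(thetas[j])
    return [w / kept for w in sum_w], [s / kept for s in sum_s]


# Usage on simulated data (hypothetical scenario: 60% sigma=1, 40% sigma=5).
rng = random.Random(1)
data = [sample_rayleigh(rng, 1.0) for _ in range(300)]
data += [sample_rayleigh(rng, 5.0) for _ in range(200)]
post_weights, post_sigmas = gibbs_rayleigh_mixture(data, k=2)
```

Placing the prior on sigma^2 rather than sigma is what makes the conditional posterior available in closed form, so every step of the sampler is a direct draw and no accept/reject step is needed.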
Publication Date
Wed Jan 11 2023
Journal Name
Mathematical Problems In Engineering
Bayesian Methods for Estimation the Parameters of Finite Mixture of Inverse Rayleigh Distribution

Methods of estimating statistical distributions have attracted many researchers when it comes to fitting a specific distribution to data. However, when the data belong to more than one component, a single popular distribution cannot be fitted to such data. To tackle this issue, mixture models are fitted by choosing the correct number of components that represent the data. This is evident in lifetime processes, which are involved in a wide range of engineering applications as well as biological systems. In this paper, we introduce an application of estimating a finite mixture of Inverse Rayleigh distributions within the Bayesian framework, using Markov chain Monte Carlo (MCMC). We employed the Gibbs sampler and

...
Publication Date
Sat Dec 31 2022
Journal Name
Journal Of Economics And Administrative Sciences
Using Some Estimation Methods for Mixed-Random Panel Data Regression Models with Serially Correlated Errors with Application

This research studies panel data models with mixed random parameters, which contain two types of parameters: random and fixed. The random parameters arise from differences in the marginal tendencies of the cross sections, while the fixed parameters arise from differences in the constant terms. The random errors of each cross section are heteroscedastic and also exhibit first-order serial correlation. The main objective of this research is to use efficient methods suited to panel data in the case of small samples, and to achieve this goal, the feasible general least squa

...
Publication Date
Wed Aug 01 2018
Journal Name
Journal Of Economics And Administrative Sciences
Compare to the conditional logistic regression models with fixed and mixed effects for longitudinal data

Mixed-effects conditional logistic regression is evidently more effective in the study of qualitative differences in longitudinal pollution data as well as their implications for heterogeneous subgroups. This study seeks to show that conditional logistic regression is a robust evaluation method for environmental studies, through the analysis of environmental pollution as a function of oil production and environmental factors. Consequently, it has been established theoretically that the primary objective of model selection in this research is to identify the candidate model that is optimal for the conditional design. The candidate model should achieve generalizability, goodness-of-fit, parsimony and establish equilibrium between bias and variab

...