Search for risk haplotype segments with GWAS data by use of finite mixture models

Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach has been proposed in the literature. In the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype co-classification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes, which hampers the detection of rare but true haplotypes. To address this issue, we propose an alternative approach. In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes identified in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared with the existing multiple Z-test procedure, we find that the proposed procedure can increase the power of genome-wide association studies.

Publication Date
Tue Dec 24 2024
Journal Name
Journal Of Systems Science And Mathematical Sciences
SCREENING TESTS FOR DISEASE RISK HAPLOTYPE SEGMENTS IN GENOME BY USE OF PERMUTATION

Haplotype association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. It starts with inferring haplotypes from genotypes, followed by a haplotype co-classification and marginal screening for disease-associated haplotypes. Unfortunately, phasing uncertainty may have strong effects on the haplotype co-classification and therefore on the accuracy of predicting risk haplotypes. Here, to address the issue, we propose an alternative approach: In Stage 1, we select potential risk genotypes instead ...
Publication Date
Fri Apr 01 2022
Journal Name
Baghdad Science Journal
Improved Firefly Algorithm with Variable Neighborhood Search for Data Clustering

Among the metaheuristic algorithms, population-based algorithms are explorative search algorithms that are superior to local search algorithms in exploring the search space to find globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search space neighborhood for more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA is prone to premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On the ...
Publication Date
Wed Jun 29 2022
Journal Name
Journal Of Al-Rafidain University College For Sciences (Print ISSN: 1681-6870, Online ISSN: 2790-2293)
The Use Of Genetic Algorithm In Estimating The Parameter Of Finite Mixture Of Linear Regression

Parameter estimation in linear regression is usually based on the least squares method, which rests on several basic assumptions. The accuracy of the estimated model parameters therefore depends on the validity of these assumptions. The most successful technique has been the robust estimation method known as the MM-estimator, which has proved its efficiency for this purpose. However, the model becomes unrealistic when these assumptions, among them homogeneity of variance and normally distributed errors, do not hold. These assumptions are not achievable when studying a specific problem that may include complex data from more than one model. To ...
Publication Date
Thu Sep 30 2021
Journal Name
Journal Of Economics And Administrative Sciences
Comparison of Some Methods for Estimating Mixture of Linear Regression Models with Application

A mixture model is used to model data that come from more than one component. In recent years, it has become an effective tool for drawing inferences about the complex data that we might come across in real life. Moreover, it can serve as a powerful confirmatory tool for classifying observations based on the similarities among them. In this paper, several mixture regression-based methods were applied under the assumption that the data come from a finite number of components. These methods were compared according to their results in estimating the component parameters. Observation membership was also inferred and assessed for these methods. The results showed that the flexible mixture model outperformed the others ...
Publication Date
Fri Dec 30 2022
Journal Name
Journal Of Mathematics
Estimation of Parameters of Finite Mixture of Rayleigh Distribution by the Expectation-Maximization Algorithm

In the lifetime processes of some systems, the data often do not belong to a single population; instead, they may represent several subpopulations. In such a case, a single known distribution cannot be used to model the data. Instead, a mixture of distributions is used to model the data and classify them into several subgroups. The mixture of Rayleigh distributions is well suited to lifetime processes. This paper aims to estimate the model parameters by maximum likelihood using the expectation-maximization (EM) algorithm. The technique is applied to simulated data under several scenarios. The accuracy of estimation has been examined by the average mean square error (AMSE) and the average classification success rate (ACSR). ...
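As an illustration of the kind of estimation this abstract describes, the following is a minimal sketch of EM for a finite mixture of Rayleigh distributions. It is not the authors' implementation: the function names, the sorted-chunk initialization, the stopping rule, and the inverse-transform sampler are all assumptions made for this example. It assumes strictly positive data and at least as many observations as components. The M-step uses the standard weighted maximum-likelihood update for the Rayleigh scale, sigma_j^2 = sum_i r_ij x_i^2 / (2 sum_i r_ij).

```python
import math
import random


def rayleigh_pdf(x, sigma):
    """Rayleigh density x / sigma^2 * exp(-x^2 / (2 sigma^2)), for x > 0."""
    return (x / sigma ** 2) * math.exp(-x * x / (2.0 * sigma ** 2))


def sample_rayleigh(rng, sigma):
    """Draw one Rayleigh(sigma) variate by inverse-transform sampling."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def em_rayleigh_mixture(data, k=2, n_iter=200, tol=1e-8):
    """Fit a k-component Rayleigh mixture by EM; returns (weights, sigmas).

    Hypothetical helper for illustration only. Assumes len(data) >= k and
    strictly positive observations.
    """
    xs = sorted(data)
    n = len(xs)
    # Assumed initialization: split the sorted data into k equal chunks and
    # use each chunk's single-component MLE as the starting scale.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    sigmas = [math.sqrt(sum(x * x for x in c) / (2.0 * len(c))) for c in chunks]
    weights = [1.0 / k] * k
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        ll = 0.0
        for x in xs:
            dens = [w * rayleigh_pdf(x, s) for w, s in zip(weights, sigmas)]
            total = sum(dens)
            ll += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: closed-form weighted updates of mixing weights and scales.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            sigmas[j] = math.sqrt(
                sum(r[j] * x * x for r, x in zip(resp, xs)) / (2.0 * nj)
            )
        # Stop once the observed-data log-likelihood has stabilized.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, sigmas
```

A call such as `em_rayleigh_mixture(data, k=2)` on simulated data (for example, draws from `sample_rayleigh` with two different scales) returns the estimated mixing weights and scale parameters. Comparing them with the simulation truth is one simple way to compute an accuracy measure in the spirit of the AMSE mentioned above.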
Publication Date
Wed Jan 11 2023
Journal Name
Mathematical Problems In Engineering
Bayesian Methods for Estimation the Parameters of Finite Mixture of Inverse Rayleigh Distribution

Methods of estimating statistical distributions have attracted many researchers when it comes to fitting a specific distribution to data. However, when the data belong to more than one component, a single popular distribution cannot be fitted to such data. To tackle this issue, mixture models are fitted by choosing the correct number of components that represent the data. This is evident in lifetime processes, which arise in a wide range of engineering applications as well as in biological systems. In this paper, we introduce an application of estimating a finite mixture of Inverse Rayleigh distributions within the Bayesian framework using Markov chain Monte Carlo (MCMC). We employed the Gibbs sampler and ...
Publication Date
Sun Jul 02 2023
Journal Name
Iraqi Journal Of Science
A secure Search over Distributed Data

In recent years, due to the economic benefits and technical advances of cloud computing, huge amounts of data have been outsourced to the cloud. To protect the privacy of their sensitive data, data owners have to encrypt their data prior to outsourcing it to the untrusted cloud servers. To facilitate searching over encrypted data, several approaches have been proposed. However, the majority of these approaches handle Boolean search but not ranked search, a widely accepted technique in current information retrieval (IR) systems for retrieving only the top-k relevant files. In this paper, we propose a distributed secure ranked search scheme over the encrypted cloud servers. Such a scheme allows the authorized user to ...
Publication Date
Sat Dec 31 2022
Journal Name
Journal Of Economics And Administrative Sciences
Using Some Estimation Methods for Mixed-Random Panel Data Regression Models with Serially Correlated Errors with Application

This research studies panel data models with mixed random parameters, which contain two types of parameters: random and fixed. The random parameters arise from differences in the marginal propensities of the cross sections, and the fixed parameters arise from differences in the fixed intercepts and the random errors of each section. The random errors exhibit heterogeneity of variance in addition to first-order serial correlation. The main objective of this research is to use efficient methods suited to panel data in the case of small samples. To achieve this goal, the feasible general least squares ...
Publication Date
Fri Jun 30 2023
Journal Name
Iraqi Journal Of Science
Spatio-Temporal Mixture Model for Identifying Risk Levels of COVID-19 Pandemic in Iraq

This paper focuses on choosing a spatial mixture model that implicitly includes time to represent the relative risks of the COVID-19 pandemic, using an appropriate model selection criterion. For this purpose, a more recent criterion, the widely applicable (Watanabe-Akaike) information criterion (WAIC), is used; we believe it has so far seen only limited use in the context of relative risk modelling. In addition, a graphical method based on a spatio-temporal posterior predictive distribution is adopted to select the model yielding the best predictive accuracy. By applying this model selection criterion, we seek to identify the levels of relative risk, which implicitly represents the determination of the number of the model components ...