Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of this sparse distribution, a two-stage approach was proposed in the literature: in the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype co-classification; in the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled a risk haplotype. Unfortunately, the in silico reconstruction of haplotypes may produce a proportion of false haplotypes that hamper the detection of rare but true haplotypes. Here, to address this issue, we propose an alternative approach: in Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model; in Stage 2, we infer risk haplotypes from the risk genotypes identified in the previous stage. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared to the existing multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure.
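The abstract does not give implementation details for the EM algorithm or its partition-based initialization. As a minimal sketch of the general idea, a Gaussian mixture stands in for the genotype mixture model, and a preliminary k-means partition of the data supplies the EM starting values; all data and component counts are illustrative placeholders, not the authors' method.

```python
# Minimal sketch: EM for a finite mixture seeded by a data partition.
# A Gaussian mixture and a k-means partition stand in as placeholders
# for the paper's genotype mixture model and initialization scheme.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)),   # "non-risk" component
               rng.normal(3, 1, (40, 2))])   # "risk" component

# A preliminary partition of the raw data provides the EM starting means.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
means_init = np.array([X[labels == k].mean(axis=0) for k in range(2)])

gm = GaussianMixture(n_components=2, means_init=means_init).fit(X)
posterior = gm.predict_proba(X)  # membership probabilities per observation
```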
This research presents the concept of panel data models: a crucial two-dimensional data structure that captures the impact of change over time, obtained from repeated observations of the measured phenomenon in different time periods. The panel data models were defined by their different types (fixed, random, and mixed) and compared by studying and analyzing the mathematical relationship between the effect of time and a set of basic variables, which form the main axes of the research. These are represented by the monthly revenue of the working individual and the profits it generates, which constitute the response variable, and its relationship to a set of explanatory variables represented by the ...
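As a minimal sketch of the fixed- versus random-effects distinction the abstract describes, the snippet below fits both on synthetic panel data with statsmodels; the column names (person, period, income, profit) are hypothetical stand-ins for the study's variables.

```python
# Minimal sketch of fixed- vs. random-effects panel models.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "person": np.repeat(np.arange(30), 12),  # 30 individuals
    "period": np.tile(np.arange(12), 30),    # 12 monthly observations each
    "profit": rng.normal(5, 1, 360),
})
df["income"] = 2.0 * df["profit"] + rng.normal(0, 1, 360)

# Fixed effects: entity dummies absorb time-invariant individual effects.
fe = smf.ols("income ~ profit + C(person)", data=df).fit()

# Random (mixed) effects: individual effects as random intercepts.
re = smf.mixedlm("income ~ profit", data=df, groups=df["person"]).fit()
print(fe.params["profit"], re.params["profit"])
```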
Twitter data analysis is an emerging field of research that utilizes data collected from Twitter to address many issues such as disaster response, sentiment analysis, and demographic studies. The success of data analysis relies on collecting accurate data that are representative of the studied group or phenomenon, in order to get the best results. Various Twitter analysis applications rely on the locations of the users sending the tweets, but this information is not always available. There are several attempts at estimating location-based aspects of a tweet; however, there is a lack of attempts to investigate data collection methods that are focused on location. In this paper, we investigate the two methods for obtaining location-based data ...
Longitudinal data are becoming increasingly common, especially in the medical and economic fields, and various methods have been developed to analyze this type of data.
In this research, the focus was on grouping and analyzing these data, as cluster analysis plays an important role in identifying and grouping co-expressed profiles over time and employing them in the nonparametric smoothing cubic B-spline model. This model provides continuous first and second derivatives, resulting in a smoother curve with fewer abrupt changes in slope; it is also more flexible and can capture more complex patterns and fluctuations in the data.
The balanced longitudinal data profiles were compiled into subgroups ...
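As a minimal sketch of the pipeline the abstract outlines, the snippet below clusters longitudinal profiles and then smooths each cluster's mean profile with a cubic smoothing spline (which has continuous first and second derivatives); the data, cluster count, and smoothing factor are illustrative assumptions, not the study's settings.

```python
# Minimal sketch: cluster longitudinal profiles, then smooth each cluster's
# mean profile with a cubic spline.
import numpy as np
from scipy.interpolate import UnivariateSpline
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 20)  # common time grid (balanced data)
profiles = np.array([np.sin(2 * np.pi * t * f) + rng.normal(0, 0.1, t.size)
                     for f in rng.choice([1, 2], size=50)])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles)
for k in range(2):
    mean_profile = profiles[labels == k].mean(axis=0)
    spline = UnivariateSpline(t, mean_profile, k=3, s=0.05)  # cubic, smoothed
    print(k, spline(t[:3]))  # fitted values on the first grid points
```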
Due to the lack of previous statistical studies of the behavior of payments, specifically health insurance payments, which represent the largest proportion of payments in the general insurance companies in Iraq, this study was undertaken and applied to the Iraqi Insurance Company.
In order to find a convenient model representing the health insurance payments, we initially identified two probability models by using the EasyFit software:
first, a single Lognormal distribution for the whole sample, and second, a Compound Weibull distribution for the two sub-samples (small payments and large payments); we focused on the compound ...
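As a minimal sketch of this fitting step, the snippet below fits a single lognormal to all payments and separate Weibull distributions to the small and large sub-samples with scipy; the synthetic data and the quantile threshold splitting small from large payments are hypothetical placeholders (the study used EasyFit for this step).

```python
# Minimal sketch: single lognormal for the whole sample vs. a compound
# Weibull for small/large payment sub-samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
payments = rng.lognormal(mean=7.0, sigma=1.2, size=1000)  # synthetic claims

# Single lognormal for the whole sample (location fixed at zero).
shape, loc, scale = stats.lognorm.fit(payments, floc=0)

# Compound model: split the sample and fit a Weibull to each sub-sample.
threshold = np.quantile(payments, 0.9)  # hypothetical split point
small = payments[payments <= threshold]
large = payments[payments > threshold]
w_small = stats.weibull_min.fit(small, floc=0)
w_large = stats.weibull_min.fit(large - threshold, floc=0)  # shifted tail
```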
Data-driven models perform poorly on part-of-speech tagging for the square Hmong language, a low-resource corpus. This paper designs a weight evaluation function to reduce the influence of unknown words and proposes an improved harmony search algorithm that utilizes roulette and local evaluation strategies to handle the square Hmong part-of-speech tagging problem. Experiments show that the average accuracy of the proposed model is 6% and 8% higher than that of the HMM and BiLSTM-CRF models, respectively; meanwhile, the average F1 of the proposed model is 6% and 3% higher than that of the HMM and BiLSTM-CRF models, respectively.
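The abstract does not specify how the roulette and local evaluation strategies modify the search, so the sketch below shows only the classic harmony search skeleton on a toy minimization problem; all parameter values are illustrative assumptions.

```python
# Minimal sketch of the classic harmony search loop (not the paper's
# improved variant) on a toy objective.
import numpy as np

rng = np.random.default_rng(4)

def objective(x):  # toy objective: sphere function
    return float(np.sum(x ** 2))

dim, hms, hmcr, par, bw, iters = 5, 10, 0.9, 0.3, 0.05, 2000
memory = rng.uniform(-5, 5, (hms, dim))  # harmony memory
scores = np.array([objective(h) for h in memory])

for _ in range(iters):
    new = np.empty(dim)
    for j in range(dim):
        if rng.random() < hmcr:                 # draw from memory ...
            new[j] = memory[rng.integers(hms), j]
            if rng.random() < par:              # ... with pitch adjustment
                new[j] += rng.uniform(-bw, bw)
        else:                                   # or improvise randomly
            new[j] = rng.uniform(-5, 5)
    score = objective(new)
    worst = int(np.argmax(scores))
    if score < scores[worst]:                   # replace the worst harmony
        memory[worst], scores[worst] = new, score

print(memory[int(np.argmin(scores))], scores.min())
```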
There are many methods for searching a large amount of data to find one particular piece of information, such as finding the name of a person in a mobile phone's records. Certain methods of organizing data make the search process more efficient; the objective of these methods is to find the element with the least cost (least time). The binary search algorithm is faster than sequential search and other commonly used search algorithms. This research develops the binary search algorithm by using a new structure called Triple, in which data are represented as triples, each consisting of three locations (1-Top, 2-Left, and 3-Right). The binary search algorithm divides the search interval in half, and this process bounds the maximum number of comparisons (average-case com ...
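The abstract is truncated before the Triple structure is fully specified, so the sketch below shows only the baseline halving algorithm it builds on: each comparison discards half of the remaining interval, bounding the worst case at O(log n) comparisons.

```python
# Minimal sketch of classic binary search (the baseline the proposed
# Triple structure improves upon; the Triple variant is not shown here).
def binary_search(sorted_items, target):
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2           # split the search interval in half
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1               # discard the left half
        else:
            hi = mid - 1               # discard the right half
    return -1                          # target not present

names = ["Ali", "Huda", "Omar", "Sara", "Zaid"]
print(binary_search(names, "Omar"))    # -> 2
```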
The complexity and partially defined nature of jet grouting make it hard to predict the performance of grouted piles, so trials of cement injection at a location with soil properties similar to those of the erection site are necessary to assess the performance of the grouted piles. Nevertheless, instead of executing trial injected piles at the pilot site, which wastes money, time, and effort, laboratory cement injection devices are essential alternatives for evaluating soil injectability. This study assesses the performance of a low-pressure laboratory grouting device by improving loose sandy soil injected using binders formed of Silica Fume (SF) as a chemical admixture (10% of the Ordinary Portland Cement (OPC) mass) to di ...
In light of developments in computer science and modern technologies, the impersonation crime rate has increased. Consequently, face recognition technology and biometric systems have been employed for security purposes in a variety of applications, including human-computer interaction, surveillance systems, etc. Building an advanced, sophisticated model to tackle impersonation-related crimes is essential. This study proposes classification Machine Learning (ML) and Deep Learning (DL) models utilizing Viola-Jones, Linear Discriminant Analysis (LDA), Mutual Information (MI), and Analysis of Variance (ANOVA) techniques. The two proposed facial classification systems are J48 with the LDA feature extraction method as input, and a one-dimensional ...
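As a minimal sketch of the pipeline shape (not the study's actual system), the snippet below chains Viola-Jones face detection, LDA feature extraction, and a decision tree; sklearn's DecisionTreeClassifier is a stand-in for J48 (Weka's C4.5), and the image paths and labels are hypothetical placeholders that must point at a real dataset to be meaningful.

```python
# Minimal sketch: Viola-Jones detection -> LDA features -> tree classifier.
import cv2
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.tree import DecisionTreeClassifier

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_vector(path, size=(32, 32)):
    """Detect the first face and return it as a flat grayscale vector."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    x, y, w, h = detector.detectMultiScale(gray, 1.1, 5)[0]
    return cv2.resize(gray[y:y + h, x:x + w], size).ravel()

# X: stacked face vectors, y: identity labels (placeholder paths/labels).
X = np.vstack([face_vector(p) for p in ["person1.jpg", "person2.jpg"]])
y = np.array([0, 1])

features = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)
clf = DecisionTreeClassifier().fit(features, y)
```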