Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach has been proposed in the literature. In the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype coclassification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes, which hamper the detection of rare but true haplotypes. To address this issue, we propose an alternative approach. In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes identified in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared to the existing multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure.
Linear discriminant analysis and logistic regression are the most widely used multivariate statistical methods for analyzing data with categorical outcome variables. Both are appropriate for developing linear classification models. Linear discriminant analysis assumes that the explanatory variables follow a multivariate normal distribution, whereas logistic regression makes no assumptions about the distribution of the explanatory data. Hence, logistic regression is assumed to be the more flexible and more robust method when these assumptions are violated.
In this paper we focus on the comparison between three forms of classification for data belonging ...
The basic orientation of this descriptive research is to identify a conceptual perspective on self-marketing, a theme that has received attention on the practical side. In recent years, marketing has become an integrated and holistic field covering many areas: it is not limited to the marketing of goods and services, but also includes the marketing of religion, politics, and individuals marketing themselves. The awareness and concepts that seep into a person's soul from childhood until reaching the stage of owning a level of skills, expertise, or academic degrees, all mixed with ambition and aspiration for self-realization, lead to searching for opportunities or creating them, often observe individual ...
This paper attempts to model the rate of penetration (ROP) for an Iraqi oil field with the aid of mud logging data. Data from the Umm Radhuma formation were selected for this modeling. These data include weight on bit, rotary speed, flow rate, and mud density. A statistical approach was applied to these data to improve rate of penetration modeling. As a result, an empirical linear ROP model was developed that fits the actual data well. A nonlinear regression analysis with different functional forms was also attempted, and the results showed that the power model has good predictive capability compared with the other forms.
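A power-law ROP model of the kind mentioned above can be illustrated with a minimal sketch. It assumes the form ROP = a · WOB^b1 · RPM^b2 · Q^b3 · MW^b4, fitted by ordinary least squares after taking logarithms. The function names, value ranges, and coefficients below are hypothetical and the data are synthetic; none of them come from the paper, whose exact model form and fitting procedure are not given in the abstract.

```python
import numpy as np

def fit_power_model(X, y):
    """Fit y = a * prod(X[:, j] ** b[j]) by least squares in log space.

    All predictors and responses must be strictly positive.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones for log(a), then the log of each predictor.
    A = np.column_stack([np.ones(len(y)), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return np.exp(coef[0]), coef[1:]

def predict_power_model(a, b, X):
    """Evaluate the power model for each row of X."""
    return a * np.prod(np.asarray(X, dtype=float) ** b, axis=1)

# Synthetic, noise-free illustration (hypothetical values, not field data).
# Columns: weight on bit, rotary speed, flow rate, mud density.
rng = np.random.default_rng(0)
X = rng.uniform([5.0, 60.0, 400.0, 1.0], [25.0, 180.0, 900.0, 1.4], size=(50, 4))
true_a = 0.5
true_b = np.array([0.6, 0.4, 0.3, -1.2])
y = predict_power_model(true_a, true_b, X)

a_hat, b_hat = fit_power_model(X, y)
print("estimated a:", a_hat)
print("estimated exponents:", b_hat)
```

Fitting in log space turns the power model into a linear regression, which makes it easy to compare against a plain linear model on the same predictors. With real, noisy data the log transform also changes the error structure, so a direct nonlinear fit may give different estimates.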
The aim of this research was to assess Iraqi consumers' awareness of the risks associated with consuming canned food. A questionnaire of 20 questions covering label information, consumer culture, shopping, marketing, awareness, and knowledge was used to survey the opinions of 300 consumers in Baghdad. The data were analyzed using percentages, weighted means, and weight percent. The results showed that Iraqi consumers need more information, training, and guidance programs on the safe handling of canned food, especially in interpreting label information and in developing consumer culture regarding shopping, proper marketing, awareness, and knowledge.
With the growth of mobile phones, short message service (SMS) has become an essential text communication service. However, the low cost and ease of use of SMS have led to an increase in SMS spam. In this paper, the characteristics of SMS spam are studied and a set of features is introduced to filter out SMS spam. In addition, the problem of SMS spam detection is addressed as a clustering analysis that requires a metaheuristic algorithm to find the clustering structures. Three differential evolution variants, namely DE/rand/1, jDE/rand/1, and jDE/best/1, are adopted for solving the SMS spam problem. Experimental results illustrate that jDE/best/1 produces the best results among the variants in terms of accuracy, false-positive rate and false-negative ...
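For readers unfamiliar with the DE/rand/1 variant named above, the following is a minimal generic sketch of DE/rand/1 with binomial crossover. It minimizes a simple sphere function rather than a clustering objective, and the function names, parameter values (`F`, `CR`, population size), and test problem are illustrative assumptions, not the paper's setup. The self-adaptive jDE variants are not shown.

```python
import numpy as np

def de_rand_1(objective, bounds, pop_size=30, F=0.5, CR=0.9,
              generations=200, seed=0):
    """Minimize `objective` with DE/rand/1 and binomial crossover."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            candidates = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
            # DE/rand/1 mutation: random base vector plus one scaled difference.
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), low, high)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial only if it is no worse.
            f_trial = objective(trial)
            if f_trial <= fitness[i]:
                pop[i], fitness[i] = trial, f_trial

    best = np.argmin(fitness)
    return pop[best], fitness[best]

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = de_rand_1(sphere, [(-5.0, 5.0)] * 5)
print("best fitness:", best_f)
```

In a clustering setting, each individual would typically encode candidate cluster centers and the objective would score the resulting partition. That encoding is an assumption about common practice, not a detail stated in the abstract.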
Today, problems of spatial data integration have been further complicated by the rapid development of communication technologies and the increasing number of data sources available on the World Wide Web. Web-based geospatial data sources can be managed by different communities, and the data themselves can vary with respect to quality, coverage, and purpose. Integrating such multiple geospatial datasets remains a challenge for geospatial data consumers. This paper concentrates on the integration of geometric and classification schemes for official data, such as Ordnance Survey (OS) national mapping data, with volunteered geographic information (VGI) data, such as the data derived from the OpenStreetMap (OSM) project. Useful descriptions o...
Abstract
Bivariate time series modeling and forecasting have become a promising field of applied studies in recent times. For this purpose, the linear autoregressive moving average model with exogenous variables (ARMAX) has been the most widely used technique in recent years for modeling and forecasting this type of data. The most important assumptions of this model are linearity and homogeneity of the random error variance of the appropriate model. In practice, these two assumptions are often violated, so the Autoregressive Conditional Heteroscedasticity (ARCH) and Generalized ARCH (GARCH) with exogenous varia...
Modern asphalt technology has adopted nanomaterials as an alternative option to ensure that asphalt pavement can survive harsh climates and repeated heavy axle loading during its service life, and to prolong pavement life. This work aims to elucidate the behavior of the modified asphalt mixture fracture model and to assess the fatigue and rutting performance of Hot Mix Asphalt (HMA) mixes using the outcomes of Indirect Tensile Strength (IDT), Semicircular Bend (SCB), and rutting resistance tests. For this, a single PG (64−16) nanomodified asphalt binder with 5% SiO2 and TiO2 was investigated through a series of laboratory tests, including resilient modulus, creep compliance, tensile strength, SCB, and Flow Number (FN), to study their potential ...