Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach has been proposed in the literature. In the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype coclassification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes that hamper the detection of rare but true haplotypes. To address this issue, we propose an alternative approach. In Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes estimated in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared with the existing multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure.
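The EM idea referred to in this abstract can be sketched generically. The example below is only an illustration under stated assumptions: it fits a two-component one-dimensional Gaussian mixture rather than the genotype mixture model of the paper, and its initialization (sort the data, split at the midpoint, take each half's mean and variance) is a simple stand-in, not the authors' partition scheme. The function names and the sample values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=50, var_floor=1e-6):
    """EM for a two-component 1-D Gaussian mixture (illustrative only).

    Initialization is an assumed stand-in: sort, split at the midpoint,
    and use each half's mean and variance.
    Returns (weights, means, variances, log_likelihood_trace).
    """
    if len(data) < 2:
        raise ValueError("need at least two observations")
    xs = sorted(data)
    half = len(xs) // 2
    parts = [xs[:half], xs[half:]]
    weights = [len(p) / len(xs) for p in parts]
    means = [sum(p) / len(p) for p in parts]
    variances = [
        max(sum((x - m) ** 2 for x in p) / len(p), var_floor)
        for p, m in zip(parts, means)
    ]
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        loglik = 0.0
        for x in xs:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = joint[0] + joint[1]
            loglik += math.log(total)
            resp.append([j / total for j in joint])
        trace.append(loglik)
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                var_floor,
            )
    return weights, means, variances, trace


# Hypothetical, well-separated sample used only to exercise the sketch.
sample = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 10.1, 9.9, 10.2, 9.8]
weights, means, variances, trace = em_two_gaussians(sample)
print(weights, means)
```

The log-likelihood trace is returned so that convergence can be inspected; for data with overlapping components, more iterations or a convergence tolerance would be a reasonable extension.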
This paper attempts to model the rate of penetration (ROP) for an Iraqi oil field with the aid of mud logging data. Data from the Umm Radhuma formation were selected for this modeling. These data include weight on bit, rotary speed, flow rate, and mud density. A statistical approach was applied to these data to improve ROP modeling. As a result, an empirical linear ROP model was developed that fits the actual data well. Nonlinear regression analyses of different forms were also attempted, and the results showed that the power model has good predictive capability relative to the other forms.
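As a minimal sketch of the power-model form mentioned in this abstract, assume a single predictor (weight on bit, WOB) and the form ROP = a · WOB^b; the coefficients can then be estimated by ordinary least squares on log-transformed data. The abstract does not state the actual model form or its coefficients, so the function name `fit_power_model` and the sample numbers are hypothetical and are not taken from the Umm Radhuma data.

```python
import math


def fit_power_model(x, y):
    """Fit y = a * x**b by least squares on (log x, log y).

    All x and y values must be strictly positive.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # exponent = slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # multiplier = exp(intercept)
    return a, b


# Hypothetical, noise-free illustration generated from ROP = 2.0 * WOB**0.5.
wob = [4.0, 9.0, 16.0, 25.0]
rop = [2.0 * w ** 0.5 for w in wob]
a, b = fit_power_model(wob, rop)
print(a, b)
```

With several predictors (rotary speed, flow rate, mud density), the same log transform turns the power model into a multiple linear regression.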
In light of the globalization that surrounds the business environment, and whose impact has been reflected on industrial economic units, the whole world has become a single market whose variables affect all units and which is in turn affected by the economic contribution of each unit in proportion to its share. The research problem is whether the use of Pareto analysis enables industrial economic units to diagnose the risks surrounding them. Accordingly, the main objective of the research was to classify risks into internal and external types and to identify the risks that require more attention.
The research was based on the hypothesis that, when Pareto analysis is used, risks can be identified and addressed before they occur.
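A Pareto analysis of this kind can be sketched as follows: rank the risks by impact, accumulate their share of the total, and keep the top-ranked risks up to a chosen cumulative threshold. The 80% threshold, the function name `pareto_vital_few`, and the risk names and scores are all assumptions made for illustration; none of them come from the study.

```python
def pareto_vital_few(impacts, threshold=0.8):
    """Return (name, impact, cumulative_share) for the top-ranked items,
    stopping once the cumulative share of total impact reaches `threshold`.
    """
    total = sum(impacts.values())
    if total <= 0:
        raise ValueError("total impact must be positive")
    # Rank from largest to smallest impact.
    ranked = sorted(impacts.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    running = 0.0
    for name, impact in ranked:
        running += impact
        selected.append((name, impact, running / total))
        if running / total >= threshold:
            break
    return selected


# Hypothetical risk scores (for example, frequency times severity).
risks = {
    "supplier failure": 40,
    "currency fluctuation": 30,
    "equipment breakdown": 15,
    "staff turnover": 10,
    "regulatory change": 5,
}
vital_few = pareto_vital_few(risks)
for name, impact, share in vital_few:
    print(f"{name}: impact={impact}, cumulative share={share:.2f}")
```

The same ranking could be run separately on internal and external risks to see which category dominates.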
Big data analysis is essential for modern applications in areas such as healthcare, assistive technology, intelligent transportation, and environment and climate monitoring. Traditional algorithms in data mining and machine learning do not scale well with data size. Mining and learning from big data need time- and memory-efficient techniques, albeit at the cost of a possible loss in accuracy. We have developed a data aggregation structure to summarize data with a large number of instances and data generated from multiple data sources. Data are aggregated at multiple resolutions, and the resolution provides a trade-off between efficiency and accuracy. The structure is built once, updated incrementally, and serves as a common data input for multiple mining an
Abstract
Bivariate time series modeling and forecasting have become a promising field of applied studies in recent times. For this purpose, the linear Autoregressive Moving Average with exogenous variable (ARMAX) model has been the most widely used technique over the past few years for modeling and forecasting this type of data. The most important assumptions of this model are linearity and homogeneity of the random error variance. In practice, these two assumptions are often violated, so the Autoregressive Conditional Heteroscedasticity (ARCH) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models with exogenous varia
The Boltzmann transport equation is solved using the two-term approximation for pure gases and mixtures. This method of solution is used to calculate the electron energy distribution function, and the electron transport parameters were evaluated for E/N in the range 1 × 10⁻¹⁷ V cm² ≤ E/N ≤ 5 × 10⁻¹⁵ V cm².
The electron energy distribution function of CF4 gas is nearly Maxwellian at (1, 2) Td, and as E/N increases the distribution function becomes non-Maxwellian. Also, the mixtures have different energy values depending on the energy transferred between electrons and molecules through collisions. The behavior of the electron transport parameters is close to the experimental results in the references. The drift velocity of electron in carbon tetrafluoride i
Refractive indices (nD), viscosities (η) and densities (ρ) were measured for the binary mixtures formed by dipropyl amine with 1-octanol, 1-heptanol, 1-hexanol, 1-pentanol and tert-pentyl alcohol at 298.15 K over the entire composition range. The Redlich-Kister function was used to calculate and correlate the refractive index deviations (∆nD), viscosity deviations (ηE), excess molar Gibbs free energy (∆G*E) and excess molar volumes (VmE). The coefficients and standard errors were estimated with this function. The values of ∆nD, ηE, VmE and ∆G*E were plotted against the mole fraction of dipropyl amine. In all cases the obtained ηE, ∆G*E, VmE and ∆nD values were negative at 298.15 K. Effect of carbon atoms
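For reference, the Redlich-Kister function is commonly written as Y^E = x1 · x2 · Σ A_k (x1 − x2)^k, with x2 = 1 − x1, and the standard error of a fit as the square root of the residual sum of squares divided by (N − number of coefficients). The sketch below assumes these common forms; the coefficient values are hypothetical and are not the fitted values from this study.

```python
import math


def redlich_kister(x1, coeffs):
    """Evaluate x1 * x2 * sum_k A_k * (x1 - x2)**k, with x2 = 1 - x1."""
    x2 = 1.0 - x1
    return x1 * x2 * sum(a * (x1 - x2) ** k for k, a in enumerate(coeffs))


def standard_error(observed, fitted, n_coeffs):
    """sqrt(sum of squared residuals / (N - number of fitted coefficients))."""
    dof = len(observed) - n_coeffs
    if dof <= 0:
        raise ValueError("need more observations than coefficients")
    ss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return math.sqrt(ss / dof)


# Hypothetical coefficients A_0, A_1, A_2 giving a negative excess property.
coeffs = [-4.0, 1.0, 0.5]
for x1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x1, redlich_kister(x1, coeffs))
```

The x1 · x2 prefactor forces the excess property to vanish for both pure components, which is why this form is convenient for deviation and excess quantities.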