Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, rather than of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach was proposed in the literature: in the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype coclassification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, it is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes, which hampers the detection of rare but true haplotypes. Here, to address this issue, we propose an alternative approach: in Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes identified in Stage 1. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared to the existing multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure.
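As a rough illustration of the Stage 1 idea, the sketch below fits a two-component binomial mixture (a "null" and a "risk" component) to per-genotype case counts with EM. The mixture form, the function name `em_risk_genotypes`, the input layout, and the starting partition (genotypes split by whether their case fraction exceeds the overall case fraction) are all assumptions made for illustration; the abstract does not specify the model or the data partition-based initialization, so this should not be read as the authors' procedure.

```python
import math

def binom_logpmf(x, n, p):
    """Log of the binomial probability of x cases among n carriers."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # keep log() finite
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(p) + (n - x) * math.log(1 - p))

def em_risk_genotypes(cases, totals, n_iter=200, tol=1e-8):
    """Fit a two-component binomial mixture (null vs. risk) by EM.

    cases[i] and totals[i] are hypothetical case and carrier counts for
    genotype i; every totals[i] is assumed to be positive.
    """
    k = len(cases)
    overall = sum(cases) / sum(totals)

    # Illustrative partition-based start (an assumption, not the paper's
    # scheme): split genotypes by case fraction relative to the overall one.
    high = [i for i in range(k) if cases[i] / totals[i] > overall]
    low = [i for i in range(k) if cases[i] / totals[i] <= overall]

    def pooled(idx, fallback):
        n = sum(totals[i] for i in idx)
        return sum(cases[i] for i in idx) / n if n else fallback

    p_null = pooled(low, overall)
    p_risk = pooled(high, min(0.99, overall + 0.1))
    weight = min(max(len(high) / k, 0.05), 0.95)

    prev_ll = -math.inf
    resp = [0.0] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each genotype is a risk genotype.
        ll = 0.0
        for i in range(k):
            a = math.log(weight) + binom_logpmf(cases[i], totals[i], p_risk)
            b = math.log(1 - weight) + binom_logpmf(cases[i], totals[i], p_null)
            m = max(a, b)
            log_mix = m + math.log(math.exp(a - m) + math.exp(b - m))
            resp[i] = math.exp(a - log_mix)
            ll += log_mix
        # M-step: re-estimate the mixing weight and both case proportions.
        weight = min(max(sum(resp) / k, 1e-9), 1 - 1e-9)
        risk_n = sum(r * n for r, n in zip(resp, totals))
        null_n = sum((1 - r) * n for r, n in zip(resp, totals))
        if risk_n > 0:
            p_risk = sum(r * x for r, x in zip(resp, cases)) / risk_n
        if null_n > 0:
            p_null = sum((1 - r) * x for r, x in zip(resp, cases)) / null_n
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weight, p_null, p_risk, resp
```

Calling `em_risk_genotypes([50, 48, 52, 90, 49], [100] * 5)` on these made-up counts returns the mixing weight, the two estimated case proportions, and one posterior risk probability per genotype; genotypes with a high posterior would be the candidates carried into a Stage 2 haplotype inference.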
Visual analytics has become an important approach for discovering patterns in big data. Visualization already struggles with the high dimensionality of data, and issues such as the concept hierarchy on each dimension add further difficulty and can make visualization a prohibitive task. A data cube offers multi-perspective aggregated views of large data sets and has important applications in business and many other areas. It has high dimensionality, concept hierarchies, and a vast number of cells, and it comes with special exploration operations such as roll-up, drill-down, slicing and dicing. All these issues make data cubes very difficult to explore visually. Most existing approaches visualize a data cube in 2D space and require preprocessing steps. In this paper, we propose a visu
This research aims to analyze and simulate real biochemical test data to uncover the relationships among the tests and how each of them affects the others. The data were acquired from a private Iraqi biochemical laboratory. However, these data have many dimensions, a high rate of null values, and a large number of patients. Several experiments were applied to these data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques such as Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB
In high-dimensional semiparametric regression, balancing accuracy and interpretability often requires combining dimension reduction with variable selection. This study introduces two novel methods for dimension reduction in additive partial linear models: (i) minimum average variance estimation (MAVE) combined with the adaptive least absolute shrinkage and selection operator (MAVE-ALASSO) and (ii) MAVE with smoothly clipped absolute deviation (MAVE-SCAD). These methods leverage the flexibility of MAVE for sufficient dimension reduction while incorporating adaptive penalties to ensure sparse and interpretable models. The performance of both methods is evaluated through simulations using the mean squared error and variable selection cri
It is an established fact that substantial amounts of oil usually remain in a reservoir after primary and secondary recovery processes. Therefore, there is an ongoing effort to sweep that remaining oil. Field optimization includes many techniques, and horizontal wells are one of the most motivating factors for it. The selection of new horizontal wells must be accompanied by the right selection of well locations. However, modeling horizontal well locations by trial and error is time-consuming. Therefore, an Artificial Neural Network (ANN) method has been employed, which helps to predict the optimum performance for proposed new well locations by incorporatin
A finite element study capable of predicting crack initiation and simulating crack propagation in human bone is presented. The material model is implemented in a MATLAB finite element package, which allows extension to any geometry and any load configuration. The fracture mechanics parameters for transverse and longitudinal crack propagation in human bone are analyzed. Fracture toughness as well as stress and strain contours are generated and thoroughly evaluated. Discussion is given on how this knowledge needs to be extended to allow prediction of whole bone fracture from external loading to aid the design of protective systems.
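For orientation only, the relation between applied stress, crack length and fracture toughness can be sketched with the textbook linear elastic fracture mechanics criterion K_I = Y·σ·√(πa), with crack initiation expected once K_I reaches the toughness K_Ic. This closed-form check is a much simpler technique than the finite element model described above and is not taken from the study; the function names, the geometry factor default and the numbers in the usage note are hypothetical.

```python
import math

def stress_intensity_mode_i(stress_mpa, crack_length_m, geometry_factor=1.0):
    """Mode-I stress intensity factor K_I = Y * sigma * sqrt(pi * a), in MPa*sqrt(m).

    geometry_factor (Y) depends on specimen and crack geometry; 1.0 is a
    placeholder assumption, not a value from the study.
    """
    return geometry_factor * stress_mpa * math.sqrt(math.pi * crack_length_m)

def crack_initiates(stress_mpa, crack_length_m, toughness_mpa_sqrt_m,
                    geometry_factor=1.0):
    """Return True when K_I reaches or exceeds the fracture toughness K_Ic."""
    k_i = stress_intensity_mode_i(stress_mpa, crack_length_m, geometry_factor)
    return k_i >= toughness_mpa_sqrt_m
```

With illustrative inputs, `crack_initiates(100.0, 0.001, 4.0)` compares the stress intensity for a 1 mm crack under 100 MPa against a made-up toughness of 4 MPa·√m. A finite element model replaces the single geometry factor with a full stress field, which is what permits arbitrary geometries and load configurations.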
In this research, a nonparametric technique is presented to estimate the time-varying coefficient functions for balanced longitudinal data, which are characterized by observations obtained from (n) independent subjects, each of which is measured repeatedly at a set of specific time points (m). Although the measurements are independent across subjects, they are mostly correlated within each subject. The applied technique is the Local Linear kernel (LLPK) technique. To avoid the problems of dimensionality and heavy computation, the two-step method has been used to estimate the coefficient functions by using the former technique. Since, the two-
To finalize any construction investment project, it is necessary to identify the most significant problems and obstacles that lead to project delays and stalling. Unexpected events and conflicts may disrupt project plans and affect project development. Because of the high initial investment costs of construction projects, crises can have an immediate impact, resulting in significant financial losses. The 2014 financial crisis was one of the most prominent crises that Iraq faced, which prompted the researcher to identify and evaluate those obstacles through this research and questionnaires, using Pareto theory to exclude factors that do not contribute to project lag. It was discovered that 28 o
This paper studies the variations of monthly tropospheric NO2 concentrations over three Iraqi cities: Baghdad (33.3° N, 44.4° E), Basrah (30.56° N, 47.8° E) and Erbil (36.3° N, 44.06° E). Monthly NO2 retrievals from the Ozone Monitoring Instrument (OMI) onboard the Aura satellite during the period from October 2004 to March 2013 have been used. The results show higher monthly and annual NO2 concentrations at Baghdad than at Basrah and Erbil, which may be attributed to its dense population and high economic activity. Over the whole period, Baghdad, Basrah and Erbil exhibited average NO2 values of (8.1±2.5), (3.7±1.3) and (3.3±1.7), respectively, in units of 10^15 molecules