Region-based association analysis has been proposed to capture the collective behavior of sets of variants by testing the association of each set, instead of individual variants, with the disease. Such an analysis typically involves a list of unphased multiple-locus genotypes with potentially sparse frequencies in cases and controls. To tackle the problem of the sparse distribution, a two-stage approach was proposed in the literature: in the first stage, haplotypes are computationally inferred from genotypes, followed by a haplotype coclassification. In the second stage, the association analysis is performed on the inferred haplotype groups. If a haplotype is unevenly distributed between the case and control samples, this haplotype is labeled as a risk haplotype. Unfortunately, the in-silico reconstruction of haplotypes might produce a proportion of false haplotypes that hamper the detection of rare but true haplotypes. Here, to address the issue, we propose an alternative approach: in Stage 1, we cluster genotypes instead of inferred haplotypes and estimate the risk genotypes based on a finite mixture model. In Stage 2, we infer risk haplotypes from the risk genotypes identified in the previous stage. To estimate the finite mixture model, we propose an EM algorithm with a novel data partition-based initialization. The performance of the proposed procedure is assessed by simulation studies and a real data analysis. Compared to the existing multiple Z-test procedure, we find that the power of genome-wide association studies can be increased by using the proposed procedure.
Titanium alloys are broadly used in the medical and aerospace sectors. However, they are categorized as hard-to-machine alloys owing to their high chemical reactivity and low thermal conductivity. This research aims to study the impact of the dry end-milling process with an uncoated tool on the produced surface roughness of Ti6Al4V alloy. It also seeks to develop a new hybrid neural model based on training a back propagation neural network (BPNN) with swarm optimization-gravitation search hybrid algorithms (PSO-GS
Nanopesticides are novel plant protection products offering numerous benefits. Because nanoparticles behave differently from dissolved chemicals, the environmental risks of these materials could differ from conventional pesticides. We used soil–earthworm systems to compare the fate and uptake of analytical‐grade bifenthrin to that of bifenthrin in traditional and nanoencapsulated formulations. Apparent sorption coefficients for bifenthrin were up to 3.8 times lower in the nano treatments than in the non‐nano treatments, whereas dissipation half‐lives of the nano treatments were up to 2 times longer. Earthworms in the nano treatments accumulated approximately 50% more b
In data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two of the well-known algorithms used in data mining classification are the Backpropagation Neural Network (BNN) and Naïve Bayesian (NB) classifiers. This paper investigates the performance of these two classification methods using the Car Evaluation dataset. Two models were built for both algorithms and the results were compared. Our experimental results indicated that the BNN classifier yields higher accuracy than the NB classifier, but it is less efficient because it is time-consuming and difficult to analyze due to its black-box implementation.
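As a rough illustration of the Naïve Bayesian side of this comparison, the sketch below trains a categorical classifier with Laplace smoothing. It is not the paper's implementation: the function names `train_nb` and `predict_nb` are hypothetical, and the six rows are made-up toy records that only borrow the attribute values (`buying`, `safety`) and class labels (`unacc`, `acc`) used in the Car Evaluation dataset.

```python
from collections import Counter, defaultdict
from math import log


def train_nb(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies (hypothetical helper)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    feature_values = [set() for _ in rows[0]]
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return {
        "class_counts": class_counts,
        "value_counts": value_counts,
        "feature_values": feature_values,
        "alpha": alpha,
        "total": len(labels),
    }


def predict_nb(model, row):
    """Return the class with the highest smoothed log-posterior."""
    alpha = model["alpha"]
    best_label, best_score = None, float("-inf")
    for label, count in model["class_counts"].items():
        score = log(count / model["total"])  # log prior
        for i, value in enumerate(row):
            seen = model["value_counts"][(i, label)][value]
            n_values = len(model["feature_values"][i])
            # Laplace smoothing keeps unseen values from zeroing the product
            score += log((seen + alpha) / (count + alpha * n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy records (assumed, not taken from the real dataset): (buying, safety)
rows = [("high", "low"), ("high", "low"), ("low", "high"),
        ("low", "high"), ("high", "high"), ("low", "low")]
labels = ["unacc", "unacc", "acc", "acc", "acc", "unacc"]

model = train_nb(rows, labels)
print(predict_nb(model, ("low", "high")))
print(predict_nb(model, ("high", "low")))
```

Training here is a single counting pass over the data, which is consistent with the abstract's remark that NB is the more efficient of the two classifiers, whereas a backpropagation network needs many iterative weight updates.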
Cloud-based Electronic Health Records (EHRs) have seen a substantial increase in usage in recent years, especially for remote patient monitoring. Researchers are interested in investigating the use of Healthcare 4.0 in smart cities. This involves using Internet of Things (IoT) devices and cloud computing to remotely access medical processes. Healthcare 4.0 focuses on the systematic gathering, merging, transmission, sharing, and retention of medical information at regular intervals. Protecting the confidential and private information of patients presents several challenges in terms of thwarting illegal intrusion by hackers. Therefore, it is essential to prioritize the protection of patient medical data that is stored, accessed, and shared on
Machine learning offers significant advantages for many problems in the oil and gas industry, especially for resolving complex challenges in reservoir characterization. Permeability is one of the most difficult petrophysical parameters to predict using conventional logging techniques. This study presents the workflow methodology alongside comprehensive models. The purpose of this study is to provide a more robust technique for predicting permeability; previous studies on the Bazirgan field have attempted to do so, but their estimates have been vague, and the methods they give are obsolete and do not make any concessions to the real or rigid in order to solve the permeability computation. To
Purpose – Cloud computing (CC) and its services have enabled the information centers of organizations to adapt their informatic and technological infrastructure and make it more suitable for developing flexible information systems that respond to the informational and knowledge needs of their users. In this context, cloud-data governance has become more complex and dynamic, requiring an in-depth understanding of the data management strategy at these centers in terms of: organizational structure and regulations, people, technology, process, roles and responsibilities. Therefore, our paper discusses these dimensions as challenges facing information centers with respect to their data governance and the impa
This paper presents a fast and robust approach to English text encryption and decryption based on the Pascal matrix. It describes the technique for encrypting Arabic text, English text, or both, shows the result of applying the method to plain text (the original message), and shows how intelligible plain text is transformed into unintelligible text in order to secure information from unauthorized access and from theft. An encryption scheme usually uses a pseudo-random encryption key generated by an algorithm. All of this is done using the Pascal matrix. Encryption and decryption are implemented in MATLAB, with Notepad++ used to write the input text.
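The abstract does not spell out how the Pascal matrix is applied, so the following is only a minimal sketch of one plausible scheme, written in Python rather than the paper's MATLAB. It assumes a Hill-cipher-style block transform: the text is encoded as UTF-8 bytes, split into blocks of length `n`, and each block is multiplied by the lower-triangular Pascal matrix modulo 256. The function names, the block size, the modulus, and the zero-byte padding are all assumptions. Decryption relies on the known fact that the inverse of the lower-triangular Pascal matrix has entries (-1)^(i-j) C(i, j), which are integers.

```python
from math import comb

MOD = 256  # assumed modulus: one byte per symbol


def pascal_lower(n):
    """Lower-triangular Pascal matrix: entry (i, j) is C(i, j)."""
    return [[comb(i, j) for j in range(n)] for i in range(n)]


def pascal_lower_inverse(n):
    """Its integer inverse: entry (i, j) is (-1)^(i-j) * C(i, j) for j <= i."""
    return [[(-1) ** (i - j) * comb(i, j) if j <= i else 0
             for j in range(n)] for i in range(n)]


def transform(data, matrix, n):
    """Multiply each length-n block of bytes by the matrix, modulo MOD."""
    out = bytearray()
    for start in range(0, len(data), n):
        block = data[start:start + n]
        for row in matrix:
            out.append(sum(a * b for a, b in zip(row, block)) % MOD)
    return bytes(out)


def encrypt(plaintext, n=4):
    data = plaintext.encode("utf-8")
    data += b"\x00" * ((-len(data)) % n)  # assumed zero-byte padding
    return transform(data, pascal_lower(n), n)


def decrypt(ciphertext, n=4):
    data = transform(ciphertext, pascal_lower_inverse(n), n)
    return data.rstrip(b"\x00").decode("utf-8")


message = "Hello, world"
cipher = encrypt(message)
print(cipher.hex())
print(decrypt(cipher))
```

Because the matrix here is fixed and public, this sketch offers no real secrecy; it only illustrates the invertible matrix transform. A practical scheme would need key material, such as the pseudo-random key the abstract mentions.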
Underwater Wireless Sensor Networks (UWSNs) have emerged as a promising technology for a wide range of ocean monitoring applications. The UWSNs suffer from unique challenges of the underwater environment, such as dynamic and sparse network topology, which can easily lead to a partitioned network. This results in hotspot formation and the absence of the routing path from the source to the destination. Therefore, to optimize the network lifetime and limit the possibility of hotspot formation along the data transmission path, the need to plan a traffic-aware protocol is raised. In this research, we propose a traffic-aware routing protocol called PG-RES, which is predicated on the ideas of Pressure Gradient and RESistance concept. The proposed
A two-time-step stochastic multi-variable, multi-site hydrological data forecasting model was developed and verified using a case study. The philosophy of this model is to use the cross-variable correlations, cross-site correlations, and two-step time-lag correlations simultaneously to estimate the parameters of the model, which are then modified using the mutation process of the genetic algorithm optimization model. The objective function to be minimized is the Akaike test value. The case study involves four variables and three sites. The variables are the monthly air temperature, humidity, precipitation, and evaporation; the sites are Sulaimania, Chwarta, and Penjwin, which are located in northern Iraq. The model performance was
Most medical datasets suffer from missing data, due to the expense of some tests or human error while recording the results of these tests. This issue affects the performance of machine learning models because the values of some features will be missing. Therefore, there is a need for specific methods for imputing these missing data. In this research, the salp swarm algorithm (SSA) is used for generating and imputing the missing values in the Pima Indian diabetes disease (PIDD) dataset; the proposed algorithm is called ISSA. The obtained results showed that the classification performance of three different classifiers which are support vector machine (SVM), K-nearest neighbour (KNN), and Naïve B