Abstract: DNA sequencing plays a vital role in the diagnosis and prognosis of disease and in assessing the risk of genetic disorders, particularly for asymptomatic individuals with a genetic predisposition. Such diagnostic approaches are integral to guiding health and lifestyle decisions and to giving families the foreknowledge needed to anticipate potential genetic abnormalities. The present study explores a define-by-run deep learning (DL) model optimized with the Tree-structured Parzen Estimator algorithm to enhance the precision of genetic diagnostic tools. Unlike conventional models, the define-by-run model improves accuracy through dynamic adaptation to data during learning and iterative optimization of critical hyperparameters, such as layer count, neuron count per layer, learning rate, and batch size. The model was evaluated on a dataset comprising DNA sequences from two distinct groups: patients diagnosed with breast cancer and a control group of healthy individuals. It achieved accuracy, precision, recall, F1-score, and area under the curve of 0.871, 0.872, 0.871, 0.872, and 0.95, respectively, outperforming previous models. These findings underscore the potential of DL techniques to improve the accuracy of disease diagnosis and prognosis through DNA sequencing, and suggest that DL could substantially advance personalized medicine, genetic counseling, and the diagnosis and management of genetic disorders.
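As a rough illustration of what "define-by-run" means for the hyperparameters listed above, the sketch below builds the search space while the objective function executes, so the number of per-layer parameters depends on the sampled layer count. Everything in it is an assumption made for illustration: the `Trial` class, the value ranges, and `toy_score` are hypothetical, plain random sampling stands in for the Tree-structured Parzen Estimator, and no model is trained.

```python
import random


class Trial:
    """Hypothetical trial object that records each sampled hyperparameter."""

    def __init__(self, rng):
        self.rng = rng
        self.params = {}

    def suggest_int(self, name, low, high):
        value = self.rng.randint(low, high)  # inclusive on both ends
        self.params[name] = value
        return value

    def suggest_float(self, name, low, high):
        value = self.rng.uniform(low, high)
        self.params[name] = value
        return value

    def suggest_categorical(self, name, choices):
        value = self.rng.choice(choices)
        self.params[name] = value
        return value


def toy_score(units, learning_rate, batch_size):
    """Placeholder for a validation metric; NOT a real training run."""
    return (-abs(sum(units) - 300) / 300
            - abs(learning_rate - 0.01)
            - abs(batch_size - 32) / 128)


def objective(trial):
    # Define-by-run: the search space is declared as this code executes,
    # so the sampled layer count decides how many unit counts are sampled.
    n_layers = trial.suggest_int("n_layers", 1, 4)
    units = [trial.suggest_int(f"units_l{i}", 16, 256) for i in range(n_layers)]
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-1)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64, 128])
    return toy_score(units, learning_rate, batch_size)


def search(n_trials=50, seed=0):
    """Random search (a stand-in for TPE) that keeps the best trial."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        trial = Trial(rng)
        score = objective(trial)
        if score > best_score:
            best_score, best_params = score, trial.params
    return best_score, best_params


if __name__ == "__main__":
    score, params = search()
    print(score, params)
```

A TPE-based optimizer would replace the uniform draws in `Trial` with samples biased toward regions that scored well in earlier trials; the shape of `objective` would stay the same.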
The precise classification of DNA sequences is pivotal in genomics and has significant implications for personalized medicine. The stakes are particularly high when classifying key genetic markers such as BRCA, related to breast cancer susceptibility; BRAF, associated with various malignancies; and KRAS, a recognized oncogene. Conventional machine learning techniques often require intricate feature engineering and may not capture the full spectrum of sequence dependencies. To address these limitations, this study employs an adapted U-Net architecture, originally designed for biomedical image segmentation, to classify DNA sequences. An attention mechanism was also tested along with the U-Net architecture to classify DNA sequences precisely.
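The abstract does not say how sequences are presented to the network. One common choice for convolutional models, assumed here purely for illustration, is a one-hot encoding padded or trimmed to a fixed length. The function names and the handling of ambiguous bases below are hypothetical and are not taken from the study.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as one 4-element row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def pad_or_trim(rows, length):
    """Force an encoded sequence to a fixed number of rows."""
    rows = rows[:length]
    padding = [[0, 0, 0, 0] for _ in range(length - len(rows))]
    return rows + padding


if __name__ == "__main__":
    encoded = pad_or_trim(one_hot("ACGTN"), 8)
    for row in encoded:
        print(row)
```

The resulting length-by-4 matrix can be treated like a one-dimensional image with four channels, which is what makes an architecture designed for image segmentation applicable to sequences.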
Molecular barcoding has been widely recognized over the past decade as a powerful tool for identifying organisms. The aim of this study is to use the molecular approach to identify diatoms from environmental DNA. The diatom specimens were taken from the Tigris River. Environmental DNA (eDNA) extraction and sequence analysis using the Next Generation Sequencing (NGS) method showed that the epipelic diatoms with the highest percentages included Achnanthidium minutissimum (Kützing) Czarnecki, 1994 (21.1%), Cocconeis placentula Ehrenberg, 1838 (21.3%) and Nitzschia palea (Kützing) W. Smith, 1856 (16.3%).
Five species of diatoms: Achnanthidiu
In this study, of 50 isolates from nosocomial infections in several Baghdad hospitals, only 13 (26%) were identified as Escherichia coli. The species was identified using selective media and morphological and biochemical tests, then confirmed by molecular methods. Antimicrobial resistance was subsequently tested by the Kirby-Bauer method. The blaTEM and blaCTX-M genes in the clinical isolates of E. coli were characterized by polymerase chain reaction (PCR) using specific primers. Only 4 (30.7%) isolates were positive for these genes. The positive genes from these four isolates were sequenced. The results showed that there was no vari
In gene regulation, transcription factors (TFs) play a key role. They help control the transfer of genetic information from DNA to messenger RNA during transcription. During this step, a transcription factor binds to a segment of the DNA sequence known as a transcription factor binding site (TFBS). TFs are regulatory molecules that bind to particular sequence motifs in the gene to induce or restrict targeted gene transcription. The goal of this study is to build a model that predicts whether or not a DNA binding site attaches to a certain TF. Two classification methods will be used: support vector machine (SVM) and kernel logistic regression (KLR). Moreover, the KLR algorithm depends on another regress
Intrusion-detection systems (IDSs) aim to detect attacks against computer systems and networks or, in general, against information systems. Most diseases in the human body are discovered through deoxyribonucleic acid (DNA) investigations. In this paper, the DNA sequence is utilized for intrusion detection by proposing an approach to detect attacks in a network. The proposed approach is a misuse intrusion detection method that consists of three stages. First, a DNA sequence is generated for network traffic taken from Knowledge Discovery and Data Mining (KDD Cup 99). Then, the Teiresias algorithm, which is used to detect sequences in human DNA and assist researchers in decoding the human genome, is used to discover the Shortest Tandem Repeat (S
The DNA fingerprinting method, which depends on a unique pattern, was employed in this study to detect the hydatid cyst of Echinococcus granulosus and to determine the genetic variation among its strains in different intermediate hosts (cows and sheep). The unique pattern represents the number of amplified bands and their molecular weights, with sequences specific to one sample that differ from those of the other samples. Five hydatid cyst samples were collected from cows and sheep, and the isolated DNA was analyzed genetically using the PCR technique and the random amplified polymorphic DNA (RAPD) reaction with four random primers. The results showed:
Because the diversity and characteristics of Trichoderma species are difficult to determine using morphological methods, molecular tools are crucial. This study used the random amplified polymorphic DNA (RAPD) technique to investigate the genetic diversity of Trichoderma and its sexual phase, Hypocrea, and to identify similarities and differences in the phylogenetic tree. Nine Iraqi Trichoderma strains (four strains of T. atroviride, one strain of Hypocrea lixii, two strains of T. gamsii and two strains of T. longibrachiatum) were examined in this research. The genomic DNA of each species was extracted and amplified with each of the fiv
To evaluate and improve the efficiency of photovoltaic solar modules connected to linear pipes for water supply, a three-dimensional numerical model is built and simulated using commercial software (Ansys-Fluent). The optimization applies the principles of the first and second laws of thermodynamics by employing the Response Surface Method (RSM). Various design parameters, including the coolant inlet velocity, tube diameter, panel dimensions, and solar radiation intensity, are systematically varied to investigate their impacts on energetic and exergetic efficiencies and destroyed exergy. The relationship between the design parameters and the system responses is validated through the development of a predictive model. Both single and mult
Proposing nonlinear models is one of the most important approaches in time series analysis, with wide potential for predicting physical, engineering, and economic phenomena by studying the characteristics of random disturbances in order to arrive at accurate predictions.
In this study, an autoregressive model with an exogenous variable was built using a threshold as the first method, with two proposed approaches used to determine the best cut point for forward predictability (forecasting) and for predictability within the time series (prediction) through the threshold point indicator. B-J seasonal models are used as a second method based on the principle of the two proposed approaches in dete