The precise classification of DNA sequences is pivotal in genomics, holding significant implications for personalized medicine. The stakes are particularly high when classifying key genetic markers such as BRAC, related to breast cancer susceptibility; BRAF, associated with various malignancies; and KRAS, a recognized oncogene. Conventional machine learning techniques often necessitate intricate feature engineering and may not capture the full spectrum of sequence dependencies. To address these limitations, this study employs an adapted U-Net architecture, originally designed for biomedical image segmentation, to classify DNA sequences. An attention mechanism was also tested along with the U-Net architecture to classify DNA sequences into the BRAC, BRAF, and KRAS categories. Our methodology includes rigorous data preprocessing, model training, and a multi-faceted evaluation approach. The adapted U-Net model achieved an overall accuracy of 0.96. Across the key markers BRAC, BRAF, and KRAS, precision ranged from 0.93 to 1.00, recall from 0.95 to 0.97, and the F1-score from 0.95 to 0.98. These empirical results support the architecture’s capability to capture local and global features in DNA sequences, affirming its applicability to critical, sequence-based bioinformatics challenges.
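The abstract above reports accuracy and per-class precision, recall, and F1-score for sequences fed to a neural network. The sketch below illustrates only those two surrounding steps, under stated assumptions: a four-letter one-hot encoding (a common input format, not confirmed as the authors' preprocessing) and the standard metric definitions. The function names and the toy labels are hypothetical, and no U-Net is implemented here.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"  # assumed alphabet; the paper's encoding is not specified


def one_hot_encode(sequence):
    """Map each base to a 4-element indicator vector.

    Bases outside ACGT (for example N) become all-zero vectors.
    """
    return [[1 if base == n else 0 for n in NUCLEOTIDES]
            for base in sequence.upper()]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for every label that occurs."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy labels for illustration only; these are not the study's data.
y_true = ["BRAC", "BRAC", "BRAF", "KRAS"]
y_pred = ["BRAC", "BRAF", "BRAF", "KRAS"]
encoded = one_hot_encode("ACGN")
overall = accuracy(y_true, y_pred)
report = per_class_metrics(y_true, y_pred)
```

Reporting the metrics per class, as the abstract does, exposes a weak class that a single accuracy figure would hide.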
A substantial portion of today’s multimedia data exists in the form of unstructured text. The unstructured nature of text makes it challenging to meet users’ information requirements. Text classification (TC) has been extensively employed in text mining to facilitate multimedia data processing, but accurately categorizing texts becomes difficult as non-informative features increase within the corpus. Several reviews on TC, encompassing various feature selection (FS) approaches to eliminate non-informative features, have been previously published. However, these reviews do not adequately cover recently explored approaches to solving TC problems with FS, such as optimization techniques.
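To make the idea of removing non-informative features concrete, here is a minimal sketch of one classical filter-style FS approach, chi-square term scoring for a single target class. It is an illustrative choice, not a method the review is known to cover or recommend; the function names and the four-document corpus are hypothetical.

```python
def chi_square_scores(documents, labels, target_label):
    """Score each term by the chi-square statistic of its 2x2 table.

    a: documents of the target class containing the term
    b: documents of other classes containing the term
    c: documents of the target class without the term
    d: documents of other classes without the term
    """
    n = len(documents)
    doc_terms = [set(doc.lower().split()) for doc in documents]
    vocabulary = set().union(*doc_terms)

    scores = {}
    for term in vocabulary:
        a = b = c = 0
        for terms, label in zip(doc_terms, labels):
            present = term in terms
            in_class = label == target_label
            if present and in_class:
                a += 1
            elif present:
                b += 1
            elif in_class:
                c += 1
        d = n - a - b - c
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term present in every document carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Toy corpus for illustration only.
documents = [
    "the team scored a goal",
    "the goal won the match",
    "the market fell today",
    "the stock market rose",
]
labels = ["sport", "sport", "finance", "finance"]
scores = chi_square_scores(documents, labels, target_label="sport")
selected = select_top_k(scores, 2)
```

In this toy corpus the word "the" appears in every document and scores zero, which is the kind of non-informative feature a filter would discard before classification.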
Most dinoflagellates have a resting cyst in their life cycle. This cyst develops under unfavorable environmental conditions. The conventional method for identifying dinoflagellate cysts in natural sediment requires morphological observation, isolation, germination, and cultivation of the cysts. PCR is a highly sensitive method for detecting dinoflagellate cysts in sediment. The aim of this study is to examine whether the COI primer could detect DNA of multispecies dinoflagellate cysts in the sediment from our sampling sites. Dinoflagellate cyst DNA was extracted from 16 sediment samples. PCR was run using the COI primer. The sequenced dinoflagellate cyst DNA was analyzed using BLAST. Results showed that there were two clades of dinoflag
The Sequencing Batch Reactor (SBR) system is a major component of municipal biological wastewater treatment and water reclamation, providing high-quality water that can be reused in restricted plants that require large quantities of water despite water scarcity. The research aims to investigate the performance of a pilot-plant SBR unit installed and operated under real operating conditions at Al-Rustamiya Wastewater Treatment Plant (WWTP), Baghdad, Iraq. Results showed that the BOD5/COD ratio of the raw wastewater was within the average value at 0.66, emphasizing the organic nature of the influent flow and hence its amenability to biological treatment. The results also confirmed that the treatment pro
Background: An insertion sequence is a short DNA sequence that encodes proteins implicated in transposition activity. Transposase catalyzes the enzymatic reaction that allows the insertion sequence to move.
Objective: To sequence the transposase gene, tnp, of IS1216V from S. aureus isolated from food and compare it with the sequence documented in the National Center for Biotechnology Information (NCBI).
Methods: Food samples of animal
Crime is considered an unlawful activity of any kind and is punishable by law. Crimes have an impact on a society's quality of life and economic development. With a large rise in crime globally, there is a need to analyze crime data to bring down the crime rate. Such analysis encourages the police and the public to take the required measures and restrict crime more effectively. The purpose of this research is to develop predictive models that can aid in crime pattern analysis and thus support the Boston department's crime prevention efforts. The geographical location factor has been adopted in our model because it is influential in several situations, whether it is traveling to a specific area or livin
The current study aims to apply methods of evaluating investment decisions to extract the highest value and reduce the economic and environmental costs of the health sector according to the strategy. To achieve the objectives of the study, the researcher relied on the deductive approach in the theoretical aspect by collecting sources and previous studies. He also used the applied practical approach, relying on the data and reports of Amir almuminin Hospital for the period (2017-2031) for the purpose of evaluating investment decisions in the hospital. The study reached a set of conclusions, the most important of which is: The failure to apply
In this paper, a new variable selection method is presented for selecting essential variables from large datasets. The new model is a modified version of the Elastic Net model. The modified Elastic Net variable selection model is summarized in an algorithm. It is applied to the leukemia dataset, which has 3051 variables (genes) and 72 samples. In practice, working with this kind of dataset is difficult due to its large size. The modified model is compared to some standard variable selection methods. The modified Elastic Net model has the best performance and achieves perfect classification. All the calculations that have been done for this paper are in