Hydrological processes are dynamic in nature, characterised by randomness and complex phenomena. The application of machine learning (ML) models to river flow forecasting has grown rapidly, owing to their capacity to simulate the complex phenomena associated with hydrological and environmental processes. Four different ML models were developed for forecasting river flow in a semiarid region of Iraq. The influence of data division on the ML modelling process was investigated. Three data division scenarios were inspected: 70%–30%, 80%–20%, and 90%–10%. Several statistical indicators were computed to verify the performance of the models. The results revealed the potential of the hybridized support vector regression model with a genetic algorithm (SVR-GA) over the other ML forecasting models for monthly river flow forecasting using the 90%–10% data division. In addition, it was found to improve the accuracy of forecasting high flow events. The architecture of the developed SVR-GA provides a robust learning process, owing to the ability of the GA optimizer to tune the internal parameters of the SVR model. This made it more efficient in forecasting stochastic river flow behaviour than the other developed hybrid models.
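As a rough illustration of the three data-division scenarios, the sketch below splits a time-ordered series chronologically and scores a forecast with two common error measures. The function names, the synthetic series, and the naive persistence baseline are assumptions made for illustration; they are not the study's models or data, and the abstract does not say which statistical indicators were used.

```python
import math


def chronological_split(series, train_percent):
    """Split a time-ordered series into train/test parts without shuffling."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


def rmse(observed, predicted):
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical monthly flow values for 10 years (synthetic, not real data).
flow = [50.0 + 30.0 * math.sin(2 * math.pi * m / 12) for m in range(120)]

for train_percent in (70, 80, 90):
    train, test = chronological_split(flow, train_percent)
    # Naive persistence baseline: each month is predicted by the previous month.
    baseline = [train[-1]] + test[:-1]
    print(
        f"{train_percent}%-{100 - train_percent}%: "
        f"train={len(train)} test={len(test)} "
        f"RMSE={rmse(test, baseline):.2f} MAE={mae(test, baseline):.2f}"
    )
```

Keeping the split chronological means the test period always follows the training period, which matches how a forecasting model would be used in practice.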
Thyroid disease is a common disease affecting millions worldwide. Early diagnosis and treatment of thyroid disease can help prevent more serious complications and improve long-term health outcomes. However, thyroid disease diagnosis can be challenging due to its variable symptoms and limited diagnostic tests. By processing enormous amounts of data and seeing trends that may not be immediately evident to human doctors, Machine Learning (ML) algorithms may be capable of increasing the accuracy with which thyroid disease is diagnosed. This study seeks to discover the most recent ML-based and data-driven developments and strategies for diagnosing thyroid disease while considering the challenges associated with imbalanced data in thyroid dise
The huge number of documents on the internet has created an urgent need for text classification (TC), which is used to organize these text documents. In this research paper, a new model based on the Extreme Learning Machine (ELM) is used. The proposed model consists of several phases: preprocessing, feature extraction, Multiple Linear Regression (MLR), and ELM. The basic idea of the proposed model is to calculate feature weights using MLR. These feature weights, together with the extracted features, are introduced as input to the ELM, producing a weighted Extreme Learning Machine (WELM). The results showed that the proposed WELM performs very competitively compared to the ELM.
Permeability estimation is a vital step in reservoir engineering due to its effect on reservoir characterization, perforation planning, and the economic efficiency of reservoirs. Core and well-logging data are the main sources for measuring and calculating permeability, respectively. There are multiple methods to predict permeability, such as classic, empirical, and geostatistical methods. In this research, two statistical approaches have been applied and compared for permeability prediction: Multiple Linear Regression and Random Forest, given the (M) reservoir interval in the (BH) Oil Field in the northern part of Iraq. The dataset was separated into two subsets, Training and Testing, in order to cross-validate the accuracy
Statistical learning theory serves as the foundational bedrock of machine learning (ML), which in turn represents the backbone of artificial intelligence, ushering in innovative solutions for real-world challenges. Its origins can be traced to the point where statistics and computing meet, evolving into a distinct scientific discipline. Machine learning can be distinguished by its fundamental branches, encompassing supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Among these, supervised learning takes center stage, divided into two fundamental forms: classification and regression. Regression is tailored for continuous outcomes, while classification specializes in c
Chemical pollution is a very important issue that affects the health of society and of future generations. Consequently, it must be studied in order to find suitable models and descriptions for predicting its behaviour in the coming years. Chemical pollution data in Iraq are wide in scope and come from many sources and kinds, which makes them Big Data that need to be studied using novel statistical methods. The research aims to use a proposed nonparametric (NP) procedure to develop an (OCMT) test procedure to estimate the parameters of a linear regression model with a large amount of data (Big Data), which comprises many indicators associated with chemi
Time series analysis is the statistical approach used to analyze a series of data ordered in time. It is among the most popular statistical methods for forecasting and is widely used in statistical and economic applications. The wavelet transform is a powerful mathematical technique that converts an analyzed signal into a time-frequency representation, providing signal information in both the time domain and the frequency domain. The aims of this study are to propose a wavelet function derived from the quotient of two different Fibonacci coefficient polynomials, and to compare ARIMA with wavelet-ARIMA. Time series data for daily wind speed are used for this study. From the obtained results, the
Diabetes is an increasingly common chronic disease, affecting millions of people around the world. Its diagnosis, prediction, proper treatment, and management are essential. Machine learning-based prediction techniques for diabetes data analysis can help in the early detection and prediction of the disease and its consequences, such as hypo/hyperglycemia. In this paper, we explored a diabetes dataset collected from the medical records of one thousand Iraqi patients. We applied three classifiers: the multilayer perceptron, KNN, and Random Forest. We conducted two experiments. The first experiment used all 12 features of the dataset; the Random Forest outperformed the others with 98.8% accuracy. The second experiment used only five att
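To make the classification setup concrete, here is a minimal sketch of one of the three named classifiers, k-nearest neighbours (KNN), together with the accuracy measure quoted in the abstract. The two features, their values, and the labels are hypothetical placeholders, not the study's dataset or feature set.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the majority label among the k nearest training rows."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical rows: (feature_1, feature_2) with labels "Y" (diabetic) / "N".
train_X = [(5.0, 22.0), (5.2, 24.0), (5.4, 23.0),
           (8.5, 31.0), (9.0, 29.0), (7.8, 33.0)]
train_y = ["N", "N", "N", "Y", "Y", "Y"]

test_X = [(5.1, 23.0), (8.8, 30.0)]
test_y = ["N", "Y"]

predictions = [knn_predict(train_X, train_y, row) for row in test_X]
print(predictions, accuracy(test_y, predictions))
```

In practice, features measured on different scales are usually standardized before computing distances, since otherwise the feature with the largest range dominates the neighbour search.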
Over the last few decades, many improvements in computer technology, software programming, and application development have been adopted by various engineering disciplines. These developments focus mainly on artificial intelligence techniques. Accordingly, a number of definitions are provided that address the concept of artificial intelligence from different viewpoints. This paper presents current applications of artificial intelligence (AI) that facilitate cost management in civil engineering projects. A review of artificial intelligence in its specific sub-branches is provided. These branches or techniques contributed to the creation of a sizable group of models s
This paper presents the first-order, one-variable grey model GM(1,1), which is the basis of grey system theory. The research deals with the properties of the grey model and a set of methods for estimating the parameters of GM(1,1): the least squares method (LS), the weighted least squares method (WLS), the total least squares method (TLS), and the gradient descent method (DS). These methods were compared using two criteria: mean square error (MSE) and mean absolute percentage error (MAPE). After the comparison by simulation, the best method was applied to real data representing the consumption rate of two types of oil, heavy fuel oil (HFO) and diesel fuel (D.O), and several tests were applied to
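The least squares (LS) estimation of GM(1,1) mentioned above can be sketched as follows, using the standard textbook formulation: accumulate the series, form the background values, and regress the original values on them to obtain the development coefficient a and the grey input b. The function names and the sample series are assumptions for illustration; this is not the paper's code or data, and the WLS, TLS, and gradient descent variants are not shown.

```python
import math


def fit_gm11(x0):
    """Estimate (a, b) in x0[k] + a*z[k] = b by ordinary least squares."""
    if len(x0) < 3:
        raise ValueError("need at least 3 observations")
    # Accumulated series x1 (1-AGO) and background values z (adjacent means).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Simple linear regression y = slope*z + b, where slope = -a.
    n = len(z)
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(v * v for v in z)
    s_zy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (n * s_zy - s_z * s_y) / (n * s_zz - s_z ** 2)
    return -slope, (s_y - slope * s_z) / n


def predict_gm11(first_value, a, b, n_points):
    """Fitted values of the original series (assumes a != 0)."""
    x1_hat = [(first_value - b / a) * math.exp(-a * k) + b / a
              for k in range(n_points)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n_points)]


def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((x - f) / x) for x, f in zip(actual, fitted)) / len(actual)


# Hypothetical consumption series growing by 10% per period (not real data).
data = [100.0, 110.0, 121.0, 133.1]
a, b = fit_gm11(data)
fitted = predict_gm11(data[0], a, b, len(data))
print(f"a={a:.4f} b={b:.4f} MAPE={mape(data, fitted):.3f}%")
```

A negative a corresponds to a growing series. The same `mape` function can score any of the alternative estimators, which is how a comparison like the one described would typically be set up.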