Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate datasets for training DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically involves human annotators with extensive background knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework is typically fed a significant amount of labeled data to learn representations automatically. Ultimately, a larger amount of data would generate a better DL model, although its performance is also application dependent. This issue is the main barrier that leads many applications to dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that overcome three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on the data acquisition needed prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity and proposes several alternatives for generating more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
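As a rough illustration of how oversampling can address an imbalanced dataset, the sketch below implements the classic SMOTE interpolation step: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is a minimal sketch under stated assumptions, not the DeepSMOTE method named in the abstract, which applies the same idea in a learned latent space. The function name `smote_oversample`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_oversample(
    minority: List[Point], n_new: int, k: int = 3, seed: int = 0
) -> List[List[float]]:
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration: classic SMOTE in raw feature
    space, not the DeepSMOTE variant mentioned in the abstract.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class with two features (assumed data, for illustration only).
minority_samples = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_oversample(minority_samples, n_new=6)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies inside the region already spanned by the minority class, which is why this kind of oversampling adds density rather than genuinely new information.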
Precise forecasting of pore pressure is crucial for efficiently planning and drilling oil and gas wells. It reduces expenses and saves time while preventing drilling complications. Since direct measurement of pore pressure in wellbores is costly and time-intensive, the ability to estimate it using empirical or machine learning models is beneficial. The present study aims to predict pore pressure using an artificial neural network. The network is built and tested on data from five oil fields and several formations. The model is built using a measured dataset consisting of 77 pore pressure data points obtained from the modular formation dynamics tester. The input variables
Generally, statistical methods are used in various fields of science, especially in research, where statistical analysis is carried out by adopting several techniques according to the nature of the study and its objectives. One of these techniques is building statistical models, which is done through regression models. This technique is considered one of the most important statistical methods for studying the relationship between a dependent variable, also called the response variable, and the other variables, called covariates. This research describes the estimation of the partial linear regression model, as well as the estimation of the “missing at random” (MAR) values. Regarding the
In recent years, Global Navigation Satellite System (GNSS) technology has been frequently employed for monitoring the deformation and movement of the Earth's crust. Such applications necessitate high positional accuracy, which can be achieved by processing GPS/GNSS data with scientific software such as BERNESE, GAMIT, and GIPSY-OASIS. Nevertheless, these scientific software packages are sophisticated and have not been published as free open-source software. Therefore, this study has been conducted to evaluate an alternative solution, GNSS online processing services, which may provide this capability free of charge. In this study, eight years of GNSS raw data for the TEHN station, which is located in Iran, have been downloaded from the UNAVCO website
The physical and elastic characteristics of rocks determine rock strength in general. Rock strength is frequently assessed using porosity well logs such as neutron and sonic logs. The essential criteria for estimating rock mechanical parameters in petroleum engineering research are uniaxial compressive strength and elastic modulus. Indirect estimation using well-log data is necessary to measure these variables. This study attempts to create a single regression model that can accurately forecast rock mechanical characteristics for the Harth Carbonate Formation in the Fauqi oil field. According to the findings of this study, petrophysical parameters are reliable indexes for determining rock mechanical properties having good performance p
Unconfined Compressive Strength (UCS) is considered the most important rock strength parameter affecting the rock failure criteria. Various studies have developed rock strength correlations for specific lithologies to estimate accurate values without a core. Previous analyses did not account for the formation's numerous lithologies and interbedded layers. The main aim of the present study is to select a suitable correlation to predict the UCS for the hole depth of the formation without separating the lithology. The second aim is to identify an adequate input parameter among a set of wireline logs to determine the UCS, using data from three wells along ten formations (Tanuma, Khasib, Mishrif, Rumaila, Ahmady, Maudud, Nahr Um
In this work, pure and doped CdO thin films with different concentrations of V2O5, x = (0.0, 0.05, 0.1) wt.%, have been prepared on glass substrates at room temperature using the Pulsed Laser Deposition (PLD) technique. A focused Nd:YAG laser beam at 800 mJ, with a frequency second radiation at 1064 nm (pulse width 9 ns) and a repetition frequency of 6 Hz, was incident on the target surface for 500 laser pulses. First, pellets of (CdO)1-x(V2O5)x with different V2O5 contents were sintered at a temperature of 773 K for one hour. Then films of (CdO)1-x(V2O5)x were prepared. The structure of the thin films was examined using X-ray diffraction (XRD) analysis. The Hall effect was measured in order to determine the type of conductivity. Finally the solar cell and the effici
Iraq is located near the northern tip of the Arabian plate, which is advancing northwards relative to the Eurasian plate, and is, predictably, a tectonically active country. Seismic activity in Iraq increased significantly during the last decade. Consequently, structural and geotechnical engineers have been giving increasing attention to the design of buildings for earthquake resistance. Dynamic properties play a vital role in the design of structures subjected to seismic load. The main objective of this study is to prepare a database of the dynamic properties of different soils in seismically active zones in Iraq using the results of cross-hole and down-hole tests. From the database collected, it has been observed that the average ve
In this research, several estimators of the hazard function are introduced. These estimators use one of the nonparametric methods, namely the kernel function, for censored data with varying bandwidth and boundary kernels. Two types of bandwidth are used: local bandwidth and global bandwidth. Moreover, four types of boundary kernel are used, namely Rectangle, Epanechnikov, Biquadratic, and Triquadratic, and the proposed function was employed with all kernel functions. Two different simulation techniques are also used in two experiments to compare these estimators. In most cases, the results show that the local bandwidth is the best for all the