Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have too little data to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive domain knowledge, and this annotation process is costly, time-consuming, and error-prone. Every DL framework typically needs a significant amount of labeled data to learn representations automatically. In general, more data yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that address three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed to generate more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
In this research, several estimators are introduced. These estimators are closely related to the hazard function and use one of the nonparametric methods, namely the kernel function, for censored data with varying bandwidth and boundary kernel. Two types of bandwidth are used: local bandwidth and global bandwidth. Moreover, four types of boundary kernel are used, namely Rectangle, Epanechnikov, Biquadratic and Triquadratic, and the proposed function was employed with all kernel functions. Two different simulation techniques are also used in two experiments to compare these estimators. In most cases, the results show that the local bandwidth is the best for all the types of the kernel boundary func
This study aims to estimate the accuracy of digital elevation models (DEMs) created from open-source Google Earth data and to compare them with the widely available DEM datasets, Shuttle Radar Topography Mission (SRTM), version 3, and Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM), version 2. The GPS technique is used in this study to produce a digital elevation raster with a high level of accuracy, which serves as the reference raster against which the DEM datasets are compared. Baghdad University, Al Jadriya campus, is selected as the study area. In addition, 151 reference points were created within the study area to evaluate the results based on the RMS values. Furthermore, th
With the development of communication technologies for mobile devices and electronic communications, and the world's move toward e-government, e-commerce and e-banking, it has become necessary to protect these activities from intrusion or misuse, so it is important to design powerful and efficient systems for this purpose. In this paper, several varieties of the immune negative selection algorithm are used for misuse-type network intrusion detection: the negative selection algorithm with real values, the negative selection algorithm with fixed-radius detectors, and the negative selection algorithm with variable-sized detectors, where the algorithm generates a set of detectors to distinguish the self-samples. Practica
This research was carried out to image the subsurface structure representing the Tertiary period in the Galabat Field, northeast of Iraq, using 2D seismic survey measurements. Synthetic seismograms of the Galabat-3 well were generated in order to identify and pick the reflectors in seismic sections. Structural images were drawn in the time domain and then converted to the depth domain by using average velocities. Structurally, the seismic sections show that these reflectors are affected by two reverse faults that cut the Jeribe Formation and the layers below, with the density of reverse faults increasing in the northern division. The structural maps show Galabat field, which consists of longitudinal Asymmetrical narr
Mersing is one of the places that have the potential for wind power development in Malaysia. Researchers often suggest it as an ideal place for generating electricity from wind power. However, before a location is chosen, several factors need to be considered. By analyzing the location ahead of time, resource waste can be avoided and maximum profitability for the various parties can be realized. This study focuses on identifying the distribution of the wind speed of Mersing and determining the optimal average wind speed. This study is critical because the wind speed data for any region has its own distribution, which changes daily and by season. Moreover, no determination has been made regarding selecting the average wind speed used for w
Hydrocarbon production might cause changes in dynamic reservoir properties. Thus, the mechanical stability of a formation under different drilling or production conditions is a very important issue, and the basic mechanical properties of the formation should be determined. There is considerable evidence, gathered from laboratory measurements in the field of Rock Mechanics, showing a good correlation between intrinsic rock strength and the dynamic elastic constant determined from sonic-velocity and density measurements. The values of the mechanical properties determined from log data, such as the dynamic elastic constants derived from the measurement of the elastic wave velocities in the material, should be more accurate t
The paired sample t-test for testing the difference between two means in paired data is not robust against violation of the normality assumption. In this paper, some alternative robust tests are suggested that use the bootstrap method, including one that combines the bootstrap method with the W.M test. Monte Carlo simulation experiments were employed to study the performance of the test statistics of each of these three tests based on type I error rates and the power of the test statistics. The three tests were applied to different sample sizes generated from three distributions: the Bivariate normal distribution, the Bivariate contaminated normal distribution, and the Bivariate Exponential distribution.
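To make the idea of a bootstrap alternative to the paired t-test concrete, the following is a minimal sketch and not the authors' procedure. It assumes a plain bootstrap of the mean paired difference, with the differences centred so that resampling mimics the null hypothesis of zero mean difference. The function name `bootstrap_paired_test`, the default `n_boot=2000`, and the fixed seed are illustrative assumptions, and the combination with the W.M test mentioned above is not covered.

```python
import random
import statistics


def bootstrap_paired_test(x, y, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for H0: mean(x - y) == 0 (paired data).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    if len(x) < 2:
        raise ValueError("need at least two pairs")

    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    diffs = [a - b for a, b in zip(x, y)]
    observed = statistics.fmean(diffs)

    # Centre the differences so that resampling mimics the null hypothesis.
    centred = [d - observed for d in diffs]

    n = len(centred)
    extreme = 0
    for _ in range(n_boot):
        # Resample the centred differences with replacement.
        resample = rng.choices(centred, k=n)
        if abs(statistics.fmean(resample)) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_boot + 1)
```

Calling `bootstrap_paired_test(before, after)` on two equal-length lists returns a p-value between 0 and 1. A Monte Carlo study of the kind described above would repeat this call over many simulated samples and record how often the p-value falls below the chosen significance level.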
This paper focuses on studying the variations of monthly tropospheric NO2 concentrations over three Iraqi cities: Baghdad (33.3° N, 44.4° E), Basrah (30.56° N, 47.8° E) and Erbil (36.3° N, 44.06° E). Monthly NO2 retrievals from the Ozone Monitoring Instrument (OMI) onboard the Aura satellite during the period from October 2004 to March 2013 have been used. The results show higher monthly and annual NO2 concentrations at Baghdad than at Basrah and Erbil, which may be attributed to its dense population and high economic activity. During the whole period, Baghdad, Basrah and Erbil exhibited average NO2 values of (8.1±2.5), (3.7±1.3) and (3.3±1.7) in units of 10^15 molecules