Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate datasets for training DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically involves human annotators with extensive background knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework is usually fed a significant amount of labeled data so that it can learn representations automatically. In general, more data yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that address three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed for generating more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Abstract
This research compared two methods for estimating the four parameters of the compound exponential Weibull-Poisson distribution: the maximum likelihood method and the Downhill Simplex algorithm. Two data cases were considered: the first assumed the original (uncontaminated) data, while the second assumed data contamination. Simulation experiments were conducted for different sample sizes, initial parameter values, and levels of contamination. The Downhill Simplex algorithm was found to be the best method for estimating the parameters, the probability function, and the reliability function of the compound distribution for both natural and contaminated data.
Sawa Lake is one of the unique lakes in Iraq. It is located in the southwestern part of the country. It is a closed lake: no surface water source feeds it, and it is fed instead by groundwater from the Dammam Basin. During the past ten years, the lake has undergone many changes, which led to a decrease in water levels and drew attention to the study of their causes. Many studies have examined the state of the lake. This research used remote sensing images from Landsat 8 OLI to monitor the changes during 2020-2021 by applying the NDWI equation to extract the water area from image data. The results of the areas were obtained from a special report by Normalized Dif
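As a rough illustration of the NDWI step mentioned above, the sketch below computes the index per pixel as (Green - NIR) / (Green + NIR) and sums the area of pixels classified as water. It assumes the common McFeeters form of NDWI, Landsat 8 OLI band 3 (green) and band 5 (near infrared) at 30 m resolution, and a threshold of 0; the function names and the sample reflectance values are hypothetical and are not taken from the study.

```python
def ndwi(green, nir):
    """NDWI for one pixel: (Green - NIR) / (Green + NIR)."""
    total = green + nir
    if total == 0:
        return 0.0  # avoid division by zero on empty or no-data pixels
    return (green - nir) / total


def water_area_km2(green_band, nir_band, threshold=0.0, pixel_size_m=30.0):
    """Count pixels with NDWI above the threshold and convert to km^2.

    The threshold of 0 and the 30 m pixel size are assumptions.
    """
    water_pixels = 0
    for green_row, nir_row in zip(green_band, nir_band):
        for g, n in zip(green_row, nir_row):
            if ndwi(g, n) > threshold:
                water_pixels += 1
    return water_pixels * pixel_size_m ** 2 / 1e6


# Hypothetical 2x2 reflectance grids (green = band 3, NIR = band 5).
green = [[0.30, 0.10], [0.20, 0.05]]
nir = [[0.10, 0.30], [0.05, 0.20]]
print(water_area_km2(green, nir))
```

Comparing the area returned for scenes from different dates gives the kind of change estimate the abstract describes; in practice the threshold usually needs tuning per scene.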
In many oil-recovery systems, relative permeabilities (kr) are essential flow factors that affect fluid distribution and output from petroleum resources. Traditionally, obtaining these crucial reservoir properties requires taking rock samples from the reservoir and performing suitable laboratory studies. Although kr is a function of fluid saturation, it is now well established that pore shape and distribution, absolute permeability, wettability, interfacial tension (IFT), and saturation history all influence kr values. These rock/fluid characteristics vary greatly from one reservoir region to the next, and it would be impossible to make kr measurements in all of them. The unsteady-state approach was used to calculate the relat
In the last few decades, growing interest has been shown in the development of new solar selective coatings based on transition metal nitrides and/or oxynitrides for solar absorbing applications. Solar thermal collectors are widely considered to be the most effective means of converting and harvesting solar radiation. In this investigation, Cu/TiON/CrO2 multilayered solar selective absorber coatings were deposited onto Al substrates using the dip-coating process followed by annealing at 400, 450, 500, 550, and 600 °C. The XRD analysis showed excellent crystalline quality for the prepared thin films along with enhanced surface features, as shown by FESEM images, with grains in the range of (27–81) nm. The optical in
Solar radiation, the main source of energy, plays an important role in the energy balance of the Earth-atmosphere system. It is also a main factor in all applications that use solar energy as a renewable energy source. The purpose of this research is to study the monthly average changes in solar radiation for the period from 1985 to 1989 by using satellite Antenna Alignment from NASA. The results show that the monthly average radiation changes from one year to another because of changes in the components of the atmosphere (gases, clouds, and aerosols). To support this conclusion, we compared the results with the monthly average radiation under a clear atmosphere, where the change was slig
This research introduced the derivation of mathematical equations to calculate the Cartesian and geographical coordinates of a site located far from the observer position by using GPS data. The geographical coordinates (ϕobs., λobs., hobs.) of the observer position were transformed to Cartesian coordinates (Xobs., Yobs., Zobs.) of the same position. The Cartesian coordinates of the unknown position were then calculated from the derived equations and transformed to the geographical coordinates (ϕunk., λunk.) of that position.
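The transformation between geographical and Cartesian coordinates described above can be sketched as follows. This is a minimal illustration that assumes the WGS84 ellipsoid and the standard Earth-centred, Earth-fixed (ECEF) formulas; the abstract does not state which datum or derivation the authors used, and the function names and the sample observer position are hypothetical.

```python
import math

# WGS84 ellipsoid constants (assumed datum; the abstract does not name one).
WGS84_A = 6378137.0                    # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_cartesian(lat_deg, lon_deg, h):
    """Convert latitude, longitude (degrees) and height (m) to ECEF X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z


def cartesian_to_geodetic(x, y, z, iterations=10):
    """Invert the transform by fixed-point iteration (not valid at the poles)."""
    lon = math.atan2(y, x)
    p = math.hypot(x, y)                        # distance from the rotation axis
    lat = math.atan2(z, p * (1.0 - WGS84_E2))   # initial guess assuming h = 0
    h = 0.0
    for _ in range(iterations):
        n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - WGS84_E2 * n / (n + h)))
    return math.degrees(lat), math.degrees(lon), h


# Hypothetical observer position, used only to show a round trip.
x_obs, y_obs, z_obs = geodetic_to_cartesian(33.3, 44.4, 34.0)
print(cartesian_to_geodetic(x_obs, y_obs, z_obs))
```

Once the Cartesian coordinates of the unknown site have been obtained by whatever equations are derived, `cartesian_to_geodetic` plays the role of the final step that returns its latitude and longitude.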
This study aims to formulate an alternative to formalin for preserving fish as study specimens for long periods. The main reasons for seeking an alternative are to eliminate the negative effects of formalin on those who work with it and to better preserve the bodies of fish. Three new solutions were therefore proposed to replace formalin. Formalin may still make up a small part of these solutions to give better results and longer preservation of specimens. All solutions prepared in this study were acidic, like formalin. Two solutions succeeded in replacing formalin in preserving fish
The current study was carried out in the fields of the Horticulture Department, College of Agricultural Engineering Science, University of Baghdad, Al-Jadiriyah, during the spring season of 2016-2017 to study the effect of mycorrhizal inoculation, foliar application of bio-stimulators, and their interaction on the growth characteristics of (local okra ptera). A factorial experiment (2 in a randomized complete block design (RCBD) was used, and the experiment included 12 treatments distributed over three replicates. The treatments included the control (C), mycorrhizae (M), Biozyme (B) (B1 2 cm3.L-1), (B2 4 cm3.L-1), Phosphalas (P) (P 2 cm3.L-1), (M + B1), (M + B2), (P +