Data scarcity is a major challenge when training deep learning (DL) models. DL demands large amounts of data to achieve strong performance, yet many applications have too little data to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive domain knowledge, and this annotation process is costly, time-consuming, and error-prone. DL frameworks typically rely on a significant amount of labeled data to learn representations automatically. In general, more data yields a better DL model, although the gain is application dependent. Data scarcity is the main reason many applications forgo DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, and for each application several alternatives are proposed to generate more data: Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Abstract
The study aimed to prepare a practical guide to procedures for auditing the strategies of municipal institutions for achieving sustainable development. It adopts the idea of an audit matrix, through which a report classified by the dimensions of sustainable development is prepared. A specialized audit program was prepared for auditing these strategies, and the results of applying each item of the program were entered into the audit matrix, which was prepared for the purpose of determining the impact of each observation and linkin
In addition to the primary treatment, biological treatment is used to reduce inorganic and organic components in the wastewater. The separation of biomass from treated wastewater is usually important to meet the effluent disposal requirements, so the MBBR system has become one of the most important modern technologies; it uses plastic carriers to carry biomass within the wastewater and operates as a pure biofilm process at low concentrations of suspended solids. However, biological treatment has been developed further by combining the activated sludge process with MBBR. Turbo4bio was established in the early 1990s as a sustainable and cost-effective solution for wastewater treatment plants; it runs on minimal sludge and is easy to maintain. This
A numerical investigation has been carried out on the heat transfer and friction factor characteristics of copper-water nanofluid flow in a tube under constant heat flux with a new vortex generator configuration, using Computational Fluid Dynamics (CFD) simulation. Two types of swirl flow generator: Classical twisted tape (CTT) and Parabolic-cut twisted tape (PCT) with different twist ratios (= 2.93, 3.91 and 4.89) and different cut depths (= 0.5, 1.0 and 1.5 cm) with 2% and 4% volume concentration
This paper presents ABAQUS simulations of fully encased composite columns, aiming to examine the behavior of a composite column system under different load conditions, namely concentric, eccentric with 25 mm eccentricity, and flexural loading. The numerical results are validated with the experimental results obtained for columns subjected to static loads. A new loading condition with a 50 mm eccentricity is simulated to obtain additional data points for constructing the interaction diagram of load-moment curves, in an attempt to investigate the load-moment behavior for a reference column with a steel I-section and a column with a GFRP I-section. The result comparison shows that the experimental data align closely with the simulation
Controlling public expenditures is one of the main objectives of the public budget. The public budget often suffers from a deficit, whether in developed or developing countries, because expenditures are usually greater than the revenues generated. This requires the existence of financial rules that are adhered to by the government, which in turn leads to discipline. Fiscal policy leads to a reduction in the obligations incumbent on the government. Adhering to the financial rules would correct the course of fiscal policy in Iraq, with the need to direct oil revenues in the years of financial abundance when global oil prices rise to sovereign funds similar to other rentier countries, which contributes to maintaining the stabi