Data scarcity is a major challenge when training deep learning (DL) models, which demand large amounts of data to achieve exceptional performance. Unfortunately, many applications have too little data to train DL frameworks. Providing labeled data usually requires manual annotation by human experts with broad domain knowledge, a process that is costly, time-consuming, and error-prone. Every DL framework is fed a significant amount of labeled data from which it automatically learns representations; in general, more data yields a better DL model, although performance is also application dependent. This issue is the main barrier preventing many applications from adopting DL, since having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are reviewed, including Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by practical tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity; for each, several alternatives are proposed to generate more data, covering Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors' knowledge, this is the first review offering a comprehensive overview of strategies to tackle data scarcity in DL.
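Among the surveyed remedies for imbalanced datasets, DeepSMOTE performs SMOTE-style interpolation in the latent space of a trained autoencoder. The sketch below shows only the underlying SMOTE step in plain feature space, with numpy; the function name, neighbour count, and data are illustrative, not the survey's implementation.

```python
import numpy as np

def smote(minority, n_synthetic, k=3, rng=None):
    """Generate synthetic minority-class samples by interpolating between
    each chosen point and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    X = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                # exclude self-matches
    neighbours = np.argsort(d, axis=1)[:, :k]  # k nearest per point
    base = rng.integers(0, len(X), size=n_synthetic)
    nb = neighbours[base, rng.integers(0, k, size=n_synthetic)]
    gap = rng.random((n_synthetic, 1))         # interpolation factor in [0, 1)
    return X[base] + gap * (X[nb] - X[base])

# Usage: 20 minority points in 2-D, oversampled with 10 synthetic ones.
minority = np.random.default_rng(0).normal(size=(20, 2))
synthetic = smote(minority, n_synthetic=10, rng=1)
print(synthetic.shape)  # (10, 2)
```

Because each synthetic point is a convex combination of two real minority samples, the oversampled set never leaves the region spanned by the original class.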
Vol. 6, Issue 1 (2025)
One of the most serious health disasters in recent memory is the COVID-19 pandemic. Several restriction rules have been enforced to reduce the spread of the virus. Properly fitted masks can help prevent the virus from spreading from the wearer to others. Masks alone will not protect against COVID-19; they must be used in conjunction with physical distancing and avoidance of direct contact. The fast spread of this disease, as well as the growing usage of prevention measures, underscores the critical need for a shift in biometrics-based authentication schemes. Biometric systems are affected differently depending on whether these measures are used as preventive techniques under COVID-19 pandemic rules. This study provides an …
In a wide range of chemical, petrochemical, and energy processes, it is not possible to manage without slurry bubble column reactors. In this investigation, the time-averaged local gas holdup was recorded for three height-to-diameter (H/D) ratios, 3, 4, and 5, in an 18-inch-diameter slurry bubble column. An air-water-glass beads system was used with superficial gas velocities up to 0.24 m/s. The gas holdup was measured using a 4-tip optical fiber probe technique. The results show that the axial gas holdup increases almost linearly with the superficial gas velocity up to 0.08 m/s and levels off with a further increase in velocity. A comparison of the present data with those reported for other slurry bubble columns having diameters larger than …
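With an optical fiber probe, each tip reflects light back when it sits in the gas phase, so the time-averaged local gas holdup is simply the fraction of time the thresholded tip signal is "in gas". A minimal sketch of that reduction for one tip, with an entirely synthetic trace and illustrative voltage levels and threshold:

```python
import numpy as np

def local_gas_holdup(signal, threshold):
    """Time-averaged local gas holdup from one optical-probe tip:
    the fraction of samples during which the tip sits in the gas
    phase (signal above threshold)."""
    s = np.asarray(signal, dtype=float)
    return float(np.mean(s > threshold))

# Synthetic probe trace: 1.0 V in gas, 0.1 V in liquid, ~30 % gas.
rng = np.random.default_rng(0)
trace = np.where(rng.random(100_000) < 0.30, 1.0, 0.1)
print(local_gas_holdup(trace, threshold=0.5))  # ≈ 0.30
```

A real 4-tip probe additionally uses the arrival delays between tips to estimate bubble velocity and chord length; only the holdup fraction is sketched here.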
This research aims to understand complexity management and its impact on the use of dynamic capabilities in a sample of private colleges. Private colleges currently face many crises, changes, unrest, and high competitive pressures that are sometimes difficult or even impossible to predict. Recruiting dynamic capabilities is also one of the challenges facing senior management at private colleges seeking to help them survive. Thus, the research problem was: is there a clear insufficiency of interest in complexity management and in employing it to improve the dynamic capabilities of the colleges under study? A group of private colleges was selected as a …
Image classification is the process of finding common features in images from various classes and applying them to categorize and label those images. The main obstacles in image classification are the abundance of images, the high complexity of the data, and the shortage of labeled data. The cornerstone of image classification is evaluating the convolutional features retrieved from deep learning models and training machine learning classifiers on them. This study proposes a new approach of "hybrid learning" that combines deep learning with machine learning for image classification, based on convolutional feature extraction using the VGG-16 deep learning model and seven class …
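The hybrid-learning pipeline described above has two stages: a frozen deep network extracts features, and a classical classifier is trained on them. A minimal numpy sketch of that structure, where a fixed random projection with ReLU stands in for frozen VGG-16 convolutional features and a ridge classifier on one-hot labels stands in for the classical learners; the data, dimensions, and names are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen deep feature extractor (the study uses VGG-16
# convolutional features); here a fixed random projection + ReLU.
W = rng.normal(size=(64, 32))
def extract_features(X):
    return np.maximum(X @ W, 0.0)

# Toy 3-class data: well-separated class means in 64-D "pixel" space.
means = rng.normal(scale=3.0, size=(3, 64))
y = rng.integers(0, 3, size=300)
X = means[y] + rng.normal(size=(300, 64))

# Stage 2 ("machine learning classifier"): ridge regression on
# one-hot labels, predicting the class with the largest score.
F = extract_features(X)                       # (300, 32) deep features
Y = np.eye(3)[y]                              # one-hot targets
coef = np.linalg.solve(F.T @ F + 1e-2 * np.eye(32), F.T @ Y)
pred = np.argmax(F @ coef, axis=1)
print(f"training accuracy: {np.mean(pred == y):.2f}")
```

The key design point is that only the second stage is trained, which is why this scheme suits the small labeled datasets the abstract highlights.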
This study aimed to investigate the role of Big Data in forecasting corporate bankruptcy, through a field analysis in the Saudi business environment to test that relationship. The study found that Big Data is a recently adopted variable in the business context with multiple accounting effects and benefits, among them forecasting and disclosing corporate financial failures and bankruptcies. Such reporting and disclosure rest on three main elements: the firms' internal control system, external auditing, and financial analysts' forecasts. The study recommends: since the greatest risk of Big Data is the slow adaptation of accountants and auditors to these technologies, wh…
Intelligent buildings can exhibit highly inefficient energy use caused by non-stationary building environments. In the presence of such dynamic excitation, with high levels of nonlinearity and a coupling effect between temperature and humidity, the HVAC system transitions between underdamped and overdamped indoor conditions. This promotes highly inefficient energy use and fluctuating indoor thermal comfort. To address these concerns, this study develops a novel framework based on deep clustering of Lagrangian trajectories for multi-task learning (DCLTML) and adds a pre-cooling coil in the air handling unit (AHU) to alleviate the coupling issue. The proposed DCLTML exhibits great overall control and is …