Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have too little data to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive background knowledge, and this annotation process is costly, time-consuming, and error-prone. Every DL framework typically needs a significant amount of labeled data to learn representations automatically. In general, a larger amount of data yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed to generate more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Abstract
The study aims to build a training program based on Connectivism Theory to develop e-learning competencies for Islamic education teachers in the Governorate of Dhofar, and to identify its effectiveness. The study sample consisted of 30 randomly selected Islamic education teachers, on whom the training program was implemented. The study used the descriptive approach to determine the electronic competencies and build the training program, and the quasi-experimental approach to determine the effectiveness of the program. The study tools were a cognitive achievement test and an observation card, both applied before and after the program. The study found that the effectiveness of the training program...
The Iraqi economy faces a number of economic challenges that threaten Iraq's future and its economic, political, and social security, such as poverty, unemployment, inflation, dilapidated infrastructure, rising production costs, administrative and financial corruption, environmental pollution, water problems, and the deterioration of agricultural and industrial production. Beyond their seriousness, these challenges are intertwined, overlapping, and growing worse, while the state has not adopted corresponding strategies to develop appropriate solutions because of its preoccupation with security and terrorism, which requires the development of an...
This investigation presents an experimental and analytical study of the behavior of reinforced concrete deep beams before and after repair. The original beams were first loaded under two-point loading up to failure, then repaired with epoxy resin and tested again. Three of the test beams contain shear reinforcement and the other two have none. The main variable in these beams was the percentage of longitudinal steel reinforcement (0, 0.707, 1.061, and 1.414%). The main objective of this research is to investigate the possibility of restoring the full load-carrying capacity of reinforced concrete deep beams, with and without shear reinforcement, by using epoxy resin as the repair material. All be...
Delays and disruption are a common issue in both community and personal building programs. The problem exists throughout the world, but it is particularly prevalent in Iraq, where millions of dollars are squandered each time as a result. Delays and interruptions may have serious consequences not just for Iraq's construction plans, but also for the country's economic and social status. While numerous studies have investigated the factors driving delays and disruption in Iraqi construction projects, little consideration has been given to how project management tools and approaches have affected the occurrence of project delays and disruption. After analyzing the crucial reasons for delays and instability in...
The study showed that 28 plant families are present in Al-Razzaza Lake: Amaranthaceae, Amaryllidaceae, Aizoaceae, Apiaceae, Apocynaceae, Asteraceae, Brassicaceae, Boraginaceae, Capparaceae, Caryophyllaceae, Cistaceae, Colchicaceae, Convolvulaceae, Cynomoriaceae, Fabaceae, Frankeniaceae, Lamiaceae, Liliaceae, Malvaceae, Orobanchaceae, Plantaginaceae, Poaceae, Polygonaceae, Ranunculaceae, Solanaceae, Tamaricaceae, Typhaceae, and Zygophyllaceae. The Asteraceae family has the largest number of species found in abundance in this lake, followed by the Fabaceae family.
Metal-organic frameworks (MOFs) have emerged as revolutionary materials for developing advanced biosensors, especially for detecting reactive oxygen species (ROS) and hydrogen peroxide (H₂O₂) in biomedical applications. This comprehensive review explores the current state-of-the-art in MOF-based biosensors, covering fundamental principles, design strategies, performance features, and clinical uses. MOFs offer unique benefits, including exceptional porosity (up to 10,400 m²/g), tunable structures, biocompatibility, and natural enzyme-mimicking properties, making them ideal platforms for sensitive and selective detection of ROS and H₂O₂. Recent advances have shown significant improvements in detection capabilities, with limit...
Twitter data analysis is an emerging field of research that uses data collected from Twitter to address issues such as disaster response, sentiment analysis, and demographic studies. The success of data analysis relies on collecting accurate data that is representative of the studied group or phenomenon. Various Twitter analysis applications rely on collecting the locations of the users sending the tweets, but this information is not always available. There have been several attempts at estimating location-based aspects of a tweet. However, few studies have investigated data collection methods that focus on location. In this paper, we investigate the two methods for obtaining location-based dat...