Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have too little data to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive domain knowledge, and this annotation process is costly, time-consuming, and error-prone. A DL framework typically needs a significant amount of labeled data to learn representations automatically. In general, more data yields a better DL model, although performance is also application dependent. Data scarcity is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and for each application several alternatives are proposed to generate more data. The applications include Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Abstract
The grey system model GM(1,1) is a time-series prediction model and the basis of grey theory. This research presents four methods for estimating the parameters of the grey model GM(1,1): the accumulative method (ACC), the exponential method (EXP), the modified exponential method (Mod EXP), and the Particle Swarm Optimization method (PSO). These methods were compared using the mean square error (MSE) and the mean absolute percentage error (MAPE) as the basis of comparison. A simulation was used to identify the best of the four methods, which was then applied to real data. This data represents the consumption rate of two types of oils a he
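As a rough illustration of the kind of model and error measures involved, the sketch below fits GM(1,1) with the classical least-squares estimate of the development coefficient a and the grey input b, then scores the fit with MSE and MAPE. It is not any of the four estimators named in the abstract (ACC, EXP, Mod EXP, PSO), whose details are not given here. The function names and the synthetic series are assumptions made purely for illustration.

```python
import math


def fit_gm11(x0):
    """Classical least-squares estimate of the GM(1,1) parameters (a, b).

    Assumption: this is the textbook estimator, not one of the four
    methods compared in the abstract.
    """
    n = len(x0)
    if n < 4:
        raise ValueError("need at least 4 observations")
    # Accumulated generating operation (1-AGO): running sum of the series.
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])
    # Fit y = -a * z + b by solving the 2x2 normal equations directly.
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def predict_gm11(first_value, a, b, steps):
    """Return `steps` fitted values of the original series (requires a != 0)."""
    c = first_value - b / a
    # Time response of the accumulated series, then inverse AGO (differencing).
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(steps)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]


def mse(actual, fitted):
    """Mean square error."""
    return sum((x - f) ** 2 for x, f in zip(actual, fitted)) / len(actual)


def mape(actual, fitted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((x - f) / x) for x, f in zip(actual, fitted)) / len(actual)


# Synthetic, roughly exponential series (hypothetical, not the paper's oil data).
series = [100.0 * 1.1 ** k for k in range(6)]
a, b = fit_gm11(series)
fitted = predict_gm11(series[0], a, b, len(series))
print("a =", a, "b =", b)
print("MSE =", mse(series, fitted), "MAPE (%) =", mape(series, fitted))
```

Any alternative estimator could be compared on the same footing by swapping out `fit_gm11` while keeping `predict_gm11`, `mse`, and `mape` unchanged.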
Background: This study aimed to examine the efficacy of methylene blue (MB) and toluidine blue O (TBO) photodynamic therapy (PDT) as adjuncts to root surface debridement (RSD). Methods: This split-mouth, randomized, controlled clinical trial included eighteen patients, and a total of 332 sites (control = 102, MB = 124 and TBO = 106) were examined. Two sessions of PDT were completed at baseline and two weeks after RSD. Clinical parameters of bleeding on probing (BOP), plaque index (PI), probing pocket depth (PPD), and clinical attachment level (CAL) were measured pre- and post-treatment. Results: PPD and BOP reductions in sites treated by RSD with adjunctive photosensitizers (MB and TBO) were significantly higher than in control site
Introduction/Aim. Seminal fluid analysis (SFA) plays a crucial role in helping infertility clinics diagnose the underlying cause of male infertility. The aim of the study was to investigate seminal fluid patterns of male partners in infertile couples with apparently fertile female partners. Materials and methods. A cross-sectional study was conducted between January 2019 and December 2022. Patients attended consultations for conception delayed by more than 12 months and had apparently fertile female partners. Results. Four hundred fifty-three patients were included in the study. The distribution of patients by age group showed that 277 patients (61%) were young, aged 21–30 years. Two hundred sixty-two (58%) patien
The capturing properties of the peristaltic flow of a chemically reacting couple stress fluid through an inclined asymmetric channel with variable viscosity and various boundaries are investigated. We address the impacts of variable viscosity, different wave forms, a porous medium, and heat and mass transfer on the peristaltic transport of a hydromagnetic couple stress liquid in an inclined asymmetric channel with different boundaries. Moreover, the fluid viscosity is assumed to vary as an exponential function of temperature. The effects of almost all flow parameters are studied analytically and computed. A rise in the temperature and concentration profiles is attributed to the heat and mass transfer Biot numbers. Notably, the Soret and Dufour number effect resul
Macrocheles glaber (Müller) is one of several mites that feed on the eggs and on newly hatched and
small larvae of the house fly, Musca domestica L. This mite was reared in the laboratory on frozen
house fly eggs at constant conditions of 28 ± 1 °C and 90% relative humidity, using sterilized
horse dung as a substrate. The predation rates of adult females and males on frozen eggs were 18 and 3
eggs/mite/day, respectively, and an adult female destroyed 185.6 frozen eggs over its
lifetime.
The mean development time of females from egg to adult was 2.67 days, female longevity
was 27.8 days, and mean daily egg production was 2.7 eggs, with a total egg productivity
of 72.1 eggs.
Celiac disease (CD) is an inflammatory small intestinal disorder that can lead to severe villous atrophy and malabsorption. Because measurement of α-amylase activity is the most widely used biochemical test for the diagnosis of pancreatic and non-pancreatic disease, serum α-amylase was examined in the present study to evaluate the usefulness of this enzyme in the diagnosis of celiac disease and its relationship with anti-gliadin IgA and IgG and serum glucose. Thirty-one patients with celiac disease were studied and compared with twenty-four healthy individuals. Significant elevations of α-amylase activity, glucose, and anti-gliadin IgA and IgG were observed in the sera of patients with celiac diseas
Thiols are among the most important compounds with an active hydrogen (substrates) and are widely used in the preparation of Mannich bases. A large number of Mannich bases have been prepared as biologically active compounds (pharmaceutical, pesticidal, bactericidal, fungicidal, and tuberculostatic) in order to correlate their structure and reactivity with their pharmacological activity. It has been reported that the reaction proceeds easily with primary and secondary amines together with formaldehyde. However, when we attempted the reaction with thiols as the substrate, formaldehyde, and succinimide in place of the amine, the reaction did not give the Mannich base; the product was instead the methylene-bis-sulfide. Mann
Background: Visfatin is a novel adipokine, mainly secreted by visceral adipose tissue, that has an important role in inflammation and the immune system. Creatine kinase (CK) is an enzyme involved in energy metabolism and is found in large amounts in the myocardium, brain, and skeletal tissues. This study was carried out to evaluate the periodontal health status of the study groups (chronic periodontitis and chronic periodontitis with coronary atherosclerosis) and the control group, to measure the salivary levels of visfatin and creatine kinase in these groups and compare them, and to determine the correlations of salivary visfatin and creatine kinase levels with the periodontal parameters in the three groups. Materials and Methods: e