Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have too little data, or data of inadequate quality, to train DL frameworks. Labeled data usually has to be produced by manual labeling, which typically involves human annotators with extensive background knowledge. This annotation process is costly, time-consuming, and error-prone. A DL framework is usually fed a significant amount of labeled data so that it can learn representations automatically. In general, a larger amount of data yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that address three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on the data acquisition needed prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. 
The survey ends with a list of applications that suffer from data scarcity; for each, several alternatives are proposed to generate more data. The applications include Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
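As a rough illustration of the oversampling idea behind the SMOTE family named in this abstract, the following minimal sketch generates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. It shows only the plain SMOTE interpolation step on raw feature vectors, not DeepSMOTE itself and not the survey authors' code; the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for this example.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        base = rng.choice(minority)
        # Find its k nearest minority neighbours (squared Euclidean distance),
        # excluding the base point itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate: new = base + u * (neighbour - base), with u in [0, 1).
        u = rng.random()
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class (assumed data, not from the survey).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like_oversample(minority_points, n_new=5)
```

Every synthetic point lies on a line segment between two existing minority samples, so the minority class grows without simply duplicating samples.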
Human health was, and still is, the most important problem and objective of almost all research. The present work aims to find out what causes the decline in the health of the Iraqi population: it studies uranium as a cause of cancer, as affected by the correlation between the age and gender of bladder cancer patients. The mean uranium concentration (Uc) decreased with increasing age for all age groups, independent of gender. However, there is a wide dispersion in mean Uc excretion between males and females, due to the effect of gender correlated with age: the female mean Uc is maximum at age 50-69 years (2.355 µg/L), and it is higher than the male mean Uc (2.022 µg/L) in this age group because of menopause, a
This paper presents a study of the magnetohydrodynamic (MHD) flow of an incompressible generalized Burgers’ fluid induced by an accelerating plate and flowing under the action of a pressure gradient, where the no-slip assumption between the wall and the fluid is no longer valid. The fractional calculus approach is introduced to establish the constitutive relationship of the generalized Burgers’ fluid. By using the discrete Laplace transform of the sequential fractional derivatives, closed-form solutions for the velocity and shear stress are obtained in terms of the Fox H-function for the following two problems: (i) flow due to a constant pressure gradient, and (ii) flow due to a sinusoidal pressure gradient. The solutions for
This paper presents experimental results regarding the behaviour of eight simply supported, partially prestressed concrete beams with internally unbonded tendons, focusing particularly on the effect of three different variables: concrete compressive strength,
Cryptosporidiosis endangers the lives of many people with immunodeficiency, especially HIV patients. Nitazoxanide is one of the main therapeutic drugs used to treat cryptosporidiosis. However, it is poorly soluble in water, which restricts its usefulness and efficacy in immunocompromised patients. Surfactants have an amphiphilic character, which gives them the ability to improve the water solubility of hydrophobic drugs. Our research concerns the synthesis of new cationic Gemini surfactants that can improve the solubility of nitazoxanide. To this end, we synthesized the cationic Gemini surfactants N1,N1,N3,N3-tetramethyl-N1,N3-bis(2-octadecanamidoethyl)propane-1,3-diaminium bromide (CGSPS18) and 2,2‘-(etha
charge-transfer complex formed by the interaction between p-aminodiphenylamine (PADPA) as electron donor and iodine as electron acceptor in ethanol at 25 °C, as evidenced by color change and absorption. The spectrum of the PADPA–iodine complex shows an absorption band at 586 nm. All the variables that affect the stability of the complex were studied, such as temperature, pH, time, and acceptor concentration. The method was linear within the concentration range 10–165 mg·L⁻¹ with a correlation coefficient of 0.9996, while the molar absorptivity and Sandell sensitivity were 4643.32 L·mol⁻¹·cm⁻¹ and 0.0943 μg·cm⁻², respectively. The adsorption of the PADPA–I2 complex was studied using adsorbent surfaces
In this research, fuzzy nonparametric methods based on several smoothing techniques were applied to real data from the Iraqi stock market, specifically data on the Baghdad Company for Soft Drinks for the period 1/1/2016-31/12/2016. A sample of 148 observations was obtained in order to construct a model of the relationship between the stock prices (low, high, modal) and the traded value. Comparing the goodness-of-fit (G.O.F.) criterion across the three techniques, we note that the lowest value of this criterion was obtained by the K-Nearest Neighbor technique with a Gaussian function.
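To make the K-Nearest Neighbor smoother with a Gaussian function more concrete, here is a minimal sketch of a k-NN kernel regression estimate for a single predictor. It is a simplified, non-fuzzy version that works on crisp values only (for instance, one price series at a time) and is not the authors' implementation. The function name `knn_gaussian_smooth`, the choice of bandwidth as the distance to the k-th neighbour, and the toy data are all assumptions.

```python
import math


def knn_gaussian_smooth(x_train, y_train, x0, k):
    """Estimate y at x0 as a Gaussian-weighted mean of the k nearest points."""
    # Keep the k training points closest to the query point x0.
    nearest = sorted(zip(x_train, y_train), key=lambda p: abs(p[0] - x0))[:k]
    # Assumption: the bandwidth is the distance to the k-th nearest neighbour;
    # fall back to 1.0 if that distance is zero.
    h = abs(nearest[-1][0] - x0) or 1.0
    # Gaussian kernel weights based on scaled distance to x0.
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Toy data (assumed, not the Baghdad soft-drinks series).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
traded_value = [2.0, 4.0, 6.0, 8.0, 10.0]
estimate = knn_gaussian_smooth(prices, traded_value, x0=3.0, k=3)
```

A goodness-of-fit comparison like the one described would evaluate such an estimate at every observation and summarize the residuals, then repeat for each competing smoother.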
Variable selection is an essential task in statistical modeling. Several studies have tried to develop and standardize the process of variable selection, but doing so is difficult. The first question a researcher needs to ask is which variables are the most significant for describing the response in a given dataset. In this paper, a new method for variable selection using Gibbs sampler techniques has been developed. First, the model is defined, and the posterior distributions for all the parameters are derived. The new variable selection method is tested using four simulation datasets. The new approach is compared with some existing techniques: Ordinary Least Squares (OLS), Least Absolute Shrinkage