Data scarcity is a major challenge when training deep learning (DL) models, which demand large amounts of data to achieve strong performance. Unfortunately, many applications have too little data to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive background knowledge, and this annotation process is costly, time-consuming, and error-prone. DL frameworks are typically fed a significant amount of labeled data so that they can learn representations automatically. In general, more data yields a better DL model, although performance also depends on the application. This issue is the main reason many applications forgo DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed to generate more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
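As a rough illustration of the oversampling idea behind the imbalanced-dataset solutions listed above, the sketch below interpolates between minority-class samples in the style of classic SMOTE. It is not DeepSMOTE (which works in a learned latent space) and is not taken from the survey; the function name `smote_like_oversample`, its parameters, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (classic SMOTE idea).

    Hypothetical helper for illustration only; not DeepSMOTE.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two features (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=6)
```

Each synthetic point lies on the segment between two real minority samples, so the new data stays inside the region the minority class already occupies.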
E-learning packages are content and instructional methods delivered on a computer (whether on the Internet or an intranet) and designed to build knowledge and skills related to individual or organizational goals. This definition addresses the what: training delivered in digital form; the how: content and instructional methods that help learners learn the content; and the why: improving organizational performance by building job-relevant knowledge and skills in workers.
This paper designs and implements a learning package for the Prolog programming language, built with Visual Basic .NET 2010 in conjunction with Microsoft Office Access 2007. The package also introduces several fac
Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large network traffic volumes and identifying the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thus reducing the dimensionality and complexity of analyzing the network traffic. Moreover, building the predictive model from only the most relevant features reduces the complexity of the model, which shortens the time needed to build the classifier and consequently improves detection performance. In this study, two different sets of select
Unconfined compressive strength (UCS) of rock is the most critical geomechanical property, widely used as an input parameter for designing fractures, analyzing wellbore stability, drilling programming, and carrying out various petroleum engineering projects. The UCS regulates rock deformation by measuring its strength and load-bearing capacity. Determining UCS in the laboratory is a time-consuming and costly process. The current study aims to develop empirical equations that predict UCS from well-log data, using regression analysis in JMP software, for the Khasib Formation in the Buzurgan oil fields in southeastern Iraq. The accuracy of the proposed equation was tested using the coefficient of determination (R²), the average absolute
This study aims to identify the role played by public relations in raising security awareness of the dangers of terrorism. The research is directed at the employees of the Directorate General of Public Relations and Media at the Ministry of Interior, on the basis that security institutions, primarily the Ministry of Interior, play an important role in security awareness, and that this Directorate is responsible for all subjects related to public security using public relations science. It aims to identify the functions, methods, and communication tools used by the Directorate to raise awareness of the dangers of terrorism. To achieve the research objectives, the researcher uses the sur
Chemical pollution is a very important issue that affects public health and the health of future generations. Consequently, it must be studied in order to find suitable models and descriptions that predict its behavior in the coming years. Chemical pollution data in Iraq are extensive and come from manifold sources and kinds, which qualifies them as Big Data that need to be studied using novel statistical methods. The research focuses on using a proposed nonparametric procedure (NP method) to develop an (OCMT) test procedure for estimating the parameters of a linear regression model with a large amount of data (Big Data) comprising many indicators associated with chemi
Survival analysis is a type of data analysis that describes the time until the occurrence of an event of interest, such as death or another event important in determining what will happen to the phenomenon studied. There may be more than one endpoint for the event, in which case the setting is called competing risks. The purpose of this research is to apply the dynamic approach to the analysis of discrete survival time in order to estimate the effect of covariates over time, as well as to model the nonlinear relationship between the covariates and the discrete hazard function through the use of the multinomial logistic model and the multivariate Cox model. For the purpose of conducting the estimation process for both the discrete
Half of worldwide oil production results from water flooding projects. The main concern with this process is mobility control of the injected fluid, because an unfavorable mobility ratio leads to a fingering effect. Adding polymer to the injection water increases the water viscosity, so the displacement is more stable and has a greater sweep efficiency.
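The effect of water viscosity on displacement stability can be sketched numerically, assuming the standard end-point mobility ratio M = (k_rw / μ_w) / (k_ro / μ_o), which the abstract itself does not state. The function name `mobility_ratio` and every numeric value below are hypothetical and chosen only for illustration.

```python
def mobility_ratio(krw, mu_w, kro, mu_o):
    """End-point mobility ratio M = (krw / mu_w) / (kro / mu_o).

    krw, kro: relative permeabilities to water and oil (dimensionless).
    mu_w, mu_o: water and oil viscosities in the same unit (e.g. cP).
    M > 1 is usually called unfavorable (prone to viscous fingering).
    """
    if mu_w <= 0 or mu_o <= 0 or kro <= 0 or krw < 0:
        raise ValueError("viscosities and kro must be positive, krw non-negative")
    return (krw / mu_w) / (kro / mu_o)


# Illustrative, made-up values: plain water versus polymer-thickened water.
water_only = mobility_ratio(krw=0.3, mu_w=1.0, kro=0.8, mu_o=10.0)
with_polymer = mobility_ratio(krw=0.3, mu_w=10.0, kro=0.8, mu_o=10.0)

print(f"M without polymer: {water_only:.3f}")
print(f"M with polymer:    {with_polymer:.3f}")
```

With these assumed inputs, raising the water viscosity tenfold lowers M by the same factor and moves it from above 1 to below 1, which is the sense in which polymer makes the displacement more stable.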
Polymer flooding has received more attention recently. Polymer has great potential in the Middle East region, especially in reservoirs with high temperature and salinity.
The main objective of this work is to show the effect of shear rate, salinity, temperature, and polymer concentration on polymer v
Polypyrrole (PPy) nanocomposites were prepared using chemical oxidation and were combined with manganese oxide (MnO2) nanoparticles. The PPy-MnO2 nanocomposite was synthesized by integrating PPy nanofibers with varying volume ratio percentages of MnO2 dopant (10, 30, and 50% vol. ratio). The structural features of the PPy and PPy-MnO2 nanocomposite were investigated using X-ray diffraction (XRD). Fourier transform infrared (FTIR) spectroscopy was used to demonstrate the molecular structures of the primary materials and the final product: PPy, MnO2, and PPy-MnO2 nanocomposites. Field Emission Scanning Electron Microscopy (FESEM) showed that the morphology of PPy consisted of a network of nanofibers. Increasing the volume ratios of ma