Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have too little data to train DL frameworks. Labeled data is usually produced by manual labeling, which typically requires human annotators with extensive background knowledge. This annotation process is costly, time-consuming, and error-prone. A DL framework generally needs a significant amount of labeled data to learn representations automatically. A larger amount of data generally yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that address three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on the data acquisition needed prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. 
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed to generate more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Background: Shiga toxin-producing Escherichia coli has been identified as a new foodborne pathogen that poses a significant health risk to humans. It can be found in raw cow milk and its derivatives. A small number of Escherichia coli strains that produce Shiga toxin are pathogenic. Aim of study: The study aimed to determine whether 50 milk samples contained virulence genes typical of enterohaemorrhagic E. coli and to evaluate the effects of Myrtus communis on these bacteria. Materials and Method: E. coli bacteria (n = 27) were isolated from milk samples, biochemically analyzed, and genetically screened for virulence genes using multiplex PCR. The hydro-alcoholic extraction of Myrtus communis leave
Allosteric inhibition of EGFR tyrosine kinase (TK) is currently among the most attractive approaches for designing and developing anti-cancer drugs to avoid chemoresistance exhibited by clinically approved ATP-competitive inhibitors. The current work aimed to synthesize new biphenyl-containing derivatives that were predicted to act as EGFR TK allosteric site inhibitors based on molecular docking studies.
A new series of 4'-hydroxybiphenyl-4-carboxylic acid derivatives, including hydrazine-1-carbothioamide (S3-S6) and 1,2,4-triazole (S7-S10) derivatives, were synthesized and characterized using IR, ¹H NMR, ¹³C NMR
The objective of the present study was to compare several methods for estimating the degree of heritability and calculating the number of genes using generation mean analysis of maize (
The aim of the present study is to provide adequate knowledge about the role of time management in facilitating the work requirements of employees in the administrative department at the Ministry of Higher Education and Scientific Research. The research studies four important dimensions (time planning, time organization, time direction, and time observation), in addition to five other dimensions (new procedures, clear procedures, short procedures, the available information, and the simplicity of the methods used). Questionnaire sheets consisting of 38 questions were distributed to 170 employees, and only 146 sheets were considered in the study. SPSS program was used
The research aims to form a clear theoretical philosophy and perceptions about strategic entrepreneurship through the relationship between high-involvement management practices, as the basis for creating that leadership, and high-performance work systems, as a support tool for achieving them, according to the proposals of Hitt et al. (2011). It attempts to generalize the theoretical philosophy and show how to apply it within the Iraqi environment. On this basis, the research problem was formulated to bridge the knowledge gap between the previous proposals and the possibility of their application, aiming to identify the practices of high-involvement management and the possibility of high-performance work systems and thei
The research aims to prepare preventive exercises in the boot camp style to enhance the efficiency of the ankle joint and reduce its injuries for young triple jump players, and to determine the effect of these exercises on improving the efficiency of the ankle joint. The researchers hypothesized statistically significant differences between the pre- and post-tests in the research variables. The experimental approach was adopted because it suits the research, and the research sample was chosen from young triple jump players. The preventive approach prepared by the researchers was applied to the sample, and it included preventive exercises in the boot camp style with and without tools. The researchers concluded that preventive exercises in a boot camp style have
TQM is one of the concepts that combine administrative and innovative methods. The aim of the research is to demonstrate the role of the dimensions of TQM in enhancing the satisfaction of taxpayers through a survey of a sample of 50 officials in the General Authority for Taxation. Data and information were collected, and the results were analyzed using the SPSS program to find the most important components and factors in the method of analysis.
The research problem is that the General Authority for Taxation does not apply modern approaches and practices in its administrative work. The results of some of the complications that accompany the tax accounting process, which af
Public relations is among the social sciences that rely on scientific methods to achieve new knowledge or resolve existing problems through its research, which is often applied and requires classification in terms of the analysis of its results. It also requires careful statistical processes, whether in constructing the research material or in analyzing and interpreting the results.
This research seeks to identify the relation between public relations and statistics, and the significance that a researcher or practitioner in the domain of public relations should assign to statistics as one of the important criteria in identifying the accuracy and object