Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have too little data to train DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically requires human annotators with extensive domain knowledge; this annotation process is costly, time-consuming, and error-prone. Every DL framework must be fed a significant amount of labeled data to learn representations automatically, and, in general, more data yields a better DL model, although the required amount is also application dependent. This issue is the main barrier that leads many applications to dismiss the use of DL, and having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques, then introduces the types of DL architectures. After that, it lists state-of-the-art solutions to the lack of training data, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, with alternatives proposed for generating more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors' knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
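As a quick illustration of one of the surveyed strategies, the sketch below fine-tunes an ImageNet-pretrained backbone on a small labelled dataset (Transfer Learning). It is a minimal sketch, not the survey's reference implementation: the backbone choice, number of classes, learning rate, and data loader are hypothetical, and it assumes PyTorch with torchvision >= 0.13 is installed.

```python
# Minimal transfer-learning sketch: reuse a pretrained feature extractor and
# train only a small task-specific head on the scarce labelled data.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # hypothetical small target task

# Load a backbone pretrained on ImageNet and freeze its feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a task-specific head.
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    """One training pass over a (hypothetical) small labelled DataLoader."""
    backbone.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(backbone(images), labels)
        loss.backward()
        optimizer.step()
```

Because only the new head is optimized, the model can be trained with far fewer labelled examples than training the full network from scratch would require.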
Bismuth oxide nanoparticles (Bi2O3 NPs) have a wide range of applications and fewer adverse effects than conventional radiosensitizers. In this work, Bi2O3 NPs (D1, D2) were successfully synthesized by the biosynthesis method using two different bismuth salts, bismuth sulfate Bi2(SO4)3 (D1) or bismuth nitrate pentahydrate Bi(NO3)3·5H2O (D2), with NaOH and Beta vulgaris extract. The Bi2O3 NP properties were characterized by different spectroscopic methods to determine their structure, nature of bonds, nanoparticle size, element phase and presence, crystallinity, and morphology. The existence of the Bi2O3 band was verified by FT-IR. The Bi2O3 NPs revealed an absorption peak in the UV-visible spectrum, with an energy gap Eg = 3.80 eV. The X-ray …
The study aims to identify the uses and impact of social networking applications and websites on stock markets and their role in defining the details of dealing with stock movement and trading. The study also aims to highlight the role of these networks in increasing confidence in stock markets and companies, as well as encouraging and motivating young people to invest in these markets. The study follows the descriptive analytical approach; its population consisted of all current and potential investors in the stock and financial markets in the United Arab Emirates. The study used a questionnaire that was distributed to a number of followers of social networking pages and websites that deal with trading …
In this work, γ-Al2O3 NPs were successfully biosynthesized, mediated by aluminum nitrate nonahydrate Al(NO3)3·9H2O, sodium hydroxide, and aqueous clove extract in alkaline media. The γ-Al2O3 NPs were characterized by different techniques, including Fourier-transform infrared spectroscopy (FT-IR), UV-Vis spectroscopy, X-ray diffraction (XRD), scanning electron microscopy (SEM), energy-dispersive X-ray spectroscopy (EDX), transmission electron microscopy (TEM), and atomic force microscopy (AFM). The results indicated the γ-Al2O3 NP size, nature of bonds, element phase, crystallinity, morphology, surface image, particle analysis with threshold detection, and topography parameters. The id…
This research aims to harness critical and innovative thinking approaches, along with innovative problem-solving tools, in pursuing continual quality improvement initiatives to achieve effective operational results in the water treatment plants of the Baghdad Water Authority. A case study was conducted at the Sadr City water treatment plant, which was chosen as the study sample because it facilitates describing and analyzing its current operational situation and collecting and analyzing its data, in order to identify the desired improvement opportunity. Several statistical methods and visual-thinking techniques were used to fulfill the research task.
In data mining, classification is a form of data analysis that can be used to extract models describing important data classes. Two well-known algorithms used in data mining classification are the Backpropagation Neural Network (BNN) and Naïve Bayesian (NB) classifiers. This paper investigates the performance of these two classification methods using the Car Evaluation dataset. Two models were built, one for each algorithm, and the results were compared. Our experimental results indicate that the BNN classifier yields higher accuracy than the NB classifier but is less efficient, because it is time-consuming and difficult to analyze due to its black-box nature.
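For context, a minimal sketch of this kind of comparison is shown below. It assumes scikit-learn and pandas are installed and that the UCI Car Evaluation data is available locally as "car.data" (the file path, split, and hyperparameters are hypothetical); an MLPClassifier and CategoricalNB stand in for the paper's BNN and NB implementations.

```python
# Compare a backpropagation network and a Naive Bayes classifier on the
# (categorical) Car Evaluation dataset.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
from sklearn.neural_network import MLPClassifier   # stands in for the BNN
from sklearn.naive_bayes import CategoricalNB      # stands in for NB
from sklearn.metrics import accuracy_score

cols = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
data = pd.read_csv("car.data", names=cols)          # hypothetical local path
X_raw, y = data.drop(columns="class"), data["class"]

X_train, X_test, y_train, y_test = train_test_split(
    X_raw, y, test_size=0.3, random_state=42, stratify=y)

# Naive Bayes over ordinal-encoded categorical attributes.
ord_enc = OrdinalEncoder().fit(X_train)
nb = CategoricalNB().fit(ord_enc.transform(X_train), y_train)
nb_acc = accuracy_score(y_test, nb.predict(ord_enc.transform(X_test)))

# Backpropagation network over one-hot-encoded attributes.
oh_enc = OneHotEncoder(handle_unknown="ignore").fit(X_train)
bnn = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=42)
bnn.fit(oh_enc.transform(X_train), y_train)
bnn_acc = accuracy_score(y_test, bnn.predict(oh_enc.transform(X_test)))

print(f"Naive Bayes accuracy:        {nb_acc:.3f}")
print(f"Backpropagation NN accuracy: {bnn_acc:.3f}")
```

The Naive Bayes model trains almost instantly and its per-attribute probabilities are directly inspectable, whereas the network takes longer to converge and its learned weights are harder to interpret, which mirrors the efficiency trade-off noted above.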
Big data analysis has important applications in many areas, such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarization of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms, such as …
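To illustrate the discretization step mentioned above, the sketch below picks the cut point of a single numeric attribute that minimizes the class entropy of the induced binary split, the core operation in entropy-based methods such as Fayyad-Irani MDLP. It is a generic, self-contained illustration with toy data; it is not the paper's multi-resolution summarization structure or its large-scale algorithm.

```python
# Entropy-based (information-gain) binary discretization of one attribute.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut_point(values, labels):
    """Return (cut point, weighted entropy) of the best binary split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no boundary between equal attribute values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        w_ent = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if w_ent < best[1]:
            best = ((pairs[i][0] + pairs[i - 1][0]) / 2, w_ent)
    return best

# Toy example: the cut near 2.95 cleanly separates the two classes.
vals = [1.0, 1.2, 2.8, 3.1, 3.3, 5.0]
labs = ["a", "a", "a", "b", "b", "b"]
print(best_cut_point(vals, labs))
```

In a summarization-based setting, the same entropy computation would be driven by class counts held in the summary structure rather than by scanning the raw records, which is what makes the approach scale to large data sets.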