Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate data to train DL frameworks. Usually, manual labeling is needed to provide labeled data, which typically involves human annotators with a vast background of knowledge. This annotation process is costly, time-consuming, and error-prone. Usually, every DL framework is fed by a significant amount of labeled data to automatically learn representations. Ultimately, a larger amount of data would generate a better DL model and its performance is also application dependent. This issue is the main barrier for many applications dismissing the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey on state-of-the-art techniques to deal with training DL models to overcome three challenges including small, imbalanced datasets, and lack of generalization. This survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to address the issue of lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). Then, these solutions were followed by some related tips about data acquisition needed prior to training purposes, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, several alternatives are proposed in order to generate more data in each application including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical system, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview on strategies to tackle data scarcity in DL.
To expedite the learning process, a group of algorithms known as parallel machine learning algorithmscan be executed simultaneously on several computers or processors. As data grows in both size andcomplexity, and as businesses seek efficient ways to mine that data for insights, algorithms like thesewill become increasingly crucial. Data parallelism, model parallelism, and hybrid techniques are justsome of the methods described in this article for speeding up machine learning algorithms. We alsocover the benefits and threats associated with parallel machine learning, such as data splitting,communication, and scalability. We compare how well various methods perform on a variety ofmachine learning tasks and datasets, and we talk abo
... Show MoreThe achievements of the art that we know today are questioned in motives that differ from what art knew before, including dramatic artistic transformations, which he called modern art.
In view of the enormity of such a topic, its ramifications and its complexity, it was necessary to confine its subject to the origin of the motives of the transformations of its first pioneers, and then to stand on what resulted from that of the data of vision in composition and drawing exclusively, and through exploration in that, we got to know the vitality of change from the art of its time.
And by examining the ruling contemporary philosophical concepts and their new standards and their epistemological role in contemporary life, since they includ
This paper presents a hybrid approach for solving null values problem; it hybridizes rough set theory with intelligent swarm algorithm. The proposed approach is a supervised learning model. A large set of complete data called learning data is used to find the decision rule sets that then have been used in solving the incomplete data problem. The intelligent swarm algorithm is used for feature selection which represents bees algorithm as heuristic search algorithm combined with rough set theory as evaluation function. Also another feature selection algorithm called ID3 is presented, it works as statistical algorithm instead of intelligent algorithm. A comparison between those two approaches is made in their performance for null values estima
... Show MoreThis paper describes the use of microcomputer as a laboratory instrument system. The system is focused on three weather variables measurement, are temperature, wind speed, and wind direction. This instrument is a type of data acquisition system; in this paper we deal with the design and implementation of data acquisition system based on personal computer (Pentium) using Industry Standard Architecture (ISA)bus. The design of this system involves mainly a hardware implementation, and the software programs that are used for testing, measuring and control. The system can be used to display the required information that can be transferred and processed from the external field to the system. A visual basic language with Microsoft foundation cl
... Show MoreA total of 533 specimens were collected in survey of Brachyceran species from different regions of Iraq during February to November 2014 .This study was reported 16 species belonging to 13 genera and 7 families, the results showed that Dicranosepsis Duda, 1926 (Family; Sepsidae) is recorded the genus for the first time in Iraq.
Sixty samples of commercially available contact lens solutions were collected from students at the Pharmacy College/Baghdad University. The types of lenses used varied from medical to cosmetic. They were cultured to diagnose any microbial contamination within the solutions. Both used and unused solutions were subject for culturing. Thirty six (60%) used samples showed bacterial growth, fungal growth was absent. Pseudomonas aeruginosa accounts for the highest number of isolates (25%) followed by E. coli (21%), Staphylococcus epidermidis (6.6%), Pseudomonas fluorescence (5%) and Proteus mirabilis (1.6%) respectively. Only one (1) unused (sealed) sample showed growth of P. fluorescence.
... Show MoreThis research aims to test the ability of glass waste powder to adsorb cadmium from aqueous solutions. The glass wastes were collected from the Glass Manufacturing Factory in Ramadi. The effect of concentration and reaction time on sorption was tested through a series of laboratory experiments. Four Cd concentrations (20, 40, 60, and 80) as each concentration was tested ten times for 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 min. Solid (glass wastes) to liquid was 2g to 30ml was fixed in each experiment where the total volume of the solution was 30ml. The pH, total dissolved salts and electrical conductivity were measured at 30ºC. The equilibrium concentration was determined at 25 minutes, thereafter it was noted that the sorption
... Show More