Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve strong performance. Unfortunately, many applications have too little or inadequate data to train DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically involves human annotators with extensive domain knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework is usually fed a significant amount of labeled data so that it can learn representations automatically. In general, more data produces a better DL model, although performance also depends on the application. This issue is the main barrier that keeps many applications from adopting DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to address the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by practical tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed for generating more data in each of them, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
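As a concrete illustration of the Transfer Learning (TL) strategy listed in the survey above, the sketch below freezes a pretrained torchvision ResNet-18 backbone and retrains only its classification head on a small labeled set. The backbone choice, class count, and training loop are assumptions made for illustration, not details taken from the survey.

```python
# Minimal transfer-learning sketch: reuse a pretrained backbone and retrain
# only the classification head on a small target dataset.
# Assumes torch and torchvision are installed; the target task is hypothetical.
import torch
import torch.nn as nn
from torchvision import models

num_target_classes = 5  # hypothetical small-data task
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor so the few labeled samples
# only have to fit the final layer.
for param in model.parameters():
    param.requires_grad = False

# Replace the ImageNet head with a new head sized for the target task.
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a (small) labeled batch."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone is the simplest variant; with slightly more data, unfreezing the last block and using a lower learning rate is a common alternative.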
The support vector machine (SVM) is a supervised learning model that can be used for classification or regression, depending on the task. SVM classifies data points by determining the best separating hyperplane between two or more groups. Working with enormous datasets, on the other hand, can lead to a variety of issues, including poor accuracy and long training times. In this research, SVM was extended by applying several kernel transformations: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized as an algorithm using the kernel trick. The proposed method was examined on three simulated datasets with different sample …
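The abstract above compares several kernel functions; the sketch below reproduces that idea with scikit-learn's SVC on a synthetic dataset. The dataset, hyperparameters, and the use of the sigmoid kernel as a stand-in for the multi-layer kernel are assumptions for illustration, not the paper's actual algorithm or data.

```python
# Compare SVM kernels on a synthetic classification problem (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 'sigmoid' stands in here for the multi-layer (MLP-like) kernel mentioned above.
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, C=1.0, gamma="scale").fit(X_tr, y_tr)
    print(f"{kernel:8s} test accuracy: {clf.score(X_te, y_te):.3f}")
```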
Infrared photoconductive detectors operating in the far-infrared region at room temperature were fabricated. The detectors were fabricated using three types of carbon nanotubes (CNTs): MWCNTs, COOH-MWCNTs, and short-MWCNTs. The carbon nanotube suspension was deposited by dip-coating and drop-casting techniques to prepare thin films of CNTs. These films were deposited on porous silicon (PSi) substrates of n-type Si. The I-V characteristics and figures of merit of the fabricated detectors were measured at forward bias voltages of 3 and 5 V, both in the dark and under illumination by IR radiation from a CO2 laser of 10.6 μm wavelength and 2.2 W power. The responsivity and figures of merit of the photoconductive detector …
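The abstract refers to responsivity and other figures of merit without defining them; the following are the standard textbook definitions, not formulas or values quoted from the paper.

```latex
% Standard photodetector figures of merit: responsivity and specific detectivity.
\[
  R_\lambda \;=\; \frac{I_{\mathrm{ph}}}{P_{\mathrm{in}}},
  \qquad
  D^{*} \;=\; \frac{\sqrt{A\,\Delta f}}{\mathrm{NEP}}
  \;=\; R_\lambda \,\frac{\sqrt{A\,\Delta f}}{I_{\mathrm{noise}}},
\]
% where $I_{\mathrm{ph}}$ is the photocurrent, $P_{\mathrm{in}}$ the incident
% optical power, $A$ the detector area, $\Delta f$ the measurement bandwidth,
% and $\mathrm{NEP}$ the noise-equivalent power.
```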
Background/Objectives: The purpose of this study was to classify Alzheimer’s disease (AD) patients versus Normal Control (NC) subjects using Magnetic Resonance Imaging (MRI). Methods/Statistical analysis: The performance evaluation is carried out on 346 MR images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. A Deep Belief Network (DBN) is used as the classifier. The network is trained using a sample training set, and the weights produced are then used to check the system’s recognition capability. Findings: This paper presents a novel automated classification system for AD determination. The suggested method offers good performance; the experiments carried out show that the …
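As a rough, DBN-flavoured illustration of the classification pipeline described above, the sketch below stacks a single scikit-learn BernoulliRBM feature layer with logistic regression on synthetic data. It is not the authors' network, and the ADNI images are not used; the feature dimensions and labels are invented for demonstration.

```python
# Simplified stand-in for a DBN classifier: one RBM feature layer + logistic head.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((346, 64))          # 346 samples to mirror the abstract; features are synthetic
y = rng.integers(0, 2, size=346)   # binary labels standing in for AD vs. NC

dbn_like = Pipeline([
    ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn_like.fit(X, y)
print("training accuracy:", dbn_like.score(X, y))
```

A full DBN would greedily pre-train several stacked RBMs before fine-tuning; the single-layer pipeline above only conveys the general idea.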
Challenges facing the transition of traditional cities to smart cities: studying the challenges faced in transforming a traditional area, such as the Al-Kadhimiya city center, into a smart-city model
This review paper examines the crucial impact of YouTube on learning English as a Foreign Language. Recently, the integration of digital platforms into language education has improved learners’ interaction and skill development. YouTube is regarded as one of the most prevalent platforms due to its accessibility, multimodal content, and capacity to simulate real-life communication. This study reviews thirty selected research articles from various cultural and institutional backgrounds to identify the pedagogical benefits and challenges associated with using YouTube in teaching English. Conventional methods of teaching English as a foreign language encounter difficulties in improving students’ engagement and …
The place in which a person lives, together with his geographical and social environment, has a great impact on shaping his personality, belief, and culture. Islam has therefore stressed the importance of the Muslim choosing an appropriate place to reside and dwell, one compatible with his religion and belief, so as to maintain contact with Islamic knowledge in a way that strengthens his faith. Arabization occurs when a person makes himself like the Bedouins by living their life and acquiring the morals of the inhabitants of the Badia, with its harshness, cruelty, ignorance, and lack of understanding of religion, far from the sources of Islamic knowledge. Blasphemy and polytheism, and …
A skip list data structure is essentially a simulation of a binary search tree. Skip list algorithms are simpler and faster and use less space. Conceptually, this data structure uses parallel sorted linked lists. Searching in a skip list is more involved than searching in a regular sorted linked list. Because a skip list is a two-dimensional data structure, it is implemented using a two-dimensional network of nodes with four pointers. The search, insert, and delete operations take up to O(log n) expected time. The skip list can be modified to implement the order-statistic operations RANK and SEARCH BY RANK while maintaining the same expected time. Keywords: skip list, parallel linked list, randomized algorithm, rank.
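Since the abstract describes skip-list search and insertion, a minimal textbook-style sketch is shown below. It uses one forward-pointer array per node rather than the four-pointer node layout mentioned above, so it illustrates only the general randomized-level idea, not the paper's implementation.

```python
# Minimal skip list: search and insert in O(log n) expected time.
import random

MAX_LEVEL = 16
P = 0.5  # probability of promoting a node to the next level

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # one forward pointer per level

class SkipList:
    def __init__(self):
        self.level = 0
        self.head = Node(None, MAX_LEVEL)

    def _random_level(self):
        lvl = 0
        while random.random() < P and lvl < MAX_LEVEL:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.head
        # Drop down level by level, moving right while the next key is smaller.
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [self.head] * (MAX_LEVEL + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node              # last node visited on level i
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl + 1):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

sl = SkipList()
for k in (3, 7, 1, 9):
    sl.insert(k)
print(sl.search(7), sl.search(4))  # True False
```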
Mixed-effects conditional logistic regression is evidently more effective in the study of qualitative differences in longitudinal pollution data as well as their implications for heterogeneous subgroups. This study argues that conditional logistic regression is a robust evaluation method for environmental studies, through the analysis of environmental pollution as a function of oil production and environmental factors. Consequently, it has been established theoretically that the primary objective of model selection in this research is to identify the candidate model that is optimal for the conditional design. The candidate model should achieve generalizability, goodness-of-fit, and parsimony, and establish an equilibrium between bias and variability.
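As an illustration of the conditional-logit idea (though not the mixed-effects extension or the pollution data analysed in the study above), the sketch below fits statsmodels' ConditionalLogit on synthetic grouped data; the covariates, group structure, and coefficients are invented for demonstration.

```python
# Conditional (fixed-effects) logistic regression on synthetic grouped data.
import numpy as np
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(0)
n_groups, per_group = 50, 10
groups = np.repeat(np.arange(n_groups), per_group)
x = rng.normal(size=(n_groups * per_group, 2))       # stand-ins for e.g. oil production and an environmental factor
group_effect = np.repeat(rng.normal(size=n_groups), per_group)
logit = 0.8 * x[:, 0] - 0.5 * x[:, 1] + group_effect
y = (rng.random(len(logit)) < 1 / (1 + np.exp(-logit))).astype(int)

# The conditional likelihood eliminates the per-group intercepts, so the
# covariate estimates are not distorted by group-level heterogeneity.
result = ConditionalLogit(y, x, groups=groups).fit()
print(result.summary())
```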