Data scarcity is a major challenge in training deep learning (DL) models, which demand large amounts of data to achieve strong performance. Unfortunately, many applications have too little data to train DL frameworks. Labeled data usually must be produced by manual annotation, which typically requires human annotators with extensive domain knowledge; this annotation process is costly, time-consuming, and error-prone. Every DL framework must be fed a significant amount of labeled data to learn representations automatically, and, in general, more data yields a better DL model, although performance is also application dependent. This issue is the main barrier preventing many applications from adopting DL, and having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are presented, including Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by practical tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and for each application several alternatives are proposed to generate more data, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems (MEMS), and Cybersecurity. To the best of the authors' knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
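To make the oversampling idea concrete, the following is a minimal, hypothetical sketch of the interpolation step that SMOTE-style methods (including the latent-space variant in DeepSMOTE) build on: synthetic minority samples are generated by interpolating between pairs of real minority samples. The function name and toy data are illustrative, not from the surveyed paper.

```python
import random

def smote_like_oversample(minority, n_new, seed=0):
    """Create synthetic minority-class samples by interpolating between
    randomly chosen pairs of real minority samples (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)   # pick two distinct real samples
        lam = rng.random()               # interpolation factor in [0, 1]
        synthetic.append(tuple(ai + lam * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

# Toy 2-D minority class: four synthetic points are generated inside
# the region spanned by the real samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
synthetic = smote_like_oversample(minority, n_new=4)
```

Because each synthetic point lies on a segment between two real samples, the new data stay inside the minority class's region of feature space rather than being drawn at random.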
Text categorization refers to the process of grouping texts or documents into classes or categories according to their content. The text categorization process consists of three phases: preprocessing, feature extraction, and classification. Compared with English, only a few studies have addressed categorizing and classifying the Arabic language. For a variety of applications, such as text classification and clustering, Arabic text representation is a difficult task because the Arabic language is noted for its richness, diversity, and complicated morphology. This paper presents a comprehensive analysis and comparison of research from the last five years based on the dataset, year, algorithms, and reported accuracy.
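The three-phase pipeline described above (preprocessing, feature extraction, classification) can be sketched with a deliberately tiny, stdlib-only example: whitespace tokenization, bag-of-words features, and a nearest-document classifier by cosine similarity. All names and the toy corpus are hypothetical; real systems would use proper Arabic morphological preprocessing and a trained classifier.

```python
from collections import Counter
import math

def preprocess(text):
    # Phase 1: normalize case and tokenize (real Arabic pipelines also
    # handle diacritics, stemming, and stop words)
    return text.lower().split()

def extract_features(tokens):
    # Phase 2: bag-of-words term frequencies
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(doc, labeled_docs):
    # Phase 3: assign the label of the most similar training document
    feats = extract_features(preprocess(doc))
    best = max(labeled_docs, key=lambda d: cosine(feats, d[0]))
    return best[1]

train = [(extract_features(preprocess("stock market prices rise")), "finance"),
         (extract_features(preprocess("team wins football match")), "sport")]
```

For example, `classify("market prices fall", train)` follows the shared terms "market" and "prices" to the finance document.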
Distributed Denial of Service (DDoS) attacks on Web-based services have grown in both number and sophistication with the rise of advanced wireless technology and modern computing paradigms. Detecting these attacks in the sea of communication packets is very important. Early DDoS attacks were mostly directed at the network and transport layers, but over the past few years attackers have shifted their strategies toward the application layer. Application-layer attacks can be more harmful and stealthier because attack traffic and normal traffic flows are difficult to tell apart. Distributed attacks are hard to defend against because they can exhaust real computing resources as well as network bandwidth.
Artificial intelligence techniques reach us in several forms, some of which are useful but can be exploited in ways that harm us. One of these forms is the deepfake. Deepfakes are used to completely modify video (or image) content to display something that was not originally in it. The danger of deepfake technology lies in its impact on society through the loss of confidence in everything that is published. Therefore, in this paper, we focus on deepfake detection technology from the view of two concepts: deep learning and forensic tools. The purpose of this survey is to give the reader a deeper overview of i) the environment of deepfake creation and detection, and ii) how deep learning and forensic tools have contributed to deepfake detection.
Thyroid disease is a common disease affecting millions worldwide. Early diagnosis and treatment of thyroid disease can help prevent more serious complications and improve long-term health outcomes. However, thyroid disease diagnosis can be challenging due to its variable symptoms and limited diagnostic tests. By processing enormous amounts of data and identifying trends that may not be immediately evident to human doctors, Machine Learning (ML) algorithms may be capable of increasing the accuracy with which thyroid disease is diagnosed. This study seeks to identify the most recent ML-based and data-driven developments and strategies for diagnosing thyroid disease while considering the challenges associated with imbalanced data in thyroid disease diagnosis.
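One standard remedy for the imbalanced-data problem mentioned above is inverse-frequency class weighting, so that a rare diagnosis (e.g. hypothyroidism) counts more per sample than the majority (healthy) class. The sketch below is illustrative only; the function name and toy label counts are assumptions, not taken from the study.

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights (the common 'balanced' heuristic):
    weight(c) = n_samples / (n_classes * count(c)), so rare classes
    receive larger weights and the classifier is not biased toward
    the majority class."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# Toy imbalanced dataset: 90 healthy vs 10 hypothyroid records.
labels = ["healthy"] * 90 + ["hypothyroid"] * 10
weights = class_weights(labels)
```

With these weights, each hypothyroid sample contributes nine times as much to the training loss as a healthy one, counteracting the 9:1 imbalance.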
Realizing and understanding semantic segmentation is a demanding task not only in computer vision but also in the earth sciences. Semantic segmentation decomposes compound scenes into individual elements: the most common objects in civil outdoor or indoor scenes must be classified and then enriched with the semantic meaning of each object, making it a method for automatically labeling and clustering point clouds. Classifying three-dimensional natural scenes requires a point cloud dataset as the input data representation, and working with 3D data raises many challenges, such as the small number, limited resolution, and limited accuracy of available 3D datasets.
The need to constantly and consistently improve the quality and quantity of the educational system is essential. E-learning has emerged from the rapid cycle of change and the expansion of new technologies. Advances in information technology have increased network bandwidth and data access speed while reducing data storage costs. In recent years, the implementation of cloud computing in educational settings has garnered the interest of major companies, leading to substantial investments in this area. Cloud computing improves engineering education by providing an environment that can be accessed from anywhere and by allowing on-demand access to educational resources. Cloud computing is a term used to describe the provision of hosting services.
In this paper, we investigate some of the most recent energy-efficient routing protocols for wireless body area networks. This technology has advanced in recent times: wireless sensors are implanted in the human body to sense and measure body parameters such as temperature, heartbeat, and glucose level. These tiny wireless sensors gather body data and send it over a wireless network to the base station. The measurements are examined by a doctor or physician, who suggests suitable treatment. All of this communication is carried out through routing protocols in a network environment, and routing protocols consume energy while sustaining continuous communication.
Vascular patterns have been recognized as a promising identification characteristic for biometric systems. Many studies have since investigated and proposed different techniques that exploit this feature for identification and verification purposes. Conventional biometric features such as the iris, fingerprints, and face recognition have been thoroughly investigated; however, over the past few years, finger vein patterns have been recognized as a reliable biometric feature. This study discusses the application of the vein biometric system. Although vein patterns are a very appealing research topic, there are many challenges in this field, and some improvements still need to be made.
Convolutional neural networks (CNNs) are among the most widely used neural networks in various applications, including deep learning. In recent years, the continuing extension of CNNs into increasingly complicated domains has made their training process more difficult, so researchers have adopted optimized hybrid algorithms to address this problem. In this work, a novel chaotic black hole algorithm-based approach was created for training CNNs, optimizing their performance by avoiding entrapment in local minima. The logistic chaotic map was used to initialize the population instead of the uniform distribution. The proposed training algorithm was developed on a specific benchmark problem for optical character recognition.
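The population-initialization step mentioned above can be sketched as follows: instead of drawing uniform random numbers, candidate solutions are filled with iterates of the logistic map x_{k+1} = r·x_k·(1 - x_k). The function name and the parameter choices (r = 4, x0 = 0.7) are illustrative assumptions, not the paper's exact settings.

```python
def logistic_map_population(pop_size, dim, x0=0.7, r=4.0):
    """Initialize a population of candidate solutions in [0, 1] using
    the logistic chaotic map x_{k+1} = r * x_k * (1 - x_k).
    For r = 4 and x0 in (0, 1), iterates stay in [0, 1] and behave
    chaotically, spreading candidates across the search space."""
    x = x0
    population = []
    for _ in range(pop_size):
        individual = []
        for _ in range(dim):
            x = r * x * (1.0 - x)  # next chaotic iterate
            individual.append(x)
        population.append(individual)
    return population

population = logistic_map_population(pop_size=5, dim=3)
```

In a full optimizer, each individual would then be rescaled from [0, 1] to the actual bounds of the CNN weights being tuned.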
Detection of early clinical keratoconus (KCN) is a challenging task, even for expert clinicians. In this study, we propose a deep learning (DL) model to address this challenge. We first used the Xception and InceptionResNetV2 DL architectures to extract features from three different corneal maps collected from 1371 eyes examined in an eye clinic in Egypt. We then fused the features extracted by Xception and InceptionResNetV2 to detect subclinical forms of KCN more accurately and robustly. We obtained an area under the receiver operating characteristic curve (AUC) of 0.99 and an accuracy range of 97–100% in distinguishing normal eyes from eyes with subclinical and established KCN. We further validated the model on an independent dataset.
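A common way to fuse features from two backbones, and a plausible reading of the step above, is to normalize each backbone's embedding and concatenate them into one descriptor. This is a hypothetical sketch of that pattern with toy vectors, not the paper's actual fusion layer.

```python
import math

def l2_normalize(v):
    """Scale a feature vector to unit L2 norm (zero vectors pass through)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def fuse_features(feats_a, feats_b):
    # Normalize each backbone's embedding so neither dominates the
    # fused representation, then concatenate (e.g. Xception features
    # followed by InceptionResNetV2 features).
    return l2_normalize(feats_a) + l2_normalize(feats_b)

fused = fuse_features([3.0, 4.0], [0.0, 5.0])
```

The fused vector would then be fed to a classification head trained to separate normal, subclinical, and established KCN.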