A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications

Ali H. Al-Timemy

doi:10.1186/s40537-023-00727-2

Details

Publication Date

Fri Apr 14 2023

Journal Name

Journal Of Big Data

Volume

10

Abstract

Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate data to train DL frameworks. Usually, manual labeling is needed to provide labeled data, which typically involves human annotators with a vast background of knowledge. This annotation process is costly, time-consuming, and error-prone. Usually, every DL framework is fed a significant amount of labeled data to automatically learn representations. Ultimately, a larger amount of data would generate a better DL model, although performance is also application dependent. This issue is the main barrier that leads many applications to dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that overcome three challenges: small datasets, imbalanced datasets, and lack of generalization. This survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to address the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips about data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity, and proposes several alternatives for generating more data in each application. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.

Publication Date

Wed May 10 2023

Journal Name

Diagnostics

A Deep Feature Fusion of Improved Suspected Keratoconus Detection with Deep Learning

Ali H.

Laith

Zahraa M.

Hazem

Nebras H.

Alexandru

Rossen M.

Hidenori

Yuantong

Siamak

Detection of early clinical keratoconus (KCN) is a challenging task, even for expert clinicians. In this study, we propose a deep learning (DL) model to address this challenge. We first used Xception and InceptionResNetV2 DL architectures to extract features from three different corneal maps collected from 1371 eyes examined in an eye clinic in Egypt. We then fused the features from Xception and InceptionResNetV2 to detect subclinical forms of KCN more accurately and robustly. We obtained an area under the receiver operating characteristic curve (AUC) of 0.99 and an accuracy range of 97–100% to distinguish normal eyes from eyes with subclinical and established KCN. We further validated the model based on an independent dataset with
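
The abstract above does not say how the Xception and InceptionResNetV2 features are fused. As a minimal sketch only, the snippet below assumes the simplest option, concatenation of L2-normalized feature vectors. The function names `l2_normalize` and `fuse_features` are hypothetical, and the short lists stand in for real network outputs.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]


def fuse_features(feature_sets: Sequence[Sequence[float]]) -> List[float]:
    """Fuse several feature vectors by concatenation.

    Assumption: concatenation is used purely for illustration; the study's
    actual fusion scheme is not described in the abstract.
    """
    fused: List[float] = []
    for vec in feature_sets:
        # Normalizing first keeps one extractor from dominating by scale alone.
        fused.extend(l2_normalize(vec))
    return fused


# Stand-in vectors; real extractor outputs would be much longer.
features_a = [3.0, 4.0]
features_b = [0.0, 5.0, 0.0]
fused_vector = fuse_features([features_a, features_b])
```

The fused vector would then be passed to a downstream classifier; which classifier the study uses is not stated in the visible text.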

Publication Date

Thu Jun 01 2023

Journal Name

International Journal Of Electrical And Computer Engineering (ijece)

An optimized deep learning model for optical character recognition applications

Salih S.Q.

NUHA SAMI

Convolutional neural networks (CNNs) are among the most utilized neural networks in various applications, including deep learning. In recent years, the continuing extension of CNNs into increasingly complicated domains has made their training process more difficult. Thus, researchers have adopted optimized hybrid algorithms to address this problem. In this work, a novel chaotic black hole algorithm-based approach was created for training CNNs to optimize their performance by avoiding entrapment in local minima. The logistic chaotic map was used to initialize the population instead of the uniform distribution. The proposed training algorithm was developed based on a specific benchmark problem for optical character recog
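
To illustrate the initialization step mentioned above, here is a minimal sketch of seeding a population with the logistic map x_{n+1} = r * x_n * (1 - x_n) instead of uniform random draws. The function name, the control parameter r = 4.0, the starting value 0.7, and the search bounds are all assumptions for illustration; the abstract does not give the paper's settings.

```python
from typing import List


def logistic_map_population(
    pop_size: int,
    dim: int,
    lower: float,
    upper: float,
    r: float = 4.0,      # assumed control parameter (chaotic regime of the logistic map)
    start: float = 0.7,  # assumed start value; avoid 0, 0.25, 0.5, 0.75 and 1 when r = 4
) -> List[List[float]]:
    """Build pop_size candidate solutions of length dim from a logistic-map sequence."""
    x = start
    population: List[List[float]] = []
    for _ in range(pop_size):
        individual: List[float] = []
        for _ in range(dim):
            x = r * x * (1.0 - x)  # one logistic-map iteration, stays in [0, 1]
            individual.append(lower + x * (upper - lower))  # rescale to the search range
        population.append(individual)
    return population


# Example: 5 candidates, each with 3 parameters in [-1, 1].
population = logistic_map_population(pop_size=5, dim=3, lower=-1.0, upper=1.0)
```

In a black hole style optimizer, each row would be one candidate set of network parameters; the optimizer's update rules are outside this sketch.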

Publication Date

Tue Apr 30 2024

Journal Name

Iraqi Journal Of Science

Credit Card Fraud Detection Challenges and Solutions: A Review

Sumaya

Ibraheem

Sarab M.

Credit card fraud has become an increasing problem due to the growing reliance on electronic payment systems and technological advances that have improved fraud techniques. Numerous financial institutions are looking for the best ways to leverage technological advancements to provide better services to their end users, and researchers have used various protection methods to provide security and privacy for credit cards. Therefore, it is necessary to identify the challenges and the proposed solutions to address them. This review provides an overview of the most recent research on the detection of fraudulent credit card transactions to protect those transactions from tampering or improper use, which includes imbalanced classes, c

Publication Date

Mon Jan 01 2024

Journal Name

Bio Web Of Conferences

Forecasting Cryptocurrency Market Trends with Machine Learning and Deep Learning

Fadhil H.M.

Cryptocurrency has become an important participant in the financial market, attracting large investments and interest. In this vibrant setting, the proposed cryptocurrency price prediction tool provides direction to both enthusiasts and investors in a market shaped by the many complexities of digital currency. Employing feature selection and a trio of techniques (ARIMA, LSTM, and linear regression), the tool lets users analyze data with artificial intelligence to produce forecasts for the real-time crypto market. As users navigate the algorithms, they are offered a wide selection of high-quality cryptocurrencies to choose from. The

Publication Date

Tue Dec 05 2023

Journal Name

Baghdad Science Journal

Indoor/Outdoor Deep Learning Based Image Classification for Object Recognition Applications

Deep learning

GoogleNet

Image classification

Indoor/outdoor

Transfer learning.

Omar Abdullatif

Mohammed Jawad

Zenah Hadi

With the rapid development of smart devices, people's lives have become easier, especially for visually disabled or special-needs people. The new achievements in the fields of machine learning and deep learning let people identify and recognise the surrounding environment. In this study, the efficiency and high performance of deep learning architectures are used to build an image classification system for both indoor and outdoor environments. The proposed methodology starts with collecting two datasets (indoor and outdoor) from several separate datasets. In the second step, the collected dataset is split into training, validation, and test sets. The pre-trained GoogleNet and MobileNet-V2 models are trained using the indoor and outdoor se
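
The splitting step described above could look roughly like the following sketch. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions; the abstract does not state the ratios used, and this version does not stratify by class.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T],
    train_frac: float = 0.70,  # assumed ratio, not taken from the paper
    val_frac: float = 0.15,    # assumed ratio, not taken from the paper
    seed: int = 42,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle items once and cut them into training, validation, and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Example with hypothetical (file name, label) pairs.
samples = [(f"img_{i:03d}.jpg", "indoor" if i % 2 == 0 else "outdoor") for i in range(100)]
train_set, val_set, test_set = split_dataset(samples)
```

For a two-class problem such as indoor versus outdoor, splitting each class separately and merging the results would keep class proportions equal across the three sets.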

Publication Date

Fri Jan 01 2021

Journal Name

Signals And Communication Technology

Survey on Twitter Sentiment Analysis: Architecture, Classifications, and Challenges

Laith

Nada Khaleel

Mahmoud

Mohamed Abd

Amir H.

Publication Date

Fri Apr 28 2023

Journal Name

Surgical Neurology International

Neurosurgery theater-based learning: Etiquette and preparation tips for medical students

Mustafa

Jaafar

Mahmood F.

Aktham O.

Sama S.

Alkawthar M.

Hayder R.

Samer S.

Publication Date

Mon Nov 21 2022

Journal Name

Sensors

Deep Learning-Based Computer-Aided Diagnosis (CAD): Applications for Medical Image Datasets

deep learning

CNN

auto-encoder

ant colony optimization

COVID-19

brain tumor

Yezi Ali

Computer-aided diagnosis (CAD) has proved to be an effective and accurate method for diagnostic prediction over the years. This article focuses on the development of an automated CAD system with the intent to perform diagnosis as accurately as possible. Deep learning methods have been able to produce impressive results on medical image datasets. This study employs deep learning methods in conjunction with meta-heuristic algorithms and supervised machine-learning algorithms to perform an accurate diagnosis. Pre-trained convolutional neural networks (CNNs) or auto-encoders are used for feature extraction, whereas feature selection is performed using an ant colony optimization (ACO) algorithm. Ant colony optimization helps to search for the bes

Publication Date

Fri Jan 01 2021

Journal Name

Materials Today: Proceedings

Response surface methodology: A review on its applications and challenges in microbial cultures

S.J.M.

Khalid Jaber Kadhum

Publication Date

Fri Mar 18 2022

Journal Name

Aro-the Scientific Journal Of Koya University

Detecting Deepfakes with Deep Learning and Gabor Filters

Wildan Jameel

Suhad Malallah

Ayad Rodhan

The proliferation of editing programs based on artificial intelligence techniques has contributed to the emergence of deepfake technology. Deepfakes fabricate and falsify facts by making a person appear to perform actions or say words that they never did or said. Developing an algorithm for deepfake detection is therefore very important for discriminating real from fake media. Convolutional neural networks (CNNs) are among the most complex classifiers, but choosing the nature of the data fed to these networks is extremely important. For this reason, we capture fine texture details of input data frames using 16 Gabor filters in different directions and then feed them to a binary CNN classifier instead of using the red-green-blue
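
As a rough illustration of a bank of 16 Gabor filters at different orientations, the sketch below builds real-valued Gabor kernels from the standard formula (a Gaussian envelope multiplied by a cosine carrier). The kernel size, sigma, wavelength, aspect ratio, and the even spacing of angles over [0, pi) are assumptions; the abstract does not report the paper's filter parameters.

```python
import math
from typing import List

Kernel = List[List[float]]


def gabor_kernel(
    size: int,
    theta: float,
    sigma: float = 2.0,       # assumed width of the Gaussian envelope
    wavelength: float = 4.0,  # assumed wavelength of the cosine carrier
    gamma: float = 0.5,       # assumed spatial aspect ratio
    psi: float = 0.0,         # assumed phase offset
) -> Kernel:
    """Return a size x size real Gabor kernel oriented at angle theta (radians)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel: Kernel = []
    for y in range(-half, half + 1):
        row: List[float] = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(n_orientations: int = 16, size: int = 9) -> List[Kernel]:
    """Build one kernel per orientation, with angles evenly spaced over [0, pi)."""
    return [gabor_kernel(size, k * math.pi / n_orientations) for k in range(n_orientations)]


bank = gabor_bank()  # 16 kernels, each 9 x 9
```

Each kernel would be convolved with a grayscale frame, and the resulting response maps stacked as the classifier input in place of raw color channels; the convolution and the CNN itself are not shown here.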
