Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve strong performance. Unfortunately, many applications have too little or inadequate data to train DL frameworks. Manual labeling is usually needed to provide labeled data, and it typically involves human annotators with extensive domain knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework is usually fed a significant amount of labeled data so that it can learn representations automatically. In general, more data produces a better DL model, although performance also depends on the application. This issue is the main barrier that keeps many applications from adopting DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to address the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Networks (PINNs), and the Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by practical tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed for generating more data in each of them, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
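As a concrete illustration of the Transfer Learning (TL) strategy listed in the survey above, the sketch below freezes a pretrained torchvision ResNet-18 backbone and retrains only its classification head on a small labeled set. The backbone choice, class count, and training loop are assumptions made for illustration, not details taken from the survey.

```python
# Minimal transfer-learning sketch: reuse a pretrained backbone and retrain
# only the classification head on a small target dataset.
# Assumes torch and torchvision are installed; the target task is hypothetical.
import torch
import torch.nn as nn
from torchvision import models

num_target_classes = 5  # hypothetical small-data task
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor so the few labeled samples
# only have to fit the final layer.
for param in model.parameters():
    param.requires_grad = False

# Replace the ImageNet head with a new head sized for the target task.
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a (small) labeled batch."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone is the simplest variant; with slightly more data, unfreezing the last block and using a lower learning rate is a common alternative.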
The support vector machine (SVM) is a supervised learning model that can be used for classification or regression, depending on the task. SVM classifies data points by determining the best separating hyperplane between two or more groups. Working with enormous datasets, on the other hand, can lead to a variety of issues, including poor accuracy and long training times. In this research, SVM was extended by applying several kernel transformations: linear, polynomial, radial basis, and multi-layer kernels. The non-linear SVM classification model was illustrated and summarized as an algorithm using the kernel trick. The proposed method was examined on three simulated datasets with different sample …
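The abstract above compares several kernel functions; the sketch below reproduces that idea with scikit-learn's SVC on a synthetic dataset. The dataset, hyperparameters, and the use of the sigmoid kernel as a stand-in for the multi-layer kernel are assumptions for illustration, not the paper's actual algorithm or data.

```python
# Compare SVM kernels on a synthetic classification problem (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# 'sigmoid' stands in here for the multi-layer (MLP-like) kernel mentioned above.
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, C=1.0, gamma="scale").fit(X_tr, y_tr)
    print(f"{kernel:8s} test accuracy: {clf.score(X_te, y_te):.3f}")
```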
Infrared photoconductive detectors operating in the far-infrared region at room temperature were fabricated. The detectors were fabricated using three types of carbon nanotubes (CNTs): MWCNTs, COOH-MWCNTs, and short-MWCNTs. The carbon nanotube suspension was deposited by dip-coating and drop-casting techniques to prepare thin films of CNTs. These films were deposited on porous silicon (PSi) substrates of n-type Si. The I-V characteristics and figures of merit of the fabricated detectors were measured at forward bias voltages of 3 and 5 V, both in the dark and under illumination by IR radiation from a CO2 laser of 10.6 μm wavelength and 2.2 W power. The responsivity and figures of merit of the photoconductive detector …
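The abstract refers to responsivity and other figures of merit without defining them; the following are the standard textbook definitions, not formulas or values quoted from the paper.

```latex
% Standard photodetector figures of merit: responsivity and specific detectivity.
\[
  R_\lambda \;=\; \frac{I_{\mathrm{ph}}}{P_{\mathrm{in}}},
  \qquad
  D^{*} \;=\; \frac{\sqrt{A\,\Delta f}}{\mathrm{NEP}}
  \;=\; R_\lambda \,\frac{\sqrt{A\,\Delta f}}{I_{\mathrm{noise}}},
\]
% where $I_{\mathrm{ph}}$ is the photocurrent, $P_{\mathrm{in}}$ the incident
% optical power, $A$ the detector area, $\Delta f$ the measurement bandwidth,
% and $\mathrm{NEP}$ the noise-equivalent power.
```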
Background/Objectives: The purpose of this study was to classify Alzheimer’s disease (AD) patients versus Normal Control (NC) subjects using Magnetic Resonance Imaging (MRI). Methods/Statistical analysis: The performance evaluation is carried out on 346 MR images from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. A Deep Belief Network (DBN) is used as the classifier. The network is trained using a sample training set, and the weights produced are then used to check the system’s recognition capability. Findings: This paper presents a novel automated classification system for AD determination. The suggested method offers good performance; the experiments carried out show that the …
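As a rough, DBN-flavoured illustration of the classification pipeline described above, the sketch below stacks a single scikit-learn BernoulliRBM feature layer with logistic regression on synthetic data. It is not the authors' network, and the ADNI images are not used; the feature dimensions and labels are invented for demonstration.

```python
# Simplified stand-in for a DBN classifier: one RBM feature layer + logistic head.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((346, 64))          # 346 samples to mirror the abstract; features are synthetic
y = rng.integers(0, 2, size=346)   # binary labels standing in for AD vs. NC

dbn_like = Pipeline([
    ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn_like.fit(X, y)
print("training accuracy:", dbn_like.score(X, y))
```

A full DBN would greedily pre-train several stacked RBMs before fine-tuning; the single-layer pipeline above only conveys the general idea.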
Challenges facing the transition of traditional cities to smart cities: studying the challenges faced in transforming a traditional area, such as the Al-Kadhimiya city center, into a smart-city model
This review paper examines the crucial impact of YouTube on learning English as a Foreign Language. Recently, the integration of digital platforms into language education has improved learners’ interaction and skill development. YouTube is regarded as one of the most prevalent platforms due to its accessibility, multimodal content, and capacity to simulate real-life communication. This study reviews thirty selected research articles from various cultural and institutional backgrounds to identify the pedagogical benefits and challenges associated with using YouTube in teaching English. Conventional methods of teaching English as a foreign language encounter difficulties in improving students’ engagement and …
The place in which a person lives, together with his geographical and social environment, has a great impact on shaping his personality, belief, and culture. Islam has therefore stressed the importance of the Muslim choosing an appropriate place to reside and dwell, one compatible with his religion and belief, so as to maintain contact with Islamic knowledge in a way that strengthens his faith. Arabization occurs when a person makes himself like the Bedouins by living their life and acquiring the morals of the inhabitants of the Badia, with its harshness, cruelty, ignorance, and lack of understanding of religion, far from the sources of Islamic knowledge. Blasphemy and polytheism, and …
A skip list data structure is essentially a simulation of a binary search tree. Skip list algorithms are simpler and faster and use less space. Conceptually, this data structure uses parallel sorted linked lists. Searching in a skip list is more involved than searching in a regular sorted linked list. Because a skip list is a two-dimensional data structure, it is implemented using a two-dimensional network of nodes with four pointers. The search, insert, and delete operations take up to O(log n) expected time. The skip list can be modified to implement the order-statistic operations RANK and SEARCH BY RANK while maintaining the same expected time. Keywords: skip list, parallel linked list, randomized algorithm, rank.
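Since the abstract describes skip-list search and insertion, a minimal textbook-style sketch is shown below. It uses one forward-pointer array per node rather than the four-pointer node layout mentioned above, so it illustrates only the general randomized-level idea, not the paper's implementation.

```python
# Minimal skip list: search and insert in O(log n) expected time.
import random

MAX_LEVEL = 16
P = 0.5  # probability of promoting a node to the next level

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # one forward pointer per level

class SkipList:
    def __init__(self):
        self.level = 0
        self.head = Node(None, MAX_LEVEL)

    def _random_level(self):
        lvl = 0
        while random.random() < P and lvl < MAX_LEVEL:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.head
        # Drop down level by level, moving right while the next key is smaller.
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [self.head] * (MAX_LEVEL + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node              # last node visited on level i
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl + 1):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

sl = SkipList()
for k in (3, 7, 1, 9):
    sl.insert(k)
print(sl.search(7), sl.search(4))  # True False
```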
Mixed-effects conditional logistic regression is evidently more effective in the study of qualitative differences in longitudinal pollution data as well as their implications for heterogeneous subgroups. This study argues that conditional logistic regression is a robust evaluation method for environmental studies, through the analysis of environmental pollution as a function of oil production and environmental factors. Consequently, it has been established theoretically that the primary objective of model selection in this research is to identify the candidate model that is optimal for the conditional design. The candidate model should achieve generalizability, goodness-of-fit, and parsimony, and establish an equilibrium between bias and variability.
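As an illustration of the conditional-logit idea (though not the mixed-effects extension or the pollution data analysed in the study above), the sketch below fits statsmodels' ConditionalLogit on synthetic grouped data; the covariates, group structure, and coefficients are invented for demonstration.

```python
# Conditional (fixed-effects) logistic regression on synthetic grouped data.
import numpy as np
from statsmodels.discrete.conditional_models import ConditionalLogit

rng = np.random.default_rng(0)
n_groups, per_group = 50, 10
groups = np.repeat(np.arange(n_groups), per_group)
x = rng.normal(size=(n_groups * per_group, 2))       # stand-ins for e.g. oil production and an environmental factor
group_effect = np.repeat(rng.normal(size=n_groups), per_group)
logit = 0.8 * x[:, 0] - 0.5 * x[:, 1] + group_effect
y = (rng.random(len(logit)) < 1 / (1 + np.exp(-logit))).astype(int)

# The conditional likelihood eliminates the per-group intercepts, so the
# covariate estimates are not distorted by group-level heterogeneity.
result = ConditionalLogit(y, x, groups=groups).fit()
print(result.summary())
```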