A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications
Abstract: Data scarcity is a major challenge when training deep learning (DL) models, which demand large amounts of data to achieve exceptional performance. Unfortunately, many applications have too little data, or inadequate data, to train DL frameworks. Labeled data usually has to be produced manually by human annotators with extensive background knowledge, a process that is costly, time-consuming, and error-prone. DL frameworks are typically fed a significant amount of labeled data so that they can learn representations automatically. In general, more data yields a better DL model, although performance is also application dependent. Lack of data is the main barrier that leads many applications to dismiss the use of DL, and having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models that address three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by tips on data acquisition prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset. The survey ends with a list of applications that suffer from data scarcity and proposes several alternatives for generating more data in each: Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Scopus Clarivate Crossref
Publication Date
Fri Apr 26 2019
Journal Name
Journal Of Contemporary Medical Sciences
Breast Cancer Decisive Parameters for Iraqi Women via Data Mining Techniques

Objective: This research investigates real breast cancer data for Iraqi women, acquired manually from several Iraqi hospitals for the early detection of breast cancer. Data mining techniques are used to discover hidden knowledge, unexpected patterns, and new rules from the dataset, which involves a large number of attributes. Methods: Data mining techniques manipulate the redundant or simply irrelevant attributes to discover interesting patterns. The dataset is processed via the Weka (Waikato Environment for Knowledge Analysis) platform. The OneR technique is used as a machine learning classifier to evaluate the worth of each attribute according to the class value. Results: The evaluation is performed using ...
Crossref (2)
Crossref
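The OneR technique named in the breast cancer abstract above can be illustrated with a minimal sketch. It is not the authors' Weka workflow: the function name `one_r` and the toy attribute values are hypothetical, and the data carries no clinical meaning. For each attribute, OneR builds a rule that maps every attribute value to its majority class, then keeps the attribute whose rule makes the fewest training errors.

```python
from collections import Counter, defaultdict


def one_r(rows, labels):
    """Return (best_attribute_index, rule, errors) for a OneR classifier.

    rows   -- list of tuples of categorical attribute values
    labels -- list of class labels, same length as rows
    """
    best = None
    for a in range(len(rows[0])):
        # Count class labels for every value of attribute a.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[a]][label] += 1
        # Rule: each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this single-attribute rule.
        errors = sum(rule[row[a]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best


# Hypothetical toy data: two categorical attributes, two classes.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("b", "x"), ("a", "y")]
labels = ["pos", "pos", "neg", "neg", "neg", "pos"]

attr, rule, errors = one_r(rows, labels)
```

Here attribute 0 separates the classes perfectly, so it is selected. Ranking attributes by the error of their one-attribute rule is one way to judge how informative each attribute is with respect to the class.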
Publication Date
Tue Nov 01 2016
Journal Name
Iosr Journal Of Computer Engineering
Implementation of new Secure Mechanism for Data Deduplication in Hybrid Cloud

Cloud computing provides a huge amount of storage space, but as the number of users and the size of their data grow, the cloud storage environment faces serious problems such as saving storage space, managing this large volume of data, and ensuring the security and privacy of data. One important method for saving space in cloud storage is data deduplication, a compression technique that keeps only one copy of the data and eliminates the extra copies. To offer security and privacy for sensitive data while supporting deduplication, this work identifies attacks that exploit hybrid cloud deduplication, allowing an attacker to gain access to the files of other users based on very small hash signatures of ...
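The basic deduplication idea in the abstract above (keep one physical copy, drop the extras) can be sketched as a content-addressed store. This is an illustration only, not the secure mechanism the paper proposes: the class `DedupStore`, its methods, and the choice of a full SHA-256 digest as the content key are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (one physical copy per unique content)
        self.files = {}  # (user, filename) -> digest (logical references)

    def put(self, user, filename, data):
        """Store data for a user; return True if a new physical copy was written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.files[(user, filename)] = digest
        return is_new

    def get(self, user, filename):
        """Return the content referenced by this user's file name."""
        return self.blobs[self.files[(user, filename)]]


store = DedupStore()
first = store.put("alice", "report.pdf", b"quarterly numbers")   # new content
second = store.put("bob", "copy.pdf", b"quarterly numbers")      # duplicate content
third = store.put("bob", "notes.txt", b"something else")         # new content
```

After these three uploads the store holds two physical blobs for three logical files. The sketch performs no ownership check: anyone who can present a matching digest is linked to the stored content, which is the general kind of weakness a secure deduplication scheme has to address.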
Publication Date
Tue Oct 01 2019
Journal Name
Journal Of Engineering
Characterization Performance of Monocrystalline Silicon Photovoltaic Module Using Experimentally Measured Data

Solar photovoltaic (PV) systems have emerged as one of the most promising technologies for generating clean energy. In this work, the performance of a monocrystalline silicon photovoltaic module is studied by observing the effect of the necessary parameters: solar irradiation and ambient temperature. The single-diode model with series resistors is selected to characterize the current-voltage (I-V) and power-voltage (P-V) curves by determining the values of five parameters. This model shows high accuracy in modeling the solar PV module under various weather conditions. The model is simulated using MATLAB/Simulink software. The performance of the selected solar PV module is tested experimentally for differ ...
Crossref
Publication Date
Fri Dec 15 2023
Journal Name
Al-academy
Aesthetics Contents of Data Visualization as an Input to its humanization

The aesthetic content of data visualization is one of the contemporary areas through which data scientists and designers have been able to link data to humans. Even after successful attempts to model data visualization, it was not clear how aesthetic content was chosen as an input to humanize these models. The current research therefore uses the analytical descriptive approach to identify the aesthetic contents in data visualization, which the researchers interpret through pragmatic philosophy and Kantian philosophy, and analyzes a sample of data visualization models to reveal their aesthetic entry points and explain how to humanize them. The two researchers reached seve ...
Crossref
Publication Date
Wed Jun 01 2022
Journal Name
Bulletin Of Electrical Engineering And Informatics
Proposed model for data protection in information systems of government institutions

Information systems and data exchange between government institutions are growing rapidly around the world, and with them, the threats to information within government departments. In recent years, research into the development and construction of secure information systems in government institutions seems to be very effective. Based on information system principles, this study proposes a model for providing and evaluating security for all of the departments of government institutions. The requirements of any information system begin with the organization's surroundings and objectives. Most prior techniques did not take into account the organizational component on which the information system runs, despite the relevance of ...
Scopus (2)
Crossref (1)
Scopus Crossref
Publication Date
Tue Aug 01 2023
Journal Name
Biomedical Signal Processing And Control
Decoding transient sEMG data for intent motion recognition in transhumeral amputees
Scopus (28)
Crossref (27)
Scopus Clarivate Crossref
Publication Date
Fri Sep 01 2023
Journal Name
Migration Letters
Organizational Machiavellianism and Its Impact on Employees’ Passion: A Field Study on a Sample of Electronic Payment Companies in Iraq
Publication Date
Sat Aug 09 2025
Journal Name
Scientific Reports
Machine learning models for predicting morphological traits and optimizing genotype and planting date in roselle (Hibiscus Sabdariffa L.)

Accurate prediction and optimization of morphological traits in Roselle are essential for enhancing crop productivity and adaptability to diverse environments. In the present study, a machine learning framework was developed using Random Forest (RF) and Multi-layer Perceptron (MLP) algorithms to model and predict key morphological traits (branch number, growth period, boll number, and seed number per plant) based on genotype and planting date. The dataset was generated from a field experiment involving ten Roselle genotypes and five planting dates. Both RF and MLP exhibited robust predictive capabilities; however, RF (R² = 0.84) demonstrated superior performance compared to MLP (R² = 0.80), underscoring its efficacy in capturing the nonlinear genoty ...
Scopus Clarivate Crossref
Publication Date
Tue Oct 01 2013
Journal Name
Proceedings Of The International Astronomical Union
The infrared K-band identification of the DSO/G2 source from VLT and Keck data
Abstract: A fast moving infrared excess source (G2) which is widely interpreted as a core-less gas and dust cloud approaches Sagittarius A* (Sgr A*) on a presumably elliptical orbit. VLT K_s-band and Keck K′-band data result in clear continuum identifications and proper motions of this ∼19^m Dusty S-cluster Object (DSO). In 2002-2007 it is confused with the star S63, but free of confusion again since 2007. Its near-infrared (NIR) colors and a comparison to other sources in the field speak in favor of the DSO being an IR excess star with photospheric continuum emission at 2 microns than a ...
Scopus (3)
Crossref (1)
Scopus Clarivate Crossref
Publication Date
Mon Apr 03 2023
Journal Name
Journal Of Electronics,computer Networking And Applied Mathematics
Comparison of Some Estimator Methods of Regression Mixed Model for the Multilinearity Problem and High – Dimensional Data

In order to obtain a mixed model with high significance and accurate alertness, it is necessary to find a method that selects the most important variables to include in the model, especially when the data under study suffer from both multicollinearity and high dimensionality. The research aims to compare, using simulation, some methods for selecting the explanatory variables and estimating the parameters of the regression model, namely Bayesian Ridge Regression (unbiased) and the adaptive Lasso regression model. MSE was used to compare the methods.

Crossref