Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have too little data, or data of inadequate quality, to train DL frameworks. Labeled data is usually produced by manual labeling, which typically requires human annotators with extensive domain knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework typically needs a significant amount of labeled data to learn representations automatically. In general, more data yields a better DL model, although performance is also application dependent. This issue is the main reason many applications dismiss the use of DL. Having sufficient data is the first step toward any successful and trustworthy DL application. This paper presents a holistic survey of state-of-the-art techniques for training DL models under three challenges: small datasets, imbalanced datasets, and lack of generalization. The survey starts by listing the learning techniques. Next, the types of DL architectures are introduced. After that, state-of-the-art solutions to the lack of training data are listed, such as Transfer Learning (TL), Self-Supervised Learning (SSL), Generative Adversarial Networks (GANs), Model Architecture (MA), Physics-Informed Neural Network (PINN), and Deep Synthetic Minority Oversampling Technique (DeepSMOTE). These solutions are followed by related tips on the data acquisition needed prior to training, as well as recommendations for ensuring the trustworthiness of the training dataset.
The survey ends with a list of applications that suffer from data scarcity, and several alternatives are proposed for generating more data in each application, including Electromagnetic Imaging (EMI), Civil Structural Health Monitoring, Medical Imaging, Meteorology, Wireless Communications, Fluid Mechanics, Microelectromechanical Systems, and Cybersecurity. To the best of the authors’ knowledge, this is the first review that offers a comprehensive overview of strategies to tackle data scarcity in DL.
Today, large amounts of geospatial data are available on the web from services such as Google Maps (GM), OpenStreetMap (OSM), Flickr, Wikimapia, and others. All of these services are referred to as open source geospatial data. Geospatial data from different sources often varies in accuracy because of different data collection methods; therefore, data accuracy may not meet the user requirements of different organizations. This paper aims to develop a tool to assess the quality of GM data by comparing it with formal data such as spatial data from the Mayoralty of Baghdad (MB). The tool was developed in Visual Basic and validated on two different study areas in Baghdad, Iraq (Al-Karada and Al-Kadhumiyah). The positional accuracy was asses
Three hundred samples of vegetable washing water were collected from women aged 15-60 years in different areas of Baghdad governorate and its suburbs, including two rural areas (Jaddria in Baghdad university and Al-Wagif in Rashdia) and two urban areas (Mansoure and Escan). The samples were examined by the precipitation method and then by the staining method (Lugol's iodine stain). The overall infection rate with intestinal parasites was 36.3%, comprising 15.3% in urban areas and 57.3% in rural areas, and a significant difference was found between these groups. The results also showed an increased prevalence of parasitic infection in the 15-30 year age group. The results also showed that only 109 samples were infected with eight specie
Clinical keratoconus (KCN) detection is a challenging and time-consuming task. In the diagnosis process, ophthalmologists must review demographic and clinical ophthalmic examinations. The latter include slit-lamp examinations, corneal topographic maps, and Pentacam indices (PI). We propose an Ensemble of Deep Transfer Learning (EDTL) based on corneal topographic maps. We consider four pretrained networks, SqueezeNet (SqN), AlexNet (AN), ShuffleNet (SfN), and MobileNet-v2 (MN), and fine-tune them on a dataset of KCN and normal cases, each including four topographic maps. We also consider a PI classifier. Then, our EDTL method combines the output probabilities of each of the five classifiers to obtain a decision b
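One common way to combine the output probabilities of several classifiers is soft voting: average the per-class probabilities and pick the class with the highest mean. The sketch below assumes this averaging rule, which the abstract above does not confirm. The function names `soft_vote` and `predict`, the class order, and the probability values are all hypothetical and serve only as an illustration.

```python
from typing import List, Sequence

# Assumed class order, for illustration only.
CLASSES = ["KCN", "normal"]


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class probabilities across classifiers (soft voting)."""
    if not probabilities:
        raise ValueError("at least one classifier output is required")
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all classifiers must output the same number of classes")
    n_models = len(probabilities)
    return [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]


def predict(probabilities: Sequence[Sequence[float]]) -> str:
    """Return the label of the class with the highest mean probability."""
    mean = soft_vote(probabilities)
    best = max(range(len(mean)), key=mean.__getitem__)
    return CLASSES[best]


# Hypothetical outputs of five classifiers (four image networks and one
# PI classifier); each row is [P(KCN), P(normal)].
example_outputs = [
    [0.9, 0.1],
    [0.6, 0.4],
    [0.4, 0.6],
    [0.7, 0.3],
    [0.8, 0.2],
]

print(soft_vote(example_outputs))
print(predict(example_outputs))
```

Averaging is only one possible fusion rule; weighted averaging or majority voting over hard labels would fit the same interface.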
Over the last two decades, audio compression has become the topic of much research because of its impact on storage capacity and transmission requirements. The rapid development of the computer industry has increased the demand for high-quality audio data, and accordingly the development of audio compression technologies is of great importance. Compression falls into two categories: lossy and lossless. This paper reviews lossy audio compression techniques and summarizes the importance and uses of each method.
The hydrological process has a dynamic nature characterised by randomness and complex phenomena. The application of machine learning (ML) models in forecasting river flow has grown rapidly, owing to their capacity to simulate the complex phenomena associated with hydrological and environmental processes. Four different ML models were developed for forecasting river flow in a semiarid region of Iraq. The influence of data division on the modeling process was investigated. Three data division scenarios were inspected: 70%–30%, 80%–20%, and 90%–10%. Several statistical indicators were computed to verify the performance of the models. The results revealed the potential of the hybridized s
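The three data division scenarios can be sketched as a chronological split, in which the earliest part of the record is used for training and the remainder for testing. This is a minimal sketch under assumptions: the abstract does not say whether the division was chronological or random, and the function name `split_series` and the toy flow record are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_series(
    values: Sequence[float], train_fraction: float
) -> Tuple[List[float], List[float]]:
    """Split a time-ordered record into training and testing parts.

    The first `train_fraction` of the record is used for training and
    the rest for testing, so no future values leak into training.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(round(len(values) * train_fraction))
    return list(values[:cut]), list(values[cut:])


# Hypothetical river flow record of 100 time steps (placeholder values).
flow = [float(i) for i in range(100)]

# The three scenarios named in the abstract: 70%-30%, 80%-20%, 90%-10%.
for fraction in (0.70, 0.80, 0.90):
    train, test = split_series(flow, fraction)
    print(f"train fraction {fraction:.2f}: {len(train)} train, {len(test)} test")
```

A chronological split is a common choice for forecasting because shuffling a time series before splitting lets the model see values from the period it is later tested on.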
This study aims to identify the effect of a deep learning strategy on mathematics achievement and practical intelligence among secondary school students during the 2022/2023 academic year. An experimental research design with two groups (experimental and control) and a post-test was adopted. The research community is represented by the female students of the fifth scientific grade from the first Karkh Education Directorate. Sixty-one female students were purposively selected and divided into two groups: an experimental group of 30 students who were taught according to the proposed strategy, and a control group of 31 students who were taught according to the usual method. For the purpose of collecting data for the experimen
In this paper, simulation studies and applications of the New Weibull-Inverse Lomax (NWIL) distribution are presented. In the simulation studies, sample sizes of 30, 50, 100, 200, 300, and 500 were considered, with 1,000 replications for the experiment. NWIL is a fat-tailed distribution, and its higher moments are not easily derived except with some approximations. However, the estimates have high precision with low variance. Finally, the usefulness of the NWIL distribution was illustrated by fitting two data sets
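The structure of such a simulation study (several sample sizes, 1,000 replications each, then the spread of the estimates) can be sketched as follows. The abstract does not give the NWIL density or the estimators used, so this sketch swaps in a standard Weibull distribution from Python's standard library and estimates its mean; the shape and scale values, the seed, and the function name `simulate` are assumptions made for illustration only.

```python
import random
import statistics
from typing import Dict, Sequence, Tuple

SAMPLE_SIZES = (30, 50, 100, 200, 300, 500)  # sample sizes from the abstract
REPLICATIONS = 1000                          # replications from the abstract

# Stand-in distribution: a standard Weibull, NOT the NWIL distribution.
SCALE, SHAPE = 1.0, 1.5  # assumed parameter values


def simulate(
    sample_sizes: Sequence[int] = SAMPLE_SIZES,
    replications: int = REPLICATIONS,
    seed: int = 0,
) -> Dict[int, Tuple[float, float]]:
    """For each sample size, return (mean, variance) of the replicated estimates."""
    rng = random.Random(seed)
    results = {}
    for n in sample_sizes:
        estimates = []
        for _ in range(replications):
            sample = [rng.weibullvariate(SCALE, SHAPE) for _ in range(n)]
            estimates.append(statistics.fmean(sample))  # estimator: sample mean
        results[n] = (statistics.fmean(estimates), statistics.variance(estimates))
    return results


results = simulate()
for n, (mean_estimate, variance) in results.items():
    print(f"n={n:3d}  mean of estimates={mean_estimate:.4f}  variance={variance:.6f}")
```

With this stand-in, the variance of the estimates should shrink as the sample size grows, which is the kind of behaviour a study like this typically reports.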
The landscape of the historical center suffers from neglect, despite its importance and its broad potential to enhance the cultural value of the historical center. The landscape includes many heterogeneous components, human and non-human, material and immaterial, natural and manufactured, as well as different historical layers: ancient, modern, and contemporary. Because of the differences among these components and layers, it has become difficult for the designer to deal with them. Therefore, the research adopts a methodology based on actor-network theory, as it deals with such complex systems and is concerned with an advanced method for connecting the various components, considering the landscape as a ground that can include various elements and deal wi
A water crisis is a situation in which the potable, unpolluted water accessible in a region is less than that region's demand. Two converging trends cause water scarcity: the expanded use of irrigation and the loss of available freshwater supplies. Water scarcity can arise from two mechanisms: physical water scarcity, caused by a natural water supply that is insufficient to meet the country's demand, and economic water scarcity, caused by poor management of otherwise sufficient water resources. This research examines a data set of multispectral Landsat 8 satellite images acquired over Basrah city, located in southern Iraq and positioned between Kuwait and Iran on the Shatt al-Arab. Using ENVI 5.3 softw
Research Summary
In The Name of Allah Most Gracious Most Merciful
The word injustice and its derivatives appear in the Holy Qur’an in several places, approximately 154 times. This reflects the severity of its danger, and the most dangerous thing that our Islamic nation suffers from in our time is injustice in all its forms and types. We should therefore all honestly review ourselves and sincerely change in the right direction, uncover cases of injustice and explain their causes, work to treat them, rid the wrongdoers of their injustice, and help them to correct their condition. To reveal their grievances and explain their causes, and work to remedy them, and support them and mi