Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, estimating precise age remains difficult because of age ambiguity: utterances from individuals at adjacent ages are frequently indistinguishable. To address this, we propose a novel, end-to-end approach that uses Mozilla's Common Voice dataset and transforms raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset demonstrate the efficacy of our approach, with an age-estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
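As an illustration of the two loss functions named above, the following is a minimal NumPy sketch, not the authors' implementation. The function names, the focusing parameter `gamma`, and the Gaussian soft labels with width `sigma` are assumptions introduced here; the abstract does not say how the targets for the KL divergence loss are built. Soft labels that spread probability over neighbouring age classes are one common way to penalize a near miss less than a distant one.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: cross-entropy down-weighted by (1 - p_true)**gamma.

    gamma is a hypothetical default; gamma=0 reduces to plain cross-entropy.
    """
    p = softmax(logits)
    p_true = p[np.arange(len(targets)), targets]
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true + 1e-12)))

def soft_age_labels(targets, num_classes, sigma=1.0):
    """Assumed label smoothing: a Gaussian over class indices centred on the true class."""
    classes = np.arange(num_classes)
    centres = np.asarray(targets)[:, None]
    w = np.exp(-0.5 * ((classes[None, :] - centres) / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(logits, soft_targets):
    """Mean KL(soft_targets || softmax(logits)) over the batch."""
    p = softmax(logits)
    eps = 1e-12
    kl = np.sum(soft_targets * (np.log(soft_targets + eps) - np.log(p + eps)), axis=1)
    return float(np.mean(kl))
```

With these soft labels, a prediction concentrated on an adjacent age class yields a smaller KL loss than one concentrated on a distant class, which is the behaviour one would want under age ambiguity.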
The charge density distributions (CDD) and the elastic electron
scattering form factors F(q) of the ground state for some even mass
nuclei in the 2s-1d shell (20Ne, 24Mg, 28Si, and 32S) have
been calculated based on the use of occupation numbers of the states
and the single particle wave functions of the harmonic oscillator
potential with size parameters chosen to reproduce the observed root
mean square charge radii for all considered nuclei. It is found that
introducing two additional parameters, which
reflect the difference of the occupation numbers of the states from
the prediction of the simple shell model leads to a remarkable
agreement between the calculated an
Osteoarthritis (OA) is a disease of human joints, especially the knee joint, which bears a significant share of the body's weight. The disease leads to rupture and degeneration of parts of the cartilage in the knee joint, which causes severe pain. It can be diagnosed through X-ray imaging. Deep learning has become a popular solution to medical problems owing to its rapid progress in recent years. This research aims to design and build a classification system that reduces the burden on doctors and helps radiologists assess the severity of the pain, make an optimal diagnosis, and prescribe the correct treatment. Deep learning-based approaches, such as Convolutional Neural Networks (CNNs), have been used to detect knee OA usin
Spraying pesticides is one of the most common procedures used to control pests. However, excessive use of these chemicals adversely affects the surrounding environment, including the soil, plants, animals, and the operators themselves. Therefore, researchers have been encouraged to...
Clinical keratoconus (KCN) detection is a challenging and time-consuming task. In the diagnosis process, ophthalmologists must review demographic and clinical ophthalmic examinations. The latter include slit-lamp examination, corneal topographic maps, and Pentacam indices (PI). We propose an Ensemble of Deep Transfer Learning (EDTL) based on corneal topographic maps. We consider four pretrained networks, SqueezeNet (SqN), AlexNet (AN), ShuffleNet (SfN), and MobileNet-v2 (MN), and fine-tune them on a dataset of KCN and normal cases, each including four topographic maps. We also consider a PI classifier. Then, our EDTL method combines the output probabilities of each of the five classifiers to obtain a decision b
Extracting moving objects from a video sequence is one of the most important steps
in video-based analysis. Background subtraction is one of the most commonly used
moving object detection methods in video; the extracted object is then
fed to a higher-level process (e.g., object localization or object tracking).
The main requirement of a background subtraction method is to construct a
stationary background model and then to compare every incoming frame with it
in order to detect the moving object.
Relying on the assumption that the background appears with the highest
frequency, a background reconstruction algorithm is proposed
based on a pixel intensity classification (PIC) approach.
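
The idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PIC algorithm from the paper: the function names, the number of intensity classes `num_bins`, and the difference `threshold` are all hypothetical. For each pixel, intensities over time are grouped into classes, the most frequent class is taken as background, and its mean intensity becomes the background value. A new frame is then compared with that model.

```python
import numpy as np

def reconstruct_background(frames, num_bins=32):
    """Estimate a static background from grayscale frames of shape (T, H, W).

    Assumption: at each pixel, the most frequently occurring intensity class
    across time belongs to the background.
    """
    frames = np.asarray(frames, dtype=np.uint8)
    # Map each 0..255 intensity to one of num_bins intensity classes.
    bins = (frames.astype(np.int32) * num_bins) // 256
    # Count how often each class occurs at each pixel: shape (num_bins, H, W).
    counts = np.stack([(bins == b).sum(axis=0) for b in range(num_bins)], axis=0)
    dominant = counts.argmax(axis=0)
    # Average only the samples that fall into the dominant class.
    in_class = bins == dominant[None, :, :]
    total = (frames.astype(np.float64) * in_class).sum(axis=0)
    return np.rint(total / in_class.sum(axis=0)).astype(np.uint8)

def foreground_mask(frame, background, threshold=25):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold
```

Averaging within the dominant class, rather than taking the class centre, keeps the estimate close to the observed background intensity even with coarse classes.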
The aim of this research is to find the relation between self-protection and social ignorance among university students. To achieve the aims of the research, the researcher constructed two scales to measure self-protection and social ignorance. After establishing their validity, stability, and discriminative power, the researcher applied them to a sample of 200 male and female university students, who were selected randomly. The research found a positive relation between self-protection and social ignorance.
The researcher has recommended concentrating on the role of parents in raising their children to depend on themselves and making f
However, the effects of these ideas are still evident
Think of those who follow the footsteps of Muslim scholars and thinkers.
Intellectual source of Gnostic Gnar
And in dealing with issues of Islamic thought
Annalisa intersecting trends of thought, Minya who melts in Masarya and supported him and believes in Bo and extremist Vue,
Lea, trying to throw it with a few times, looks at Elya look
There are those who stand in the opposite position
There are those who stand in a selective compromise, but this school of thought remains
G - the features of Islamic thought that believes in the mind and Imoto in the defense of
An intellectual station and generalized bar
Creed. In his study, the researcher wi
This paper proposes a new approach, Clustering Ultrasound images using the Hybrid Filter (CUHF), to determine the gender of the fetus in the early stages. A possible advantage of CUHF is that it can achieve a better result when fuzzy c-means (FCM) returns incorrect clusters. The proposed approach is conducted in two steps. First, a preprocessing step decreases the noise present in ultrasound images by applying the following filters: Local Binary Pattern (LBP); median; median and discrete wavelet transform (DWT); median, DWT, and LBP; and median and Laplacian (ML). Second, FCM is applied to cluster the images resulting from the first step. Among these filters, median and Laplacian (ML) achieved better accuracy than the others. Our experimental evaluation on re
The main goal of this investigation is to dispose of waste materials by converting them into high-fineness powder and producing self-consolidating concrete (SCC) that costs less and is more eco-friendly because it uses less cement, while taking the fresh and strength properties into consideration. The reference mix was designed according to the European guide. Five waste materials (clay brick, ceramic, granite tiles, marble tiles, and thermostone blocks) were ground to a highly fine particle size distribution and then used as 5, 10, and 15% replacements of cement by weight. The improvement in strength properties is more significant when using clay bricks compared to other activated waste