BEYOND WORDS: HARNESSING SPEECH SOUND FOR SPEAKER AGE AND GENDER DETECTION USING 1D CNN ARCHITECTURE WITH SELF-ATTENTION MECHANISM

Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender offers a wide range of applications, spanning from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing precise age remains intricate due to age ambiguity. Specifically, utterances from individuals at adjacent ages are frequently indistinguishable. Addressing this, we propose a novel, end-to-end approach that deploys Mozilla's Common Voice dataset to transform raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then channeled into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, with an age-estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
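
The abstract names focal loss and KL divergence loss as ways to handle age ambiguity but does not give their exact formulation. The following is a minimal NumPy sketch of how the two losses are commonly defined, not the authors' implementation. The function names, the `gamma` and `sigma` parameters, and the Gaussian smoothing of age labels in `soft_age_labels` are assumptions introduced here for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def focal_loss(logits, target, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p_t)^gamma.

    Confidently classified examples contribute less, so training
    focuses on hard, ambiguous ones. gamma=0 gives plain cross-entropy.
    """
    p = softmax(logits)
    p_t = p[np.arange(len(target)), target]  # probability of the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + 1e-12)))

def soft_age_labels(target, num_classes, sigma=1.0):
    """Assumed label smoothing: a Gaussian centred on the true age class,
    so that neighbouring age classes receive some probability mass."""
    classes = np.arange(num_classes)
    w = np.exp(-0.5 * ((classes[None, :] - target[:, None]) / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence_loss(logits, soft_targets):
    """Mean KL(soft_targets || softmax(logits)) over the batch."""
    log_q = np.log(softmax(logits) + 1e-12)
    log_p = np.log(soft_targets + 1e-12)
    return float(np.mean(np.sum(soft_targets * (log_p - log_q), axis=1)))
```

With soft targets of this kind, predicting an adjacent age class is penalized less than predicting a distant one, which is one plausible way to reflect the ambiguity between neighbouring ages.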

Publication Date
Tue Aug 01 2023
Journal Name
International Journal Of Online And Biomedical Engineering (ijoe)
End-to-End Speaker Profiling Using 1D CNN Architectures and Filter Bank Initialization

The automatic estimation of speaker characteristics, such as height, age, and gender, has various applications in forensics, surveillance, customer service, and many human-robot interaction applications. These applications are often required to produce a response promptly. This work proposes a novel approach to speaker profiling by combining filter bank initializations, such as continuous wavelets and gammatone filter banks, with one-dimensional (1D) convolutional neural networks (CNN) and residual blocks. The proposed end-to-end model goes from the raw waveform to an estimated height, age, and gender of the speaker by learning speaker representation directly from the audio signal without relying on handcrafted and pre-computed acou

Publication Date
Fri Dec 29 2017
Journal Name
Ibn Al-haitham Journal For Pure And Applied Sciences
Speaker Verification Using Hybrid Scheme for Arabic Speech

In this work, a hybrid scheme for speaker verification in Arabic speech is presented. The scheme is hybrid in that it combines traditional digital signal processing with a neural network. A Kohonen neural network is used as the recognizer for speaker verification after spectral features are extracted from the acoustic signal by the Fast Fourier Transform (FFT) algorithm.

The system was implemented on a Pentium-compatible processor (1000 MHz) running MS-DOS 6.2.
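
The abstract says only that spectral features are extracted with the FFT. A minimal sketch of one common way to do this is shown below: the signal is split into overlapping frames, each frame is windowed, and the magnitude spectrum is taken. The function name, frame length, hop size, and Hann window are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fft_spectral_features(signal, frame_len=256, hop=128):
    """Return per-frame FFT magnitude spectra.

    Output shape is (n_frames, frame_len // 2 + 1). Frame length,
    hop size, and the Hann window are illustrative choices.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)  # tapers frame edges to reduce leakage
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal
    return np.abs(np.fft.rfft(frames, axis=1))
```

Each row of the result could then serve as an input vector to a recognizer such as a self-organizing map.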

 

Publication Date
Mon Feb 27 2023
Journal Name
Tem Journal
Predicting Age and Gender Using AlexNet

Due to the availability of technology stemming from in-depth research in this sector and the drawbacks of other identification methods, biometrics has drawn considerable attention and established itself as the most reliable alternative for recognition in recent years. Efforts are still being made to develop a user-friendly system that is up to par with security-system requirements and yields more reliable outcomes while safeguarding assets and ensuring privacy. Human age estimation and gender identification are both challenging endeavours. Biomarkers and methods for determining biological age and gender have been extensively researched, and each has advantages and disadvantages. Facial-image-based positioning is crucial for many application

Publication Date
Thu Feb 07 2019
Journal Name
Journal Of The College Of Education For Women
SPEECH RECOGNITION OF ARABIC WORDS USING ARTIFICIAL NEURAL NETWORKS

Speech recognition has been studied by many researchers using different
methods to achieve fast and accurate systems. Speech signal recognition is a typical
classification problem, which generally includes two main parts: feature extraction and
classification. In this paper, a new approach to the speech recognition task is proposed that
uses transformation techniques for feature extraction, namely the slantlet transform
(SLT) and the discrete wavelet transform (DWT) with Daubechies Db1 and Db4 wavelets. Furthermore, a
modified artificial neural network (ANN) with the dynamic time warping (DTW) algorithm is
developed to train a speech recognition system to be used for classification and recognition
purposes. T

Publication Date
Sat Jan 01 2022
Journal Name
Proceedings Of International Conference On Computing And Communication Networks
Speech Gender Recognition Using a Multilayer Feature Extraction Method

Publication Date
Sun Jan 01 2023
Journal Name
Ieee Access
Fuzzy-Based Ensemble Feature Selection for Automated Estimation of Speaker Height and Age Using Vocal Characteristics

Publication Date
Thu Oct 01 2020
Journal Name
Ieee Transactions On Artificial Intelligence
Recursive Multi-Signal Temporal Fusions With Attention Mechanism Improves EMG Feature Extraction

Publication Date
Sun Feb 10 2019
Journal Name
Iraqi National Journal Of Nursing Specialties
Self-Esteem and its Relationship with the Age, Gender and academic Achievement among the students of the south Iraq Colleges of Nursing

Abstract
Objectives: This study aims to (1) assess the self-esteem level and academic achievement of students at nursing colleges in southern Iraq; (2) determine the relationship between levels of self-esteem and the academic achievement of students in the first semester; and (3) identify differences in self-esteem by gender and across age groups.
Methodology: A purposive sample of 426 students was selected, and data were collected using a questionnaire consisting of: (I) sociodemographic characteristics for assessing some important aspects of the students, (II) Rosenberg's Self-Esteem Scale (RSES), and (III) the Iraq Grading Scale for assessing student achievement. Statistical analysis of the data was carried out with SPSS.
Results: study resu

Publication Date
Sat Jan 01 2022
Journal Name
Proceedings Of International Conference On Computing And Communication Networks
Speech Age Estimation Using a Ranking Convolutional Neural Network

Publication Date
Sun Jun 06 2010
Journal Name
Baghdad Science Journal
Using Neural Network with Speaker Applications

In Automatic Speech Recognition (ASR), the non-linear data projection provided by a one-hidden-layer Multilayer Perceptron (MLP) trained to recognize phonemes has been shown in previous experiments to provide feature enhancement that substantially increases ASR performance, especially in noise. Previous attempts to apply an analogous approach to speaker identification have not succeeded in improving performance, except by combining MLP-processed features with other features. We present test results for the TIMIT database which show that the advantage of MLP preprocessing for open-set speaker identification increases with the number of speakers used to train the MLP, and that improved identification is obtained as this number increases beyond sixty.
