BEYOND WORDS: HARNESSING SPEECH SOUND FOR SPEAKER AGE AND GENDER DETECTION USING 1D CNN ARCHITECTURE WITH SELF-ATTENTION MECHANISM

Umniah Hameed jaid

doi:10.5455/jjcit.71-1703265368

Details

Publication Date

Mon Jan 01 2024

Journal Name

Jordanian Journal Of Computers And Information Technology

Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender offers a wide range of applications, spanning from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing precise age remains intricate due to age ambiguity. Specifically, utterances from individuals at adjacent ages are frequently indistinguishable. Addressing this, we propose a novel, end-to-end approach that deploys Mozilla's Common Voice dataset to transform raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then channeled into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results from the Common Voice dataset underscore the efficacy of our approach, showcasing an accuracy of 87% for male speakers, 91% for female speakers and 89% overall accuracy, and an accuracy of 99.1% for gender prediction.
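
The two loss functions named in the abstract can be illustrated with a small sketch. The abstract does not state the exact formulation used, so this assumes the standard focal loss and a KL-divergence loss against a Gaussian-smoothed age label. The function names, the four age classes, `gamma` and `sigma` are hypothetical choices made for illustration only, not the authors' settings.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Standard focal loss: cross-entropy scaled by (1 - p_t)^gamma,
    which down-weights examples the model already classifies well."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def soft_age_label(target, num_classes, sigma=1.0):
    """Assumed soft label: a Gaussian centred on the true age class,
    so adjacent age classes receive some probability mass."""
    weights = [math.exp(-((k - target) ** 2) / (2.0 * sigma ** 2))
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between a target distribution p and a prediction q."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)

# Toy example with four hypothetical age classes; the true class is 1.
probs = softmax([0.2, 2.0, 1.5, -1.0])
ce = -math.log(probs[1])            # plain cross-entropy for comparison
fl = focal_loss(probs, target=1)    # focal loss is never larger than ce

soft = soft_age_label(target=1, num_classes=4)
kl_near = kl_divergence(soft, softmax([0.0, 0.0, 3.0, 0.0]))  # off by one class
kl_far = kl_divergence(soft, softmax([0.0, 0.0, 0.0, 3.0]))   # off by two classes
print(f"cross-entropy={ce:.3f} focal={fl:.3f}")
print(f"KL near={kl_near:.3f} KL far={kl_far:.3f}")
```

Under these assumptions, a prediction that lands on an adjacent age class is penalized less by the KL loss than one that lands further away, which is one way a loss can reflect age ambiguity.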

Publication Date

Mon Jan 04 2021

Journal Name

Multimedia Tools And Applications

Attention enhancement system for college students with brain biofeedback signals based on virtual reality

Marwan Kadhim Mohammed

TianHan

Rana Kadhim

Song

Publication Date

Thu Dec 15 2016

Journal Name

Research Journal Of Applied Sciences, Engineering And Technology

Building Words Dictionary List Using Symbol Enumeration and Hashing Methodology

Safa

Loay

Publication Date

Mon Dec 31 2012

Journal Name

Al-khwarizmi Engineering Journal

Speech Compression Using Multecirculerletet Transform

Sound

Speech Compression

MCT

DWT

Sulaiman

Ali. K.

Compressing speech reduces data storage requirements and shortens the time needed to transmit digitized speech over long-haul links such as the Internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithms introduced here add desirable features to the current transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using the DWT and the MCT (in one and two dimensions) on speech compression. DWT and MCT performances in terms of comp

Publication Date

Fri Jul 17 2026

Journal Name

Journal Of Baghdad College Of Dentistry

Salivary microRNAs (hsa-miR-200a, hsa-miR-125a and hsa-miR-93) in relation to age, gender and histopathological parameters.

Shaimaa H

Raja H

Ban A

Background: MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate gene expression by targeting specific mRNAs. The main objective of this study was to measure the levels of salivary hsa-miR-200a, hsa-miR-125a and hsa-miR-93 in both oral squamous cell carcinoma patients and healthy controls, and to assess their association with age, gender and tumor grade. Materials and methods: The levels of three salivary microRNAs, namely hsa-miR-200a, hsa-miR-125a and hsa-miR-93, were measured in the saliva of patients with oral squamous cell carcinoma and healthy controls using reverse transcription, preamplification and quantitative PCR. In addition, the general information from each patient including the age, sex and tumor grade were record

Publication Date

Thu Apr 01 2021

Journal Name

Biochem. Cell. Arch.

AGE AND GENDER IMPACT ON GLYCAEMIC CONTROL, RENAL FUNCTION AND OXIDATIVE STRESS PARAMETERS IN IRAQI PATIENTS TYPE 2 DIABETES MELLITUS

Glycaemic control parameters

total antioxidant capacity

reactive oxygen species

type 2 diabetes mellitus

urea

creatinine

Hawraa

Makarim

Ali

Type 2 diabetes mellitus (T2DM) is a global concern boosted by both population growth and ageing; the majority of affected people are aged between 40 and 59 years. The objective of this research was to estimate the impact of age and gender on glycaemic control parameters (fasting blood glucose (FBG), glycated hemoglobin (HbA1C), insulin, insulin resistance (IR) and insulin sensitivity (IS)), renal function parameters (urea and creatinine) and oxidative stress parameters (total antioxidant capacity (TAC) and reactive oxygen species (ROS)). Eighty-one random samples of T2DM patients (35 men and 46 women) were included in this study; their average age was 52.75±9.63 years. The current study found that FBG, HbA1C and IR were highly significant (P<0.01) inc

Publication Date

Sat Dec 01 2018

Journal Name

Al-nahrain Journal Of Science

Image Classification Using Bag of Visual Words (BoVW)

SIFT

Euclidean distance

classification

k-nearest neighbor

Bag of Visual Words.

Rafal

This paper presents two main stages for image classification. The training stage consists of collecting images of interest and applying BoVW to these images (feature extraction and description using SIFT, followed by vocabulary generation), while the testing stage classifies a new unlabeled image using the nearest neighbor classification method on the feature descriptors. Supervised bag of visual words gives good results, as shown clearly in the experimental part, where unlabeled images are classified even though only a small number of images is used in the training process.
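
The two stages described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the descriptors are hypothetical 2D points standing in for SIFT descriptors, the visual vocabulary is assumed to have been generated already (for example by clustering), and all names and labels are invented for the example.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Index of the visual word closest to a descriptor (Euclidean distance)."""
    distances = [math.dist(descriptor, word) for word in vocabulary]
    return distances.index(min(distances))

def bovw_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word occurs in one image."""
    counts = [0.0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

def classify(histogram, training_set):
    """Nearest neighbor classification over the training histograms."""
    best_label, _ = min(
        ((label, math.dist(histogram, h)) for label, h in training_set),
        key=lambda pair: pair[1],
    )
    return best_label

# Assumed vocabulary of three visual words (toy 2D stand-ins for SIFT space).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Training stage: one histogram per labeled image (hypothetical labels).
train_images = {
    "cup": [(0.1, 0.0), (0.0, 0.2), (0.9, 1.1)],
    "fork": [(5.1, 4.9), (4.8, 5.2), (1.0, 0.9)],
}
training_set = [(label, bovw_histogram(desc, vocabulary))
                for label, desc in train_images.items()]

# Testing stage: classify a new unlabeled image from its descriptors.
query = [(0.2, 0.1), (1.1, 1.0), (0.0, 0.1)]
predicted = classify(bovw_histogram(query, vocabulary), training_set)
print(predicted)
```

A real pipeline would extract many high-dimensional descriptors per image and learn the vocabulary from the training set; the fixed vocabulary here only keeps the sketch short.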

Publication Date

Thu Jun 01 2023

Journal Name

Baghdad Science Journal

Comparison of Faster R-CNN and YOLOv5 for Overlapping Objects Recognition

Computer vision

Convolutional neural network

Faster r-cnn

Kitchen utensils

Overlapping object recognition

Yolo

Muhamad Munawar

Rozniza

Muhammad Suzuri

Classifying overlapping objects is one of the main challenges faced by researchers who work in object detection and recognition. Most of the available algorithms are only able to classify or recognize objects that are either individually separated from each other or appear as a single object in a scene, but not overlapping kitchen utensil objects. In this project, the Faster R-CNN and YOLOv5 algorithms were proposed to detect and classify overlapping objects in a kitchen area. YOLOv5 and Faster R-CNN were applied to overlapping objects, where the filters or kernels in the dedicated layers of the applied models are expected to be able to separate the overlapping objects. A kitchen utensil benchmark image database and

Publication Date

Fri May 01 2020

Journal Name

Journal Of Engineering

Building 1D Mechanical Earth Model for Zubair Oilfield in Iraq

Oilfield

Mechanical Earth Model

Wellbore Instability

NonProductive Time Reduction

Pore Pressure Prediction

Aows Khalid

Nada Sabah

Many problems were encountered during drilling operations in the Zubair oilfield. Stuck pipe, wellbore instability, breakouts and washouts, which increased the critical limits problems, were observed in many wells in this field; as a result, extra non-productive time was added to the total drilling time, leading to extra cost. A 1D Mechanical Earth Model (1D MEM) was built to suggest solutions to such problems. An overpressured zone was identified, and an alternative mud weight window was predicted based on the results of the 1D MEM. The results of this study are diagnosed, and wellbore instability problems are predicted efficiently using the 1D MEM. Suitable alternative solutions are presented

Publication Date

Wed Aug 01 2018

Journal Name

Engineering And Technology Journal

A Proposed Method for the Sound Recognition Process

Mustafa

Publication Date

Mon Sep 01 2025

Journal Name

Microbial Biosystems

Harnessing cyanobacteria for a greener tomorrow: CO₂ mitigation and bioconversion to sustainable chemicals and fuels

Jeevitha

Ranjitha

Adian Khalid

Afrah Kadhim
