BEYOND WORDS: HARNESSING SPEECH SOUND FOR SPEAKER AGE AND GENDER DETECTION USING 1D CNN ARCHITECTURE WITH SELF-ATTENTION MECHANISM

Umniah Hameed Jaid

doi:10.5455/jjcit.71-1703265368

Details

Publication Date

Mon Jan 01 2024

Journal Name

Jordanian Journal Of Computers And Information Technology

DOI

10.5455/jjcit.71-1703265368

Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a speaker's precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach that uses Wav2Vec2.0 embeddings to transform raw audio from Mozilla's Common Voice dataset into high-quality feature representations. These are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, showing an age-prediction accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
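The abstract names focal loss and KL divergence loss as ways of handling age ambiguity but does not give their form. The following is a minimal sketch in plain Python, assuming the standard textbook definitions of both losses. The number of age bins, the Gaussian soft-label construction with its `sigma`, and the `gamma` value are illustrative assumptions, not settings taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma=0 reduces to ordinary cross-entropy; a larger gamma down-weights
    examples the model already classifies confidently.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def gaussian_soft_label(target, num_bins, sigma=1.0):
    """Assumed soft label: a Gaussian centred on the true age bin,
    so that neighbouring bins receive some probability mass."""
    weights = [math.exp(-0.5 * ((i - target) / sigma) ** 2) for i in range(num_bins)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target_dist, probs, eps=1e-12):
    """KL(target || predicted) = sum_i t_i * log(t_i / p_i)."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target_dist, probs)
        if t > 0
    )

# Hypothetical example: 6 age bins, true bin is index 2.
probs = softmax([0.1, 1.2, 2.0, 1.1, 0.0, -0.5])
soft = gaussian_soft_label(target=2, num_bins=6, sigma=1.0)
print("focal loss:", focal_loss(probs, target=2))
print("KL loss:   ", kl_divergence(soft, probs))
```

With a soft label, a prediction that lands on an adjacent age bin is penalized less than one that lands far away, which is one plausible reason a distribution-based loss suits ambiguous age labels.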

Publication Date

Mon Jan 01 2024

Journal Name

Journal Of Engineering

Face-based Gender Classification Using Deep Learning Model

Alex-Net

CLAHE

Deep learning

Gender Classification

Buraq Abed Ruda

Faten Abed Ali

...Show More Authors

Gender classification is a critical task in computer vision, with substantial importance in various domains, including surveillance, marketing, and human-computer interaction. The proposed face gender classification model consists of three main phases. The first phase applies the Viola-Jones algorithm to detect faces, which involves four steps: 1) Haar-like features, 2) integral image, 3) AdaBoost learning, and 4) cascade classifier. In the second phase, four pre-processing operations are employed: cropping, resizing, converting the image from the RGB color space to the LAB color space, and enhancing the images using HE and CLAHE. The final phase involves utilizing Transfer lea
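The integral image step of Viola-Jones listed above is what makes Haar-like features cheap to evaluate. The sketch below, in plain Python, illustrates the general technique and is not the authors' implementation. The toy image, the function names, and the left-minus-right sign convention of the two-rectangle feature are assumptions made for illustration.

```python
def integral_image(img):
    """Return the summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels above row r and left of column c,
    so any rectangle sum needs only four lookups.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_haar_feature(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half.

    The width is assumed to be even.
    """
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Hypothetical 3x4 grayscale patch.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))               # sum of the 2x2 block at row 1, col 1
print(two_rect_haar_feature(ii, 0, 0, 3, 4))  # left two columns minus right two
```

Because the table is built once per image, each feature costs a constant number of lookups regardless of rectangle size, which is what lets the cascade scan many windows quickly.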

Publication Date

Sat Jun 15 2019

Journal Name

Journal Of Baghdad College Of Dentistry

Investigation of the consistency between reported chief complaint and periodontal health status of Iraqi patients in relation to age and gender (A retrospective study)

Ali A

Hayder R

Ahmed K

Saif S

...Show More Authors

Background: The chief complaint of patients attending a dental clinic represents the first step towards a treatment plan. However, most patients are not aware of the extent and severity of periodontal disease, which could also be misdiagnosed by the dentist. Aim of the study: To investigate whether reported chief complaint(s) are consistent with oral hygiene status. Materials and methods: Records of 1102 patients attending periodontics clinics at the College of Dentistry, University of Baghdad, were used to determine the ten most commonly reported chief complaints. The sample of patients was further subdivided according to gender and age. In addition, plaque and gingival indices were recorded to determine oral hygiene status. Results: Patients mostly

Publication Date

Fri Mar 29 2024

Journal Name

Iraqi Journal Of Science

Evaluating the Performance and Behavior of CNN, LSTM, and GRU for Classification and Prediction Tasks

Hasanen S.

Nada Hussain

Nada A.Z.

...Show More Authors

Deep learning (DL) plays a significant role in several tasks, especially classification and prediction. Classification tasks can be performed efficiently by convolutional neural networks (CNNs) given a large dataset, while recurrent neural networks (RNNs) can perform prediction tasks because of their ability to retain information from time-series data. In this paper, three models are evaluated on classification and prediction tasks using four datasets (two for each task). These models are a CNN and two RNN variants: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Each model is applied to both tasks in turn to draw a road map of deep learning mod
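To illustrate how a recurrent model retains information across time steps, here is a minimal sketch of a single GRU update in plain Python. It assumes one common textbook formulation (update gate, reset gate, candidate state) in which the update gate weights the previous state; some libraries swap the roles of `z` and `1 - z`. The parameter names, sizes, and toy values are hypothetical and do not come from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[i], x))
        + sum(u * hi for u, hi in zip(U[i], h))
        + b[i]
        for i in range(len(b))
    ]

def gru_step(x, h, p):
    """One GRU update: returns the new hidden state."""
    # Update gate: how much of the old state to keep.
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]
    # Reset gate: how much of the old state feeds the candidate.
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    # Candidate state built from the input and the reset-gated old state.
    n = [math.tanh(v) for v in affine(p["Wn"], x, p["Un"], gated_h, p["bn"])]
    # Blend the old state and the candidate.
    return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]

# Hypothetical toy parameters: input size 1, hidden size 2.
params = {
    "Wz": [[0.5], [-0.3]], "Uz": [[0.1, 0.0], [0.0, 0.1]], "bz": [0.0, 0.0],
    "Wr": [[0.2], [0.4]], "Ur": [[0.0, 0.1], [0.1, 0.0]], "br": [0.0, 0.0],
    "Wn": [[1.0], [-1.0]], "Un": [[0.3, 0.0], [0.0, 0.3]], "bn": [0.0, 0.0],
}

h = [0.0, 0.0]
for x_t in [[1.0], [0.5], [-1.0]]:  # a short univariate time series
    h = gru_step(x_t, h, params)
print(h)
```

An LSTM follows the same pattern but keeps a separate cell state and uses three gates, which is the main structural difference a comparison of the two would exercise.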

Publication Date

Sat Dec 30 2023

Journal Name

Traitement Du Signal

Optimizing Acoustic Feature Selection for Estimating Speaker Traits: A Novel Threshold-Based Approach

Umniah

...Show More Authors

Publication Date

Tue Oct 30 2018

Journal Name

Risalat Al-huquq Journal

Legal Protection for Producers of Sound Recordings under Iraqi Law

Legal protection

Producers

Sound recordings

HAIDER FALIH

...Show More Authors

Piracy of phonograms is now, rightly, regarded as the crime of the electronic age. Despite the protection that States have sought to provide for such recordings, whether at the level of national legislation or international agreements and conventions, piracy has been and continues to be a significant threat to the rights of the producers of those recordings, especially as it is a profitable way for hackers to make a lot of money illegally, which is contrary to the rules of legitimate competition. Hence, this research highlights the legal protection of producers of phonograms in light of the Iraqi Copyright Protection Act No. (3) of 1971, as amended.

Publication Date

Mon Jun 01 2015

Journal Name

Journal Of Engineering

Structural Systems for Modern Architecture in Iraq Analysis Study to Dr. Qahtan Al-Madfa’i’s Architecture

Kahtan Al Madfai

Dynamic

Modernism

Technology

Saddle shapes

Sculpture

Structural.

Mohammed Ridha Shakir

...Show More Authors

Dr. Qahtan Al-Madfa’i’s architecture is marked by a feature that may be both unique and extreme: the use of distinctive three-dimensional structural coverings and the exploitation of structural construction to give an extra aesthetic touch to the composition of the building, in order to apply the universal ideas that he strongly believed in and defended.

In the period of marked urban decline that the country is now undergoing, which urges us to compare the beginning of modern Iraqi architecture and its ascending path up to its peak with the periods of its decline until it reached a very

Publication Date

Sun Feb 25 2024

Journal Name

Baghdad Science Journal

Hybrid CNN-based Recommendation System

CNN

deep learning

Recommendation systems

Social networks

Social recommendation

Muhammad

Roliana

Ali

...Show More Authors

Recommendation systems are now used to address the problem of information overload in several sectors such as entertainment, social networking, and e-commerce. Although conventional approaches to recommendation have achieved significant success in providing item suggestions, they still face many challenges, including the cold-start problem and data sparsity. Numerous recommendation models have been created to address these difficulties. Nevertheless, including user- or item-specific information has the potential to enhance recommendation performance. The ConvFM model is a novel convolutional neural network architecture that combines the capabilities of deep learning for feature extraction with the effectiveness o

Publication Date

Fri Jun 30 2017

Journal Name

Journal Of Engineering

Enhancing Performance of Self–Compacting Concrete with Internal Curing Using Thermostone Chips

self-compacting concrete

internal curing

thermostone chips

silica fume

Nada Mahdi

Amar Yahia Ebrahem

...Show More Authors

This paper investigates the effect of the internal curing technique on the properties of self-compacting concrete (SCC). In this study, SCC is produced using silica fume (SF) as a partial replacement of cement at 5% by weight. Sand is partially replaced by volume with saturated fine lightweight aggregate (LWA), namely thermostone chips, as the internal curing material at three percentages (5%, 10% and 15%), under two external curing conditions, water and air. The experimental work was divided into three parts. In the first part, workability tests of fresh SCC were conducted. The second part included compressive strength and modulus of rupture tests at ages of 7, 28 and 90 days. The third part i

Publication Date

Sun Aug 18 2019

Journal Name

Proceedings Of Mechanical Engineering Research Day 2019

Study of sound absorption of micro perforated panel with visco-thermal effects

Sound absorption

inhomogeneous MPP

finite element method (FEM)

Esraa

Azma

Ali

R. M.

...Show More Authors
