BEYOND WORDS: HARNESSING SPEECH SOUND FOR SPEAKER AGE AND GENDER DETECTION USING 1D CNN ARCHITECTURE WITH SELF-ATTENTION MECHANISM

Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a precise age remains difficult because of age ambiguity: utterances from individuals at adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach, evaluated on Mozilla's Common Voice dataset, that transforms raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, with an age-estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
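
The two loss functions named in the abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the abstract does not say how the KL target distribution is built, so the Gaussian-shaped soft label in `soft_age_target`, the `sigma` width, and the `gamma` value are assumptions made for the example.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy; larger gamma
    down-weights examples the model already classifies confidently.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def soft_age_target(true_bin, n_bins, sigma=1.0):
    """Hypothetical soft label: a Gaussian bump centred on the true age bin,
    so that adjacent age groups receive some probability mass."""
    weights = [math.exp(-0.5 * ((k - true_bin) / sigma) ** 2) for k in range(n_bins)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target, probs, eps=1e-12):
    """KL(target || probs) between a soft label and the model output."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, probs) if t > 0)

# Example: five age bins, true bin is 2, model slightly prefers bin 3.
probs = softmax([0.1, 0.5, 1.5, 1.8, 0.2])
print(focal_loss(probs, target=2))
print(kl_divergence(soft_age_target(2, 5), probs))
```

Under these assumptions the KL loss penalizes a prediction in a neighbouring age bin less than one in a distant bin, which is one plausible way to account for age ambiguity.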

Publication Date
Sat Dec 02 2017
Journal Name
Al-khwarizmi Engineering Journal
Speech Signal Compression Using Wavelet And Linear Predictive Coding

A new algorithm is proposed to compress speech signals using the wavelet transform and linear predictive coding. Compression is based on selecting a small number of approximation coefficients produced by wavelet decomposition (Haar and db4) at a suitably chosen level and discarding the detail coefficients; the approximation coefficients are then windowed by a rectangular window and fed to the linear predictor. The Levinson-Durbin algorithm is used to compute the LP coefficients, reflection coefficients, and prediction error. The compressed files contain the LP coefficients and the previous sample. These files are very small compared to the original signals. Compression ratio is calculated from the size of th
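
The Levinson-Durbin step mentioned above can be sketched as follows. This is a generic textbook version, not the paper's code: it assumes the input is a frame of (already windowed) approximation coefficients, and the function names and the sign convention (prediction-error filter with `a[0] = 1`) are choices made for this example.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a frame x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, k, err):
      a   - prediction-error filter coefficients, a[0] = 1, so that
            x[n] is predicted as -sum(a[j] * x[n - j] for j >= 1)
      k   - reflection coefficients, one per recursion step
      err - final prediction error energy
    """
    if r[0] <= 0:
        raise ValueError("frame has no energy; LP analysis is undefined")
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current predictor's error and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        # Update the coefficients using the reflected previous solution.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + ki * a[i - j]
        new_a[i] = ki
        a = new_a
        err *= 1.0 - ki * ki
    return a, k, err

# Example: LP analysis of order 2 on a short frame.
frame = [0.0, 0.9, 0.7, -0.2, -0.8, -0.5, 0.3, 0.8]
a, k, err = levinson_durbin(autocorrelation(frame, 2), order=2)
print(a, k, err)
```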

Publication Date
Sun Jan 01 2023
Journal Name
Journal Of Robotics And Control (jrc)
Automated Stand-alone Surgical Safety Evaluation for Laparoscopic Cholecystectomy (LC) using Convolutional Neural Network and Constrained Local Models (CNN-CLM)

In this age of rapid development, surgeons have realized that AI can contribute to all aspects of healthcare, especially surgery. This study aims to use Convolutional Neural Network and Constrained Local Models (CNN-CLM) to improve the assessment of Laparoscopic Cholecystectomy (LC) surgery; this cutting-edge technology brings not only opportunities for surgery but also challenges. The problem with the current method of surgery is a lack of safety, along with the specific complications and safety-related problems associated with each laparoscopic cholecystectomy procedure. When CLM is incorporated into CNN models, it is effective at predicting time series tasks like iden

Publication Date
Sat Apr 01 2023
Journal Name
Baghdad Science Journal
Interior Visual Intruders Detection Module Based on Multi-Connect Architecture MCA Associative Memory

Most recent studies have focused on using modern intelligent techniques, especially those developed in the Intruder Detection Module (IDS). Such techniques are built on modern artificial intelligence-based modules. Those modules act like a human brain, so they should be able to learn and to recognize what they have learned. The importance of developing such systems arose from requests by customers and establishments to protect their property and avoid damage by intruders. This would be provided by an intelligent module that ensures the correct alarm. Thus, an interior visual intruder detection module depending on Multi-Connect Architecture Associative Memory (MCA)

Publication Date
Sat May 08 2021
Journal Name
Iraqi Journal Of Science
EEG Signals Analysis for Epileptic Seizure Detection Using DWT Method with SVM and KNN Classifiers

Epilepsy is a serious neurological disorder that strongly affects the lives of those who have it; its prominent features include persistent convulsion periods followed by unconsciousness. The electroencephalogram (EEG) is one of the most commonly used tools for seizure recognition and epilepsy detection. Recognizing convulsions from EEG waves takes a relatively long time because it is done manually by epileptologists. After being captured, the EEG signals are analyzed and categorized into two types: normal or abnormal (indicating an epileptic seizure). This study relies on EEG signals provided by the Arrhythmia Database. Thus, this work is a step beyond the traditional database mission of delivering use
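
A DWT-plus-classifier pipeline of the kind named in the title can be sketched as below. The abstract does not state the wavelet family, the decomposition depth, or the features used, so the Haar wavelet, the sub-band energy features, and the tiny k-nearest-neighbour classifier here are assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter

def haar_dwt(signal):
    """One level of the orthonormal Haar DWT (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def subband_energies(signal, levels):
    """Feature vector: energy of each detail band, then the final approximation."""
    feats = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        feats.append(sum(d * d for d in detail))
    feats.append(sum(a * a for a in current))
    return feats

def knn_predict(train_feats, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    dists = sorted((math.dist(f, query), label)
                   for f, label in zip(train_feats, train_labels))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Example with synthetic segments (hypothetical data, not real EEG).
smooth = [math.sin(0.2 * n) for n in range(16)]
spiky = [(-1) ** n * 2.0 for n in range(16)]
train = [subband_energies(smooth, 2), subband_energies(spiky, 2)]
print(knn_predict(train, ["normal", "abnormal"], subband_energies(spiky, 2), k=1))
```

Because the Haar transform is orthonormal, the sub-band energies sum to the energy of the original segment, which makes them a convenient sanity check when experimenting with other wavelets.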

Publication Date
Fri Mar 29 2024
Journal Name
Iraqi Journal Of Science
Evaluating the Performance and Behavior of CNN, LSTM, and GRU for Classification and Prediction Tasks

Deep learning (DL) plays a significant role in several tasks, especially classification and prediction. Classification tasks can be handled efficiently by convolutional neural networks (CNN) given a large dataset, while recurrent neural networks (RNN) can perform prediction tasks because of their ability to remember time series data. In this paper, three models are evaluated on classification and prediction tasks using four datasets (two for each task). These models are a CNN and two RNN variants: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Each model is applied in turn to both tasks to draw a road map of deep learning mod

Publication Date
Tue May 07 2019
Journal Name
Acm Journal On Emerging Technologies In Computing Systems
Neuromemristive Architecture of HTM with On-Device Learning and Neurogenesis

Hierarchical temporal memory (HTM) is a biomimetic sequence memory algorithm that holds promise for invariant representations of spatial and spatio-temporal inputs. This article presents a comprehensive neuromemristive crossbar architecture for the spatial pooler (SP) and the sparse distributed representation classifier, which are fundamental to the algorithm. There are several unique features in the proposed architecture that tightly link with the HTM algorithm. A memristor that is suitable for emulating the HTM synapses is identified and a new Z-window function is proposed. The architecture exploits the concept of synthetic synapses to enable potential synapses in the HTM. The crossbar for the SP avoids dark spots caused by unutil

Publication Date
Mon Jun 05 2023
Journal Name
Journal Of Engineering
Isolated Word Speech Recognition Using Mixed Transform

Methods of speech recognition have been the subject of several studies over the past decade, and speech recognition has been one of the most exciting areas of signal processing. The mixed transform is a useful tool for speech signal processing; it was developed for its ability to improve feature extraction. Speech recognition includes three important stages: preprocessing, feature extraction, and classification. Recognition accuracy is strongly affected by the feature extraction stage; therefore, different models of mixed transform for feature extraction were proposed. The recorded isolated word is a 1-D signal, and each 1-D word is converted into a 2-D form. The second step of the word recognizer requires, the

Publication Date
Thu Nov 01 2018
Journal Name
2018 1st Annual International Conference On Information And Sciences (aicis)
Speech Emotion Recognition Using Minimum Extracted Features

Recognizing speech emotions is an important subject in pattern recognition. This work studies the effect of extracting the minimum possible number of features on a speech emotion recognition (SER) system. Three experiments were performed to find the approach that gives the best accuracy. The first extracts only three features from emotional speech samples: zero crossing rate (ZCR), mean, and standard deviation (SD); the second extracts only the first 12 Mel-frequency cepstral coefficient (MFCC) features; and the last applies feature fusion between the mentioned features. In all experiments, the features are classified using five types of classification techniques, which are the Random Forest (RF),
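
The three features of the first experiment are simple to compute. The sketch below is illustrative only: the abstract does not say whether features are computed per frame or per utterance, or which standard deviation estimator is used, so the single-frame treatment and the population standard deviation are assumptions.

```python
import statistics

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def minimal_features(frame):
    """Return [ZCR, mean, SD] for one frame of speech samples."""
    return [
        zero_crossing_rate(frame),
        statistics.fmean(frame),
        statistics.pstdev(frame),  # population SD; an assumption for this sketch
    ]

# Example on a short synthetic frame (hypothetical samples).
frame = [0.1, 0.4, -0.2, -0.5, 0.3, 0.6, -0.1, 0.2]
print(minimal_features(frame))
```

The resulting three-element vector could then be passed to any of the classifiers the study compares.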

Publication Date
Wed Jan 13 2021
Journal Name
Iraqi Journal Of Science
YouTube Keyword Search Engine Using Speech Recognition

Visual media is a better way to deliver information than the old way of "reading". For that reason, and with the wide spread of multimedia websites, large video library archives now exist and have become a main resource for people. This research focuses on applying classical phrase search methods to a linked vocal transcript and then retrieving the corresponding video, which is an easier way to search visual media. The system has been implemented using JSP and the Java language to search the speech in videos.

Publication Date
Sat Feb 02 2019
Journal Name
Journal Of The College Of Education For Women
Poetical Words

Poetical Words
