Beyond the immediate content of speech, the voice carries rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, estimating precise age remains difficult because of age ambiguity: utterances from individuals at adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach that uses Wav2Vec2.0 embeddings to transform raw audio from Mozilla's Common Voice dataset into high-quality feature representations. These are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset demonstrate the efficacy of our approach, with an age estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
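As a rough illustration of the two loss functions named above (not the authors' implementation), the following sketch assumes that age is discretized into class bins and that the KL divergence loss compares the predicted distribution with a Gaussian-smoothed label centered on the true bin. The function names and the `gamma` and `sigma` parameters are illustrative assumptions, not taken from the abstract.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def focal_loss(logits, target, gamma=2.0):
    # Focal loss: cross-entropy scaled by (1 - p_t)^gamma, so that
    # confidently classified examples contribute less. gamma is assumed.
    p = softmax(logits)
    p_t = p[np.arange(len(target)), target]
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + 1e-12)))

def soft_age_labels(target, num_classes, sigma=1.0):
    # Assumed label smoothing: a Gaussian over age bins centered on the
    # true bin, so adjacent ages receive part of the probability mass.
    bins = np.arange(num_classes)
    w = np.exp(-0.5 * ((bins[None, :] - target[:, None]) / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(logits, soft_target):
    # KL(soft_target || predicted), averaged over the batch.
    p = softmax(logits)
    eps = 1e-12
    kl = soft_target * (np.log(soft_target + eps) - np.log(p + eps))
    return float(np.mean(np.sum(kl, axis=1)))

# Toy batch: 2 utterances, 3 hypothetical age bins.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
target = np.array([0, 2])
print(focal_loss(logits, target))
print(kl_loss(logits, soft_age_labels(target, num_classes=3)))
```

With `gamma=0.0` the focal loss reduces to ordinary cross-entropy, and a larger `sigma` spreads more of the label mass onto neighboring age bins.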
Two prevalent neurodevelopmental disorders in children are attention deficit hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders describes autism as a condition marked by limitations in social communication as well as restricted, repetitive behavior patterns, while impulsivity, hyperactivity, and lack of concentration are signs of ADHD. Boys experience it more frequently than girls do. This study sought possible factors that put children at risk for autism and ADHD, and it investigated the association between neurodevelopmental disorders in children and parental risk factor i
The micro-perforated panel (MPP) absorber is gaining popularity as an alternative sound absorber in buildings compared to the well-known synthetic porous materials. A single MPP behaves like a typical Helmholtz resonator, with a high absorption amplitude but a narrow absorption bandwidth. To widen the bandwidth, a single MPP can be cascaded with another single MPP to form a double-layer MPP. This paper proposes introducing inhomogeneous perforation into the double-layer MPP system (DL-iMPP) to enhance the absorption bandwidth of a double-layer MPP. Mathematical models are developed using the equivalent electrical circuit model and show good agreement with experiments. It is revealed that the DL-
Interest in children with attention deficit disorder accompanied by hyperactivity has grown because of its prevalence among children of primary school age, with reported rates ranging from 3% to 20%, most of them boys, and because it occurs across all social classes of these children's families. Moreover, the problems associated with it do not end with childhood and often extend into adolescence; Weiss & Hechtman (1989) found that there are signs
In the current worldwide health crisis caused by coronavirus disease (COVID-19), researchers and medical specialists began looking for new ways to tackle the epidemic. According to recent studies, machine learning (ML) has been effectively deployed in the health sector. Medical imaging sources (radiography and computed tomography) have aided in the development of artificial intelligence (AI) strategies to tackle the coronavirus outbreak. As a result, a classical machine learning approach for coronavirus detection from computed tomography (CT) images was developed. In this study, the convolutional neural network (CNN) model for feature extraction and support vector machine (SVM) for the classification of axial
Automatic speaker recognition may achieve remarkable performance in matched training and test conditions. Conversely, results drop significantly in mismatched noisy conditions. Furthermore, feature extraction significantly affects performance. Mel-frequency cepstral coefficients (MFCCs) are the most commonly used features in this field of study. The literature has reported that the conditions for training and testing are highly correlated. Taken together, these facts support strong recommendations for using MFCC features in similar environmental conditions (train/test) for speaker recognition. However, when noise and reverberation are present, MFCC performance is not reliable. To address this, we propose a new feature 'entrocy' for accurate and robu