Beyond the immediate content of speech, the voice carries rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, estimating age precisely remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach, trained on Mozilla's Common Voice dataset, that transforms raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These embeddings are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation for different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, showing an accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
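As a rough illustration of the two loss functions named above, the following minimal sketch computes a focal loss and a KL divergence loss for a single prediction over age classes. It is not the authors' implementation: the function names, the `gamma` and `sigma` parameters, and in particular the Gaussian soft label used as the KL target are assumptions made here for illustration, since the abstract does not say how the target distribution is built.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: cross-entropy scaled by (1 - p_t)^gamma.

    gamma is an assumed hyperparameter; gamma = 0 gives plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def gaussian_soft_label(target, num_classes, sigma=1.0):
    """Assumed soft target: a normalized Gaussian centred on the true age class,
    so that neighbouring age classes receive some probability mass."""
    weights = [math.exp(-0.5 * ((k - target) / sigma) ** 2) for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence_loss(logits, soft_target, eps=1e-12):
    """KL(soft_target || softmax(logits)) for one sample."""
    probs = softmax(logits)
    return sum(
        q * math.log((q + eps) / (p + eps))
        for q, p in zip(soft_target, probs)
        if q > 0.0
    )


# Hypothetical example: 8 age classes, true class 3, a model that leans to class 4.
example_logits = [0.1, 0.2, 0.5, 1.5, 2.0, 0.4, 0.1, 0.0]
example_focal = focal_loss(example_logits, target=3)
example_kl = kl_divergence_loss(example_logits, gaussian_soft_label(3, 8))
```

With a soft target of this kind, predicting an adjacent age class is penalized less than predicting a distant one, which is one plausible way a distribution-based loss could ease age ambiguity.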
The subject of our study is parenthetical words and their meanings in modern Russian. Before delving into the study itself, we needed to define the concept of parenthetical words and their function and place in the system of the Russian language. According to V. G. Lebedev and L. S. Tyureva, "Parenthetical words are introduced into a sentence to express the speaker's attitude toward the thought being expressed, an assessment of its content...
Theoretical trends have emerged that introduce the concept of biological structures into contemporary architecture. They have spread through architectural production that reflects the nature of construction structures based on the ideas and principles of biological structures in biological science. Although many architectural proposals have tried to explain the concept in the field of architecture, it has not been dealt with in depth or given a comprehensive definition. There was therefore a need to trace the concept from its beginning in the biological field, as a general framework, down to the field of architecture, for the purpose of reducing the limits of their search framework T...
The opening is one of the important pillars of a film's construction as a whole. Any film narrative begins at the borders of this opening: it is the window through which we view the contents of the tale and its narrative puzzle, and its significance is what the viewer later draws on when reasoning about, and finding justifications for, what the opening introduced. The opening may be the window that leads us to the core of the story, understood through signals sent to the recipient, who follows paths to decode what is encoded. It may also serve as a set of keys for understanding the signs and signals that surround what it holds inevita...
In this paper, a theoretical attempt is made to determine whether changes in aortic diameter at different locations along the aorta can be detected by brachial artery measurement. The aorta is divided into six main parts, each consisting of 4 lumps of length 0.018 m. A selected section of the aorta is assumed to undergo a radius change of 100%, 200%, or 500%. The results show a significant change for part 2 (lumps 5-8) compared with the other parts. This indicates that the position nearest to the artery produces the significant change in the artery pressure wave, while the other parts of the aorta have a small effect.
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that impairs speech, social interaction, and behavior. Machine learning is a field of artificial intelligence that focuses on creating algorithms that learn patterns from input data, and it can be used to classify ASD. The results of using machine learning algorithms to classify ASD have been inconsistent, and more research is needed to improve classification accuracy. To address this, deep learning models such as the 1D CNN have been proposed as an alternative for ASD classification. The proposed techniques are evaluated on three publicly available ASD datasets (children, adults, and adolescents). Results strongly suggest that 1D...
Digital image manipulation has become increasingly prevalent due to the widespread availability of sophisticated image editing tools. In copy-move forgery, a portion of an image is copied and pasted into another area within the same image. The proposed methodology begins by extracting Local Binary Pattern (LBP) features from the image. Two main statistical functions, Standard Deviation (STD) and Angular Second Moment (ASM), are computed for each LBP feature, capturing additional statistical information about the local textures. Next, a multi-level LBP feature selection is applied to select the most relevant features. This process involves performing LBP computation at multiple scales or levels, capturing textures at different...
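The first two steps described above (LBP extraction, then STD and ASM statistics) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: it uses the basic 3x3, 8-neighbour LBP on a grayscale block given as a list of lists, takes STD over the LBP codes, and takes ASM as the sum of squared normalized histogram values of those codes. The abstract does not specify these details, and the function names are hypothetical.

```python
import math

# Clockwise 8-neighbourhood offsets (row, column), starting at the top-left pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Return the basic 3x3 LBP code of every interior pixel, in row-major order.

    A neighbour contributes a 1 bit when its gray level is >= the centre pixel.
    """
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            codes.append(code)
    return codes


def std_and_asm(codes):
    """Return (STD of the codes, ASM of their normalized histogram).

    ASM here is the sum of squared histogram probabilities (an assumed definition);
    it equals 1.0 for a perfectly uniform texture and shrinks as codes diversify.
    """
    n = len(codes)
    mean = sum(codes) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in codes) / n)
    histogram = {}
    for v in codes:
        histogram[v] = histogram.get(v, 0) + 1
    asm = sum((count / n) ** 2 for count in histogram.values())
    return std, asm


# Hypothetical 4x4 grayscale block.
block = [
    [10, 12, 11, 13],
    [9, 20, 8, 14],
    [15, 7, 25, 6],
    [11, 16, 5, 18],
]
block_std, block_asm = std_and_asm(lbp_codes(block))
```

In a copy-move setting, statistics like these would typically be computed per image block so that duplicated regions yield matching feature vectors; how the paper compares blocks is not stated in the visible text.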
In this work, silver (Ag) self-metallization on a polyimide (PI) film was carried out through autocatalytic plating. PI films were prepared by the solution casting method, followed by etching with potassium hydroxide (KOH) solution, sensitization with tin chloride (SnCl2), and activation of the PI surface with palladium chloride (PdCl2). Energy-dispersive X-ray analysis (EDX) showed the highest peak in the Ag region and confirmed the presence of AgNPs. The diffraction peaks at 2θ = 38.2°, 44.5°, 64.6°, and 78.2° represented the (111), (200), (220), and (311) planes of Ag, respectively. The FT–IR an...
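As a quick consistency check on the peak assignment above, the sketch below applies Bragg's law to the four reported 2θ values and converts each interplanar spacing into a cubic lattice parameter. The X-ray wavelength is an assumption: the abstract does not state the source, so Cu Kα (1.5406 Å) is used here, and the function names are hypothetical.

```python
import math

# Assumed wavelength (Cu K-alpha, in angstroms); the abstract does not name the X-ray source.
CU_K_ALPHA_ANGSTROM = 1.5406


def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA_ANGSTROM):
    """Interplanar spacing from first-order Bragg's law: d = lambda / (2 sin(theta))."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


def cubic_lattice_parameter(two_theta_deg, hkl, wavelength=CU_K_ALPHA_ANGSTROM):
    """Lattice parameter of a cubic crystal: a = d * sqrt(h^2 + k^2 + l^2)."""
    h, k, l = hkl
    return d_spacing(two_theta_deg, wavelength) * math.sqrt(h * h + k * k + l * l)


# Peak positions and plane indices as reported in the abstract.
PEAKS = [
    (38.2, (1, 1, 1)),
    (44.5, (2, 0, 0)),
    (64.6, (2, 2, 0)),
    (78.2, (3, 1, 1)),
]

lattice_estimates = [cubic_lattice_parameter(angle, hkl) for angle, hkl in PEAKS]
```

Under the assumed wavelength, all four peaks give lattice parameters that agree closely with one another and lie near the accepted value for face-centred cubic silver (about 4.09 Å), which is what a correct indexing would produce.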