Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender offers a wide range of applications, spanning from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing precise age remains intricate due to age ambiguity. Specifically, utterances from individuals at adjacent ages are frequently indistinguishable. Addressing this, we propose a novel, end-to-end approach that deploys Mozilla's Common Voice dataset to transform raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then channeled into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results from the Common Voice dataset underscore the efficacy of our approach, showcasing an accuracy of 87% for male speakers, 91% for female speakers and 89% overall accuracy, and an accuracy of 99.1% for gender prediction.
The aim of this research is to collect the semantically restricted vocabulary from linguistic vocabulary and make it regular in one wire with an in-depth study. This study is important in detecting the exact meanings of the language. On the genre, as shown in this research, and our purpose to reveal this phenomenon, where it shows the accuracy of Arabic in denoting the meanings, the research has overturned more than sixty-seven words we extracted from the stomachs of the glossaries and books of language, and God ask safety intent and payment of opinion.
Hierarchical temporal memory (HTM) is a biomimetic sequence memory algorithm that holds promise for invariant representations of spatial and spatio-temporal inputs. This article presents a comprehensive neuromemristive crossbar architecture for the spatial pooler (SP) and the sparse distributed representation classifier, which are fundamental to the algorithm. There are several unique features in the proposed architecture that tightly link with the HTM algorithm. A memristor that is suitable for emulating the HTM synapses is identified and a new Z-window function is proposed. The architecture exploits the concept of synthetic synapses to enable potential synapses in the HTM. The crossbar for the SP avoids dark spots caused by unutil
... Show MoreIn information security, fingerprint verification is one of the most common recent approaches for verifying human identity through a distinctive pattern. The verification process works by comparing a pair of fingerprint templates and identifying the similarity/matching among them. Several research studies have utilized different techniques for the matching process such as fuzzy vault and image filtering approaches. Yet, these approaches are still suffering from the imprecise articulation of the biometrics’ interesting patterns. The emergence of deep learning architectures such as the Convolutional Neural Network (CNN) has been extensively used for image processing and object detection tasks and showed an outstanding performance compare
... Show MoreGender classification is a critical task in computer vision. This task holds substantial importance in various domains, including surveillance, marketing, and human-computer interaction. In this work, the face gender classification model proposed consists of three main phases: the first phase involves applying the Viola-Jones algorithm to detect facial images, which includes four steps: 1) Haar-like features, 2) Integral Image, 3) Adaboost Learning, and 4) Cascade Classifier. In the second phase, four pre-processing operations are employed, namely cropping, resizing, converting the image from(RGB) Color Space to (LAB) color space, and enhancing the images using (HE, CLAHE). The final phase involves utilizing Transfer lea
... Show MoreAutomatic speaker recognition may achieve remarkable performance in matched training and test conditions. Conversely, results drop significantly in incompatible noisy conditions. Furthermore, feature extraction significantly affects performance. Mel-frequency cepstral coefficients MFCCs are most commonly used in this field of study. The literature has reported that the conditions for training and testing are highly correlated. Taken together, these facts support strong recommendations for using MFCC features in similar environmental conditions (train/test) for speaker recognition. However, with noise and reverberation present, MFCC performance is not reliable. To address this, we propose a new feature 'entrocy' for accurate and robu
... Show MoreBackground: Chief complaint of patients attending dental clinic represents the first step towards treatment plan. However, most of patients are not aware but the extent and severity of periodontal disease, which could be also, misdiagnose by the dentist. Aim of the study: To investigate whether reported chief complaint(s) are consistent with oral hygiene status Materials and methods: Records of 1102 patients, attending periodontics clinics in the college of dentistry/ university of Baghdad, were used to determine ten most commonly reported chief complaints. Sample of patients was further subdivided according to gender and age. In addition, plaque and gingival index were recorded to determine oral hygiene status. Results: Patients mostly
... Show MoreIn the current worldwide health crisis produced by coronavirus disease (COVID-19), researchers and medical specialists began looking for new ways to tackle the epidemic. According to recent studies, Machine Learning (ML) has been effectively deployed in the health sector. Medical imaging sources (radiography and computed tomography) have aided in the development of artificial intelligence(AI) strategies to tackle the coronavirus outbreak. As a result, a classical machine learning approach for coronavirus detection from Computerized Tomography (CT) images was developed. In this study, the convolutional neural network (CNN) model for feature extraction and support vector machine (SVM) for the classification of axial
... Show MoreThis paper is devoted to investigate the effect of internal curing technique on the properties of self-compacting concrete (SCC). In this study, SCC is produced by using silica fume (SF) as partial replacement by weight of cement with percentage of (5%), sand is partially replaced by volume with saturated fine lightweight aggregate (LWA) which is thermostone chips as internal curing material in three percentages of (5%, 10% and 15%) for SCC, two external curing conditions water and air. The experimental work was divided into three parts: in the first part, the workability tests of fresh SCC were conducted. The second part included conducting compressive strength test and modulus of rupture test at ages of (7, 28 and 90). The third part i
... Show More