Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender offers a wide range of applications, spanning from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing precise age remains intricate due to age ambiguity. Specifically, utterances from individuals at adjacent ages are frequently indistinguishable. Addressing this, we propose a novel, end-to-end approach that deploys Mozilla's Common Voice dataset to transform raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These are then channeled into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results from the Common Voice dataset underscore the efficacy of our approach, showcasing an accuracy of 87% for male speakers, 91% for female speakers and 89% overall accuracy, and an accuracy of 99.1% for gender prediction.
Classifying an overlapping object is one of the main challenges faced by researchers who work in object detection and recognition. Most of the available algorithms that have been developed are only able to classify or recognize objects which are either individually separated from each other or a single object in a scene(s), but not overlapping kitchen utensil objects. In this project, Faster R-CNN and YOLOv5 algorithms were proposed to detect and classify an overlapping object in a kitchen area. The YOLOv5 and Faster R-CNN were applied to overlapping objects where the filter or kernel that are expected to be able to separate the overlapping object in the dedicated layer of applying models. A kitchen utensil benchmark image database and
... Show MoreIn the present paper, three reliable iterative methods are given and implemented to solve the 1D, 2D and 3D Fisher’s equation. Daftardar-Jafari method (DJM), Temimi-Ansari method (TAM) and Banach contraction method (BCM) are applied to get the exact and numerical solutions for Fisher's equations. The reliable iterative methods are characterized by many advantages, such as being free of derivatives, overcoming the difficulty arising when calculating the Adomian polynomial boundaries to deal with nonlinear terms in the Adomian decomposition method (ADM), does not request to calculate Lagrange multiplier as in the Variational iteration method (VIM) and there is no need to create a homotopy like in the Homotopy perturbation method (H
... Show MoreCompressing the speech reduces the data storage requirements, leading to reducing the time of transmitting the digitized speech over long-haul links like internet. To obtain best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry.The MCT bases functions are derived from GHM bases function using 2D linear convolution .The fast computation algorithm methods introduced here added desirable features to the current transform. We further assess the performance of the MCT in speech compression application. This paper discusses the effect of using DWT and MCT (one and two dimension) on speech compression. DWT and MCT performances in terms of comp
... Show MoreIn this paper two main stages for image classification has been presented. Training stage consists of collecting images of interest, and apply BOVW on these images (features extraction and description using SIFT, and vocabulary generation), while testing stage classifies a new unlabeled image using nearest neighbor classification method for features descriptor. Supervised bag of visual words gives good result that are present clearly in the experimental part where unlabeled images are classified although small number of images are used in the training process.
Background: MicroRNAs (miRNAs) are small noncoding RNAs that postâ€transcriptionally regulate gene expression by targeting specific mRNAs. The main objective of this study was measure the level of salivary (hsa-miR-200a, hsa-miR-125a and hsa- miR-93) in both oral squamous cell carcinoma and healthy controls to asses the association of them with age, gender and tumor grade materials and methods The level of three salivary microRNAs namely hsa-miR-200a, hsa-miR-125a and hsa- miR-93 were measured in saliva of patients with oral squamous cell carcinoma and healthy controls by using reveres transcription, preamplification and quantitative PCR also the general information from each patient including the age, sex and tumor grade were record
... Show MoreType 2 daibetes mellitus (T2DM) is a global concern boosted by both population growth and ageing, the majority of affected people are aged between (40- 59 year). The objective of this research was to estimate the impact of age and gender on glycaemic control parameters: Fasting blood glucose (FBC), glycated hemoglobin (HbA1C), insulin, insulin resistance (IR) and insulin sensitivity (IS), renal function parameters: urea, creatinine and oxidative stress parameters: total antioxidant capacity (TAC) and reactive oxygen species (ROS). Eighty-one random samples of T2DM patients (35 men and 46 women) were included in this study, their average age was 52.75±9.63 year. Current study found that FBG, HbA1C and IR were highly significant (P<0.01) inc
... Show More