Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a speaker's precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach that transforms raw audio from Mozilla's Common Voice dataset into high-quality feature representations using Wav2Vec2.0 embeddings. These embeddings are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation for different durations of speech. Experimental results on the Common Voice dataset underscore the efficacy of our approach, showing an accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
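As an illustration of the two loss functions named above, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not give the exact formulations, so the focusing parameter `gamma`, the Gaussian smoothing of age labels (`soft_age_labels`, `sigma`), and the treatment of ages as ordered class indices are all assumptions made for this example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def focal_loss(logits, target, gamma=2.0):
    # Focal loss: cross-entropy scaled by (1 - p_t)^gamma, so confidently
    # correct samples contribute less. gamma=2.0 is an assumed default.
    p = softmax(logits)
    p_t = p[np.arange(len(target)), target]
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + 1e-12)))

def soft_age_labels(target, num_classes, sigma=1.0):
    # Hypothetical helper: spread each hard age label over neighbouring
    # age classes with a Gaussian, to reflect age ambiguity.
    classes = np.arange(num_classes)
    w = np.exp(-0.5 * ((classes[None, :] - target[:, None]) / sigma) ** 2)
    return w / w.sum(axis=-1, keepdims=True)

def kl_divergence_loss(logits, soft_targets):
    # KL(q || p) between the soft target q and the model prediction p.
    log_p = np.log(softmax(logits) + 1e-12)
    log_q = np.log(soft_targets + 1e-12)
    return float(np.mean(np.sum(soft_targets * (log_q - log_p), axis=-1)))
```

With soft targets of this kind, a prediction that lands in an age class adjacent to the true one is penalized less than a prediction far from it, which is one plausible way a KL-based loss could soften the effect of age ambiguity.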
Deep learning techniques are used across a wide range of fields and applications. In recent years, deep learning-based object detection in aerial and terrestrial photos has become a popular research topic. The goal of object detection in computer vision is to predict the presence of one or more objects, along with their classes and bounding boxes. YOLO (You Only Look Once) is a modern object detector that can detect objects in real time with both accuracy and speed. A neural network from the YOLO family of computer vision models predicts the bounding boxes and class probabilities for an image in a single pass. In layman's terms, it is a technique for instantly identifying and rec
The research aimed to: 1- identify perceived self-efficacy among university students. 2- determine the significance of statistical differences in perceived self-efficacy according to gender and specialization. The research sample consisted of (300) students chosen at random from the original research community: (150) males and (150) females, from scientific and humanities specializations. The research tool was a scale prepared to measure perceived self-efficacy, based on measures and previous literature on the subject. The researcher used a number of statistical methods, including the t-test and two-way analysis of variance, and the results showed that the research s
Speech recognition is a very important field that can be used in many applications, such as access control for protected areas, banking, transactions over the telephone network, database access services, voice email, investigations, and house control and management. Speech recognition systems can be used in two modes: to identify a particular person or to verify a person’s claimed identity. Family speaker recognition is a recent area within speaker recognition. Speakers from the same family often have similar characteristics, which makes them hard to tell apart. Today, the scope of speech recognition is limited to speech collected from cooperative users in real-world office environments and without adverse microphone or channel impairments.
Eye detection is used in many applications, such as pattern recognition, biometrics, surveillance systems, and many other systems. In this paper, a new method is presented to detect and extract the overall shape of one eye from an image, based on two principles: Helmholtz and Gestalt. According to the Helmholtz principle of perception, any observed geometric shape is perceptually "meaningful" if the number of its repetitions is very small in an image with a random distribution. To achieve this goal, the Gestalt principle states that humans see things either by grouping their similar elements or by recognizing patterns. In general, according to Gestalt Principle, humans see things through genera
Clinical keratoconus (KCN) detection is a challenging and time-consuming task. In the diagnosis process, ophthalmologists must review demographic and clinical ophthalmic examinations. The latter include slit-lamp examinations, corneal topographic maps, and Pentacam indices (PI). We propose an Ensemble of Deep Transfer Learning (EDTL) based on corneal topographic maps. We consider four pretrained networks, SqueezeNet (SqN), AlexNet (AN), ShuffleNet (SfN), and MobileNet-v2 (MN), and fine-tune them on a dataset of KCN and normal cases, each including four topographic maps. We also consider a PI classifier. Then, our EDTL method combines the output probabilities of each of the five classifiers to obtain a decision b
Spraying pesticides is one of the most common procedures used to control pests. However, excessive use of these chemicals adversely affects the surrounding environment, including the soil, plants, animals, and the operator. Therefore, researchers have been encouraged to...
Objective(s): To evaluate teachers’ performance of counseling for pupils with Attention Deficit and Hyperactivity Disorder, and to identify the relationship between teachers’ performance of counseling for pupils with Attention Deficit and Hyperactivity Disorder and their demographic characteristics.
Methodology: A quasi-experimental (pre-posttest) design was carried out at Al-Firdous mixed primary school to evaluate teachers’ performance of counseling for pupils with Attention Deficit and Hyperactivity Disorder, and to find out the association between teachers' performance regarding Attention Deficit and Hyperactivity Disorder and their socio-demographic characteristics. The study was started from 18th September 2
Advances in machine learning and image processing methods, along with the availability of medical imaging data, have greatly increased the use of machine learning strategies in the medical field. Neural networks, and more recently convolutional neural networks (CNNs) in particular, provide powerful descriptors for computer-aided diagnosis systems. Even so, several issues arise when working with medical images: many medical images have a poor noise-to-signal ratio (NSR) compared to scenes captured with a digital camera, which generally results in low spatial resolution and very low contrast between different body tissues, making it difficult to co
Background: Osteoporosis is a metabolic bone disease that affects women more than men. It is characterized by a generalized reduction of bone mineral density (BMD), leaving fragile, weak bone that is liable to fracture. The gonial angle index (GAI) is one of the radio-morphometric indices, and it has been controversial whether it is related to bone mineral density, to ageing, or to neither. The aim of the study is to evaluate the role of cone beam computed tomography (CBCT) as a screening tool for diagnosis of osteoporosis and age effect in females using the gonial angle index. Material and method: 60 females were divided into 3 groups according to age and BMD status: Group 1 (non-osteoporosis, 20-30 years), Group 2 (non-osteoporosis, 50 years and above),