Beyond the immediate content of speech, the voice can provide rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach that transforms raw audio from Mozilla's Common Voice dataset into high-quality feature representations using Wav2Vec2.0 embeddings. These are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate estimation accuracy at different speech durations. Experimental results on the Common Voice dataset underscore the efficacy of our approach, with an age estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
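As a rough illustration of the two loss functions named in this abstract, the following is a minimal pure-Python sketch for a single training example. It is not the authors' implementation: the function names, the focusing parameter `gamma`, and the Gaussian soft label (one common way to express that adjacent ages are hard to tell apart) are assumptions made for this example.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (max is subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma is an assumed hyperparameter; gamma = 0 gives plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def gaussian_soft_label(target, num_classes, sigma=1.0):
    """Assumed soft label: probability mass spread over neighbouring age bins."""
    weights = [math.exp(-0.5 * ((k - target) / sigma) ** 2)
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence_loss(logits, soft_target):
    """KL(soft_target || softmax(logits)); zero-mass bins contribute nothing."""
    probs = softmax(logits)
    return sum(q * math.log(q / p)
               for q, p in zip(soft_target, probs) if q > 0.0)

# Hypothetical usage with 8 age bins and a true bin index of 3.
example_logits = [0.1, 0.4, 1.2, 2.0, 1.1, 0.3, 0.0, -0.5]
example_focal = focal_loss(example_logits, target=3)
example_kl = kl_divergence_loss(example_logits, gaussian_soft_label(3, 8))
```

With a soft label, predicting a neighbouring age bin is penalized less than predicting a distant one, which is one plausible reason a KL-based loss might help with age ambiguity.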
A new algorithm is proposed to compress speech signals using the wavelet transform and linear predictive coding. Compression is based on keeping a small number of approximation coefficients obtained from a wavelet decomposition (Haar and db4) at a suitably chosen level and discarding the detail coefficients; the approximation coefficients are then windowed by a rectangular window and fed to the linear predictor. The Levinson-Durbin algorithm is used to compute the LP coefficients, reflection coefficients, and prediction error. The compressed files contain the LP coefficients and the previous sample. These files are very small compared to the original signals. Compression ratio is calculated from the size of th
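The analysis steps described in this abstract can be sketched roughly as follows. This is an illustrative pure-Python outline, not the authors' code: it covers only the Haar wavelet (not db4), and the function names, the decomposition level, and the predictor order are assumptions chosen for the example. A rectangular window leaves the samples unchanged, so the frame is passed to the predictor as is.

```python
import math

def haar_approximation(signal, levels=1):
    """Keep only Haar approximation coefficients; detail coefficients are dropped."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last sample
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2.0)
                  for i in range(0, len(approx), 2)]
    return approx

def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a rectangular-windowed frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations recursively.

    Returns (a, k, err): a[0] = 1 and x[n] is predicted as -sum(a[j] * x[n - j]),
    k holds the reflection coefficients, err is the final prediction error.
    Assumes r[0] > 0 (a non-silent frame).
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k_i = -acc / err
        k.append(k_i)
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k_i * a[i - j]
        updated[i] = k_i
        a = updated
        err *= 1.0 - k_i * k_i
    return a, k, err

# Hypothetical usage on a synthetic signal (level 2, order 2 are arbitrary choices).
toy_signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(64)]
toy_approx = haar_approximation(toy_signal, levels=2)
lp_coeffs, reflection, pred_error = levinson_durbin(autocorrelation(toy_approx, 2), 2)
```

Each decomposition level halves the number of retained coefficients, which is where most of the size reduction in such a scheme would come from before the predictor is applied.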
Many problems were encountered during drilling operations in the Zubair oilfield. Stuck pipe, wellbore instability, breakouts, and washouts, which increased the critical limits problems, were observed in many wells in this field, adding non-productive time to the total drilling time and therefore extra cost. A 1D Mechanical Earth Model (1D MEM) was built to suggest solutions to such problems. An overpressured zone is identified, and an alternative mud weight window is predicted from the results of the 1D MEM. The results of this study are diagnosed and wellbore instability problems are predicted in an efficient way using the 1D MEM. Suitable alternative solutions are presented
Most recent studies have focused on using modern intelligent techniques, especially those developed for the Intruder Detection Module (IDS). Such techniques are built on modern artificial intelligence-based modules. Those modules act like a human brain; thus, they should be able to learn and to recognize what they have learned. The importance of developing such systems arose from the requests of customers and establishments to protect their properties and avoid damage by intruders. This would be provided by an intelligent module that ensures the correct alarm. Thus, an interior visual intruder detection module depending on Multi-Connect Architecture Associative Memory (MCA)
Abstract
This study aims to identify the level of empathy among university students, as well as significant differences in empathy in terms of gender and specialization. To achieve these aims, an empathy scale was administered to a random sample of 450 students from Baghdad University. The results showed that the study sample has a level of empathy. There is a significant difference in empathy between males and females, in favor of the female students. There is no significant difference in empathy in terms of specialization (scientific, humanities), nor for the interaction with gender. The study concluded with a number of recommendations and suggestions.
In this golden age of rapid development, surgeons have realized that AI could contribute to healthcare in all aspects, especially in surgery. This study aims to combine a Convolutional Neural Network with Constrained Local Models (CNN-CLM) to improve the assessment of Laparoscopic Cholecystectomy (LC) surgery; such cutting-edge technology brings not only opportunities for surgery but also challenges on the way forward. The problem with the current method of surgery is the lack of safety, together with the specific complications associated with each laparoscopic cholecystectomy procedure. When CLM is utilized in CNN models, it is effective at predicting time series tasks like iden
Methods of speech recognition have been the subject of several studies over the past decade. Speech recognition has been one of the most exciting areas of signal processing. The mixed transform is a useful tool for speech signal processing; it was developed for its ability to improve feature extraction. Speech recognition includes three important stages: preprocessing, feature extraction, and classification. Recognition accuracy is strongly affected by the feature extraction stage; therefore, different models of mixed transform for feature extraction were proposed. The recorded isolated word is a 1-D signal, so each 1-D word is converted into a 2-D form. The second step of the word recognizer requires, the
Recognizing speech emotions is an important subject in pattern recognition. This work studies the effect of extracting the minimum possible number of features on a speech emotion recognition (SER) system. In this paper, three experiments were performed to find the approach that gives good accuracy. The first extracts only three features from emotional speech samples: zero crossing rate (ZCR), mean, and standard deviation (SD); the second extracts only the first 12 Mel frequency cepstral coefficient (MFCC) features; and the last applies feature fusion between the mentioned features. In all experiments, the features are classified using five types of classification techniques, which are the Random Forest (RF),
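The three features of the first experiment are simple enough to sketch directly. The following pure-Python example is illustrative only: the function names are hypothetical, the population form of the standard deviation is an assumption, and treating feature fusion as plain concatenation is also an assumption, since the abstract does not say how the features are combined. MFCC extraction is not shown.

```python
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ (0 counts as positive)."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(samples) - 1)

def mean(samples):
    return sum(samples) / len(samples)

def standard_deviation(samples):
    """Population standard deviation (divides by N, an assumed convention)."""
    mu = mean(samples)
    return math.sqrt(sum((x - mu) ** 2 for x in samples) / len(samples))

def extract_basic_features(samples):
    """Return the three-element feature vector [ZCR, mean, SD]."""
    return [zero_crossing_rate(samples), mean(samples), standard_deviation(samples)]

def fuse_features(basic_features, mfcc_features):
    """Assumed fusion strategy: concatenate the two feature vectors."""
    return list(basic_features) + list(mfcc_features)

# Hypothetical usage: a short synthetic frame and placeholder MFCC values.
frame = [math.sin(0.8 * n) for n in range(200)]
placeholder_mfcc = [0.0] * 12  # stands in for the first 12 MFCCs
fused = fuse_features(extract_basic_features(frame), placeholder_mfcc)
```

Under these assumptions the fused vector has 15 entries per utterance, which can be passed to any of the classifiers used in the experiments.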
Elastic electron scattering form factors, charge density distributions, and charge, neutron, and matter root mean square (rms) radii for the ²⁴Mg, ²⁸Si, and ³²S nuclei are studied using the effect of occupation numbers. Single-particle radial wave functions of the harmonic-oscillator (HO) potential are used. In general, the results for the elastic charge form factors show good agreement with experimental data. The occupation numbers are chosen to reproduce the quantities mentioned above. The inclusion of occupation numbers brings the form factors closer to the data. For the calculated charge density distributions, the results show good agreement with experimental data, except for the failure to reproduce the hump in the central region for the ²⁸Si nucleus.