Beyond the immediate content of speech, the voice carries rich information about a speaker's demographics, including age and gender. Estimating a speaker's age and gender has a wide range of applications, from voice forensic analysis to personalized advertising, healthcare monitoring, and human-computer interaction. However, pinpointing a precise age remains difficult because of age ambiguity: utterances from individuals of adjacent ages are frequently indistinguishable. To address this, we propose a novel end-to-end approach, trained on Mozilla's Common Voice dataset, that transforms raw audio into high-quality feature representations using Wav2Vec2.0 embeddings. These embeddings are then fed into our self-attention-based convolutional neural network (CNN) model. To address age ambiguity, we evaluate the effects of different loss functions, such as focal loss and Kullback-Leibler (KL) divergence loss. Additionally, we evaluate the accuracy of the estimation at different durations of speech. Experimental results on the Common Voice dataset demonstrate the efficacy of our approach, with an age-estimation accuracy of 87% for male speakers, 91% for female speakers, and 89% overall, and an accuracy of 99.1% for gender prediction.
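As a rough illustration of the two loss functions named above, the following sketch computes focal loss and KL divergence loss for a single utterance over discrete age classes. The abstract does not say how the age targets are encoded, so the Gaussian soft labels, the `sigma` and `gamma` values, and all function names here are assumptions made for illustration and not the authors' implementation.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: cross-entropy scaled by (1 - p_t)^gamma,
    which down-weights examples the model already classifies confidently."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def soft_age_labels(target, num_classes, sigma=1.0):
    """Hypothetical soft target: a Gaussian centred on the true age class,
    so adjacent ages receive some probability mass (assumption)."""
    weights = [math.exp(-((k - target) ** 2) / (2 * sigma ** 2))
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target_dist, probs, eps=1e-12):
    """KL(target || prediction) for one sample."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target_dist, probs) if t > 0)

# Example: six age classes, true class index 2.
probs = softmax([0.1, 1.2, 2.0, 1.1, 0.0, -1.0])
print(focal_loss(probs, target=2))
print(kl_divergence(soft_age_labels(2, 6), probs))
```

With `gamma=0`, the focal loss reduces to ordinary cross-entropy. Under the soft-label assumption, a prediction that lands on a neighbouring age class is penalized less by the KL term than one that lands far away, which is one plausible way to reflect age ambiguity.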
General propositions have dealt with various indicators and features that frame and describe basic architectural concepts. Among those concepts, this work presents the concept of identity, which represents the core of the intellectual vision of how architecture develops, transforms, and changes. Because of its deep intellectual basis, it was necessary to study multiple features, especially the achievement feature, which is considered a major stage describing the nature of the change and shift related to the achievement of the concept and its role in the development of the architectural field.
Printed Arabic document image retrieval is an important and much-needed capability for many companies, governments, and other users. In this paper, a printed Arabic document image retrieval system based on spotting the header words of official Arabic documents is proposed. The proposed system uses efficient segmentation and preprocessing methods, together with an accurate proposed feature extraction method, to prepare the document for the classification process. A Support Vector Machine (SVM) is used for classification. The experiments show that the system achieved its best accuracy, 96.8%, using the polynomial kernel of the SVM classifier.
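For readers unfamiliar with the polynomial kernel mentioned above, a minimal sketch of the standard form K(x, y) = (gamma * x·y + coef0)^degree follows. The parameter names and default values are assumptions chosen for illustration; the abstract does not report the degree or other kernel settings used.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Standard polynomial kernel between two feature vectors.
    degree, gamma, and coef0 are illustrative defaults (assumptions)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree

# Example with two small, made-up feature vectors.
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2))
```

An SVM with this kernel separates classes using polynomial combinations of the input features without computing those combinations explicitly.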
Humanity's relationship with the environment is a delicate balance. Since the industrial revolution, the world's population has grown at an exponential rate, and this has had a major environmental effect. Deforestation, pollution, and global climate change are just a few of the negative consequences of population and technological growth. Particulates, sulphur dioxide (SO2), and nitrogen oxides (NOx) are the primary pollutants that harm our health. These contaminants may be directly emitted into the atmosphere (primary pollutants) or formed in the atmosphere from reactions of primary pollutants (secondary pollutants). Tropospheric ozone is created when water reacts with volatile organic compounds (VOC) and nitrogen oxides (NOx) in the presen
Automatic speaker recognition may achieve remarkable performance in matched training and test conditions. Conversely, results drop significantly in mismatched noisy conditions. Furthermore, feature extraction significantly affects performance. Mel-frequency cepstral coefficients (MFCCs) are the most commonly used features in this field of study. The literature has reported that the conditions for training and testing are highly correlated. Taken together, these facts support strong recommendations for using MFCC features in similar environmental conditions (train/test) for speaker recognition. However, when noise and reverberation are present, MFCC performance is not reliable. To address this, we propose a new feature 'entrocy' for accurate and robu
This research deals with assimilation as one of the phonological processes in language. It is an attempt to give more attention to this important process in English, with a detailed explanation of its counterpart in Arabic. In addition, this study sheds light on the similarities and differences concerning this process in the two languages. Assimilation in English means that two sounds are involved, and one becomes more like the other.
The assimilating phoneme picks up one or more of the features of another nearby phoneme. The English phoneme /n/ has t
This paper delves into the significant role played by local social and traditional structures in shaping Traditional Community Tenure (TCT) within Iraqi Land Tenure Legislation (ILTR), and examines their impact on gender inequalities, with a specific focus on women's land tenure rights. The methodological approach employed in this study identified the sources of barriers to gender equality within TCT as outlined in ILTR at two different bilateral levels, with input obtained from key stakeholders in a selected city in Iraq. The case study survey encompassed three districts, which served as local layers within the historic sectors of the Iraqi city of Al-Nasiriya. The study employed quantitative methods, including a household survey, with
Diabetic retinopathy is an eye disease caused by pressure on the nerve fibers of the eye. It is a major cause of blindness in middle-aged and older age groups; it is therefore essential to diagnose it early. One challenge in diagnosing the disease is detecting the edges in the image: some important edges may be missed because of noise around the corners.
To reduce these effects, this paper proposes a new technique for edge detection that uses traditional operators in combination with fuzzy logic based on a fuzzy inference system. The results show that the proposed fuzzy edge detection technique performs better than traditional techniques, with vascular structures detected markedly more clearly than in the original.
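The general idea of combining a traditional operator with fuzzy logic can be sketched as follows. This is only an illustrative assumption of how such a pipeline might look: it uses the Sobel operator, a single ramp-shaped "high gradient" membership function, and one rule (a pixel is an edge if either gradient is high). The operators, membership functions, rules, and thresholds actually used in the paper are not given in the abstract.

```python
def sobel(img, r, c):
    """Horizontal and vertical Sobel gradients at pixel (r, c)."""
    gx = ((img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1])
          - (img[r - 1][c - 1] + 2 * img[r][c - 1] + img[r + 1][c - 1]))
    gy = ((img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1])
          - (img[r - 1][c - 1] + 2 * img[r - 1][c] + img[r - 1][c + 1]))
    return gx, gy

def high(value, low=0.1, full=0.5):
    """Ramp membership for 'gradient is high' (thresholds are hypothetical)."""
    v = abs(value)
    if v <= low:
        return 0.0
    if v >= full:
        return 1.0
    return (v - low) / (full - low)

def fuzzy_edge_map(img):
    """Degree of 'edgeness' in [0, 1] per pixel; border pixels stay 0.
    Rule: IF gx is high OR gy is high THEN edge (fuzzy OR taken as max)."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx, gy = sobel(img, r, c)
            out[r][c] = max(high(gx), high(gy))
    return out

# Example: a 5x5 image (intensities in [0, 1]) with a vertical step edge.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = fuzzy_edge_map(image)
print(edges[2])
```

Because the output is a membership degree and not a hard yes/no decision, weak gradients caused by noise can be given low edge membership instead of being either kept or discarded outright.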
The charge density distributions (CDD) and the elastic electron scattering form factors F(q) of the ground state for some odd-mass nuclei in the 2s-1d shell, such as $^{19}$K, $^{25}$Mg, $^{27}$Al, $^{29}$Si, and $^{31}$P, have been calculated using the occupation numbers of the states and the single-particle wave functions of the harmonic oscillator potential, with size parameters chosen to reproduce the observed root mean square charge radii for all considered nuclei. It is found that introducing two additional parameters, which reflect the difference of the occupation numbers of the states from the prediction of the simple shell model, leads to very good agreement between the calculated an