Audio classification is the process to classify different audio types according to contents. It is implemented in a large variety of real world problems, all classification applications allowed the target subjects to be viewed as a specific type of audio and hence, there is a variety in the audio types and every type has to be treatedcarefully according to its significant properties.Feature extraction is an important process for audio classification. This workintroduces several sets of features according to the type, two types of audio (datasets) were studied. Two different features sets are proposed: (i) firstorder gradient feature vector, and (ii) Local roughness feature vector, the experimentsshowed that the results are competitive to those gotten from other popular methods inthis field, such as Zero Crossing Rate (ZCR), Amplitude Descriptor (AD), Short Time Energy (STE), and Volume (Vo). The test results indicated, that the attained averageaccuracy of classification is improved up to94.9232% for training set and 95.8666%for testing set.The classification performance of these two extracted featuresets is studied individually, and then they used together as one feature set. Theiroverall performance is investigated, the test results showed that the proposed methods give high classification rates for the audio.
<span>Digital audio is required to transmit large sizes of audio information through the most common communication systems; in turn this leads to more challenges in both storage and archieving. In this paper, an efficient audio compressive scheme is proposed, it depends on combined transform coding scheme; it is consist of i) bi-orthogonal (tab 9/7) wavelet transform to decompose the audio signal into low & multi high sub-bands, ii) then the produced sub-bands passed through DCT to de-correlate the signal, iii) the product of the combined transform stage is passed through progressive hierarchical quantization, then traditional run-length encoding (RLE), iv) and finally LZW coding to generate the output mate bitstream.
... Show MoreThe speaker identification is one of the fundamental problems in speech processing and voice modeling. The speaker identification applications include authentication in critical security systems and the accuracy of the selection. Large-scale voice recognition applications are a major challenge. Quick search in the speaker database requires fast, modern techniques and relies on artificial intelligence to achieve the desired results from the system. Many efforts are made to achieve this through the establishment of variable-based systems and the development of new methodologies for speaker identification. Speaker identification is the process of recognizing who is speaking using the characteristics extracted from the speech's waves like pi
... Show MoreThe interests toward developing accurate automatic face emotion recognition methodologies are growing vastly, and it is still one of an ever growing research field in the region of computer vision, artificial intelligent and automation. However, there is a challenge to build an automated system which equals human ability to recognize facial emotion because of the lack of an effective facial feature descriptor and the difficulty of choosing proper classification method. In this paper, a geometric based feature vector has been proposed. For the classification purpose, three different types of classification methods are tested: statistical, artificial neural network (NN) and Support Vector Machine (SVM). A modified K-Means clustering algorithm
... Show MoreMany purposes require communicating audio files between the users using different applications of social media. The security level of these applications is limited; at the same time many audio files are secured and must be accessed by authorized persons only, while, most present works attempt to hide single audio file in certain cover media. In this paper, a new approach of hiding three audio signals with unequal sizes in single color digital image has been proposed using the frequencies transform of this image. In the proposed approach, the Fast Fourier Transform was adopted where each audio signal is embedded in specific region with high frequencies in the frequency spectrum of the cover image to sa
... Show MoreThe present study aims to identify wisdom-based thinking and its relationship to psychological capital. It further aims to find out the differences in the level of wisdom-based thinking and psychological capital according to the variables of gender and specialization (scientific, humanities). To achieve this, the study has been conducted on a sample of (380) male and female students. The two scales, wisdom-based thinking and psychological capital are implemented to the sample after being constructed by the researcher and after ensuring their psychometric characteristics' suitability for the study's aims. Results concerning the first aim have shown that there is a significant relationship among students. The second aim has revealed that t
... Show MoreThis abstract focuses on the significance of wireless body area networks (WBANs) as a cutting-edge and self-governing technology, which has garnered substantial attention from researchers. The central challenge faced by WBANs revolves around upholding quality of service (QoS) within rapidly evolving sectors like healthcare. The intricate task of managing diverse traffic types with limited resources further compounds this challenge. Particularly in medical WBANs, the prioritization of vital data is crucial to ensure prompt delivery of critical information. Given the stringent requirements of these systems, any data loss or delays are untenable, necessitating the implementation of intelligent algorithms. These algorithms play a pivota
... Show MoreDuring COVID-19, wearing a mask was globally mandated in various workplaces, departments, and offices. New deep learning convolutional neural network (CNN) based classifications were proposed to increase the validation accuracy of face mask detection. This work introduces a face mask model that is able to recognize whether a person is wearing mask or not. The proposed model has two stages to detect and recognize the face mask; at the first stage, the Haar cascade detector is used to detect the face, while at the second stage, the proposed CNN model is used as a classification model that is built from scratch. The experiment was applied on masked faces (MAFA) dataset with images of 160x160 pixels size and RGB color. The model achieve
... Show MoreIn this article, the research presents a general overview of deep learning-based AVSS (audio-visual source separation) systems. AVSS has achieved exceptional results in a number of areas, including decreasing noise levels, boosting speech recognition, and improving audio quality. The advantages and disadvantages of each deep learning model are discussed throughout the research as it reviews various current experiments on AVSS. The TCD TIMIT dataset (which contains top-notch audio and video recordings created especially for speech recognition tasks) and the Voxceleb dataset (a sizable collection of brief audio-visual clips with human speech) are just a couple of the useful datasets summarized in the paper that can be used to test A
... Show MoreTechnologies of the theatre show elements get great transferring concerning the creative embodiment of its aesthetic elements including its raws, forms, parts and masses in an attempt to achieve the prin cipal expressive progress to embody the main theme of the idea and the intended subject .
This is within the criterion of supporting the way to deal with technologies (décor elements , lighting music tones vocal affects , fashion and makeup) that achieving the emagintional appropriate atmosbheres which the writer and the director of the theatre show aim to make it present and succeed by furming active participation tunches of the ability of the cinogra . phic – element dsigners in order to invlve the theatre space atmospheres in cl