Recognizing speech emotions is an important subject in pattern recognition. This work is about studying the effect of extracting the minimum possible number of features on the speech emotion recognition (SER) system. In this paper, three experiments performed to reach the best way that gives good accuracy. The first one extracting only three features: zero crossing rate (ZCR), mean, and standard deviation (SD) from emotional speech samples, the second one extracting only the first 12 Mel frequency cepstral coefficient (MFCC) features, and the last experiment applying feature fusion between the mentioned features. In all experiments, the features are classified using five types of classification techniques, which are the Random Forest (RF),
... Show MoreMethods of speech recognition have been the subject of several studies over the past decade. Speech recognition has been one of the most exciting areas of the signal processing. Mixed transform is a useful tool for speech signal processing; it is developed for its abilities of improvement in feature extraction. Speech recognition includes three important stages, preprocessing, feature extraction, and classification. Recognition accuracy is so affected by the features extraction stage; therefore different models of mixed transform for feature extraction were proposed. The properties of the recorded isolated word will be 1-D, which achieve the conversion of each 1-D word into a 2-D form. The second step of the word recognizer requires, the
... Show MoreVisual media is a better way to deliver the information than the old way of "reading". For that reason with the wide propagation of multimedia websites, there are large video library’s archives, which came to be a main resource for humans. This research puts its eyes on the existing development in applying classical phrase search methods to a linked vocal transcript and after that it retrieves the video, this an easier way to search any visual media. This system has been implemented using JSP and Java language for searching the speech in the videos
Feature extraction provide a quick process for extracting object from remote sensing data (images) saving time to urban planner or GIS user from digitizing hundreds of time by hand. In the present work manual, rule based, and classification methods have been applied. And using an object- based approach to classify imagery. From the result, we obtained that each method is suitable for extraction depending on the properties of the object, for example, manual method is convenient for object, which is clear, and have sufficient area, also choosing scale and merge level have significant effect on the classification process and the accuracy of object extraction. Also from the results the rule-based method is more suitable method for extracting
... Show MoreSpeech recognition is a very important field that can be used in many applications such as controlling to protect area, banking, transaction over telephone network database access service, voice email, investigations, House controlling and management ... etc. Speech recognition systems can be used in two modes: to identify a particular person or to verify a person’s claimed identity. The family speaker recognition is a modern field in the speaker recognition. Many family speakers have similarity in the characteristics and hard to identify between them. Today, the scope of speech recognition is limited to speech collected from cooperative users in real world office environments and without adverse microphone or channel impairments.
The speech recognition system has been widely used by many researchers using different
methods to fulfill a fast and accurate system. Speech signal recognition is a typical
classification problem, which generally includes two main parts: feature extraction and
classification. In this paper, a new approach to achieve speech recognition task is proposed by
using transformation techniques for feature extraction methods; namely, slantlet transform
(SLT), discrete wavelet transforms (DWT) type Daubechies Db1 and Db4. Furthermore, a
modified artificial neural network (ANN) with dynamic time warping (DTW) algorithm is
developed to train a speech recognition system to be used for classification and recognition
purposes. T
Information processing has an important application which is speech recognition. In this paper, a two hybrid techniques have been presented. The first one is a 3-level hybrid of Stationary Wavelet Transform (S) and Discrete Wavelet Transform (W) and the second one is a 3-level hybrid of Discrete Wavelet Transform (W) and Multi-wavelet Transforms (M). To choose the best 3-level hybrid in each technique, a comparison according to five factors has been implemented and the best results are WWS, WWW, and MWM. Speech recognition is performed on WWS, WWW, and MWM using Euclidean distance (Ecl) and Dynamic Time Warping (DTW). The match performance is (98%) using DTW in MWM, while in the WWS and WWW are (74%) and (78%) respectively, but when using (
... Show More