Compressing speech reduces data storage requirements and, in turn, the time needed to transmit digitized speech over long-haul links such as the Internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithms introduced here add desirable features to the current transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using the DWT and the MCT (one- and two-dimensional) on speech compression. DWT and MCT performance is assessed in terms of compression ratio (CR), mean square error (MSE) and peak signal-to-noise ratio (PSNR). Computer simulation results indicate that the two-dimensional MCT offers a better compression ratio, MSE and PSNR than the DWT.
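The three figures of merit named in this abstract can be computed as in the following minimal sketch. The abstract does not give its exact definitions, so the formulas here are the commonly used ones and are assumptions: CR as original size divided by compressed size, and PSNR as 10·log10(peak²/MSE) with the peak taken from the original signal unless supplied. The function names are illustrative only.

```python
import math


def mse(original, reconstructed):
    """Mean square error between two equal-length sample sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, peak=None):
    """Peak signal-to-noise ratio in dB.

    Assumption: when `peak` is not given, the largest absolute sample of
    the original signal is used as the peak value.
    """
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf  # identical signals
    if peak is None:
        peak = max(abs(x) for x in original)
    return 10 * math.log10(peak ** 2 / err)


def compression_ratio(original_size, compressed_size):
    """Assumed definition: original size divided by compressed size."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return original_size / compressed_size


if __name__ == "__main__":
    x = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    y = [0.0, 0.4, 0.9, 0.5, 0.1, -0.5, -1.0, -0.6]
    print("MSE :", mse(x, y))
    print("PSNR:", psnr(x, y), "dB")
    print("CR  :", compression_ratio(16000, 4000))
```

A lower MSE and a higher PSNR both indicate a reconstruction closer to the original, while a higher CR indicates stronger compression, which is the sense in which the abstract calls one transform "better" than another.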
Audio Compression Using Transform Coding with LZW and Double Shift Coding. Zainab J. Ahmed & Loay E. George. Conference paper, New Trends in Information and Communications Technology Applications. First Online: 11 January 2022. Part of the Communications in Computer and Information Science book series (CCIS, volume 1511). Abstract: The need for audio compression is still a vital issue, because of its significance in reducing the data size of one of the most common digital media that is exchanged between distant parties. In this paper, the efficiencies of two audio compression modules were investigated; the first module is based on discrete cosine transform and the second module is based on discrete wavelet tr
Digital images are widely used in computer applications. This paper proposes a method of image zooming based upon the inverse slantlet transform and image scaling. The slantlet transform (SLT) is based on the principle of designing different filters for different scales.
First, the SLT is applied to the color image; the idea is to transform the color image into the slantlet domain, where the large coefficients mainly represent the signal and the smaller ones represent noise. These coefficients are then suitably modified by scaling up the image with box and Bartlett filters, so that the image is scaled up by 2×2, and the inverse slantlet transform is applied to the modified coefficients to obtain the reconstructed image.
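To illustrate only the two scaling filters named here, the following one-dimensional sketch doubles the length of a sequence by zero insertion followed by filtering. It is not the paper's method: the slantlet transform step is omitted, the 2D case is reduced to 1D, and the kernel values and function names are assumptions chosen so that the box filter replicates samples and the Bartlett (triangular) filter interpolates linearly.

```python
def upsample_zero_insert(signal):
    """Insert a zero after every sample, doubling the length."""
    out = []
    for s in signal:
        out.extend([s, 0.0])
    return out


def filter_centered(signal, kernel, center):
    """Same-length convolution; samples outside the signal count as zero."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - (k - center)
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(acc)
    return out


def zoom2x_box(signal):
    """Box filter [1, 1]: each sample is repeated (nearest-neighbour zoom)."""
    return filter_centered(upsample_zero_insert(signal), [1.0, 1.0], center=0)


def zoom2x_bartlett(signal):
    """Bartlett filter [0.5, 1, 0.5]: new samples are averages of neighbours.

    The final output sample is attenuated because the signal is treated
    as zero beyond its last sample.
    """
    return filter_centered(upsample_zero_insert(signal), [0.5, 1.0, 0.5], center=1)


if __name__ == "__main__":
    row = [2, 4, 8]
    print(zoom2x_box(row))
    print(zoom2x_bartlett(row))
```

For a 2D image, a separable version would apply the same function along rows and then along columns, which is one way to read the 2×2 scaling mentioned in the abstract.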
This paper is concerned with combining two different transforms to present a new joint transform, FHET, and its inverse transform, IFHET. The most important property of FHET, the finite Hankel-Elzaki transform of the Bessel differential operator, is derived and proved; this property is discussed for two different boundary conditions, Dirichlet and Robin. The importance of this property is shown by solving axisymmetric partial differential equations through a direct transition to an algebraic equation. The joint finite Hankel-Elzaki transform method is also applied to a mathematical-physical problem, the Hotdog Problem. A steady state which does not depend on time was discussed f
This paper presents the application of a framework for fast and efficient compressive sampling based on the concept of random sampling of sparse audio signals. It provides four important features. (i) It is universal with a variety of sparse signals. (ii) The number of measurements required for exact reconstruction is nearly optimal, much less than the sampling frequency and below the Nyquist frequency. (iii) It has very low complexity and fast computation. (iv) It is developed on a provable mathematical model from which we are able to quantify trade-offs among streaming capability, computation/memory requirement and quality of reconstruction of the audio signal. Compressed sensing (CS) is an attractive compression scheme due to its uni
Deep learning convolutional neural networks have been widely used to recognize or classify voice. Various techniques have been used together with convolutional neural networks to prepare voice data before the training process in developing the classification model. However, not all models can produce good classification accuracy, as there are many types of voice or speech. Arabic alphabet pronunciation is one such type, and accurate pronunciation is required in learning to read the Qur’an. Thus, processing the pronunciation and training on the processed data require a specific approach. To overcome this issue, a method based on padding and deep learning convolution neural network is proposed to
This paper aims at studying the illocutionary speech acts: direct and indirect to show the most dominant ones in a presidential speech delivered by the USA president. The speech is about the most critical health issue in the world, COVID-19 outbreak. A descriptive qualitative study was conducted by observing the first speech delivered by president Trump concerning coronavirus outbreak and surveying the illocutionary acts: directive, declarative, commissive, expressive, and representative. Searle's (1985) classification of illocutionary speech acts is adopted in the analysis.
What are the main types of the illocutionary speech acts performed by Trump in his speech?; Why does
The speech recognition system has been widely used by many researchers using different
methods to fulfill a fast and accurate system. Speech signal recognition is a typical
classification problem, which generally includes two main parts: feature extraction and
classification. In this paper, a new approach to achieve speech recognition task is proposed by
using transformation techniques for feature extraction methods; namely, slantlet transform
(SLT), discrete wavelet transforms (DWT) type Daubechies Db1 and Db4. Furthermore, a
modified artificial neural network (ANN) with dynamic time warping (DTW) algorithm is
developed to train a speech recognition system to be used for classification and recognition
purposes. T
Data-driven models perform poorly on part-of-speech tagging for the square Hmong language, for which only a low-resource corpus is available. This paper designs a weight evaluation function to reduce the influence of unknown words. It proposes an improved harmony search algorithm that uses roulette and local evaluation strategies to handle the square Hmong part-of-speech tagging problem. The experiment shows that the average accuracy of the proposed model is 6% and 8% higher than that of the HMM and BiLSTM-CRF models, respectively. Meanwhile, the average F1 of the proposed model is 6% and 3% higher than that of the HMM and BiLSTM-CRF models, respectively.
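The roulette strategy mentioned in this abstract usually refers to fitness-proportionate selection. The sketch below shows that generic technique only; how the paper combines it with harmony search and its weight evaluation function is not stated in the abstract, so the function name, the weights and the candidate tags are hypothetical.

```python
import random


def roulette_select(weights, rng=random):
    """Return an index chosen with probability proportional to its weight.

    Generic roulette-wheel selection; weights must be non-negative and
    must not all be zero.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    threshold = rng.random() * total  # a point on the wheel in [0, total)
    cumulative = 0.0
    for index, w in enumerate(weights):
        cumulative += w
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


if __name__ == "__main__":
    # Hypothetical candidate tags with illustrative scores.
    tags = ["NOUN", "VERB", "ADJ"]
    scores = [0.6, 0.3, 0.1]
    rng = random.Random(0)
    counts = {t: 0 for t in tags}
    for _ in range(1000):
        counts[tags[roulette_select(scores, rng)]] += 1
    print(counts)
```

Candidates with larger scores are picked more often, while low-scoring candidates still have a chance of being selected, which keeps some diversity in the search.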