Deep learning convolutional neural networks (CNNs) have been widely used to recognize or classify voice. Various techniques have been used together with CNNs to prepare voice data before training the classification model. However, not all models produce good classification accuracy, as there are many types of voice and speech. Arabic alphabet pronunciation is one such type, and accurate pronunciation is required in learning to read the Qur’an. Thus, processing the pronunciation and training on the processed data require a specific approach. To address this, a method based on padding and a deep learning CNN is proposed to evaluate the pronunciation of the Arabic alphabet. Voice data from six school children are recorded and used to test the performance of the proposed method. The padding technique is used to augment the voice data before feeding them to the CNN to develop the classification model. In addition, three other feature extraction techniques are introduced for comparison with the proposed padding-based method. The performance of the proposed method with padding is on par with the spectrogram and better than the mel-spectrogram and mel-frequency cepstral coefficients. Results also show that the proposed method was able to distinguish the Arabic letters that are difficult to pronounce. The proposed method with padding may be extended to assess pronunciation beyond the Arabic alphabet.
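As an illustration of the padding step described in this abstract, the following minimal sketch brings variable-length recordings to a common length so they can be batched for a CNN. The abstract does not specify the padding scheme, so trailing zero-padding is an assumption here, and the function name `pad_clips` and its parameters are hypothetical.

```python
def pad_clips(clips, target_len=None, pad_value=0.0):
    """Pad (or truncate) each 1-D clip to a common length.

    Assumption: trailing zero-padding; the abstract does not say
    which padding scheme the authors actually used.
    """
    if target_len is None:
        # Default to the longest clip so that nothing is truncated.
        target_len = max(len(clip) for clip in clips)
    padded = []
    for clip in clips:
        samples = list(clip)[:target_len]
        padded.append(samples + [pad_value] * (target_len - len(samples)))
    return padded


# Three toy "recordings" of different lengths.
clips = [[0.1, 0.2, 0.3], [0.5], [0.4, -0.4]]
batch = pad_clips(clips)
```

After padding, every row of `batch` has the same length, which is what a fixed-size CNN input layer requires.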
Data-driven models perform poorly on part-of-speech tagging for the square Hmong language, which has only a low-resource corpus. This paper designs a weight evaluation function to reduce the influence of unknown words. It proposes an improved harmony search algorithm that uses roulette and local evaluation strategies to handle the square Hmong part-of-speech tagging problem. The experiment shows that the average accuracy of the proposed model is 6% and 8% higher than that of the HMM and BiLSTM-CRF models, respectively. Meanwhile, the average F1 of the proposed model is 6% and 3% higher than that of the HMM and BiLSTM-CRF models, respectively.
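The roulette strategy mentioned in this abstract can be sketched as fitness-proportional (roulette-wheel) selection. The abstract does not say how the paper integrates it into harmony search, so the sketch below only shows the generic selection step; the function `roulette_select`, the tag names, and the weights are all hypothetical.

```python
import random


def roulette_select(candidates, weights, rng=random):
    """Pick one candidate with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # A point in [0, total) lands in exactly one candidate's slice.
    threshold = rng.random() * total
    cumulative = 0.0
    for candidate, weight in zip(candidates, weights):
        cumulative += weight
        if threshold < cumulative:
            return candidate
    # Only reachable through floating-point rounding.
    return candidates[-1]


# Hypothetical candidate tags for one word, with illustrative weights.
tags = ["NOUN", "VERB", "ADJ"]
rng = random.Random(0)
draws = [roulette_select(tags, [8.0, 1.0, 1.0], rng) for _ in range(2000)]
noun_share = draws.count("NOUN") / len(draws)
```

Higher-weight candidates are chosen more often, while low-weight candidates keep a nonzero chance, which preserves some exploration during the search.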
Arabic calligraphy is one of the ancient arts rooted in history, and its emergence as a tool of linguistic communication has been addressed by conflicting views and writings. Teaching calligraphy is both an art and a science, because it depends on fixed principles and precise rules and because it is centered on beauty. Teaching Arabic calligraphy also targets speed, as education and recitation help in writing fast, which is of great interest both in the field of education and in life. Arabic calligraphy also accompanied the scientific renaissance and significant knowledge in the Ara
This study develops several axes of research, including: the linguistic and idiomatic meaning of the term (deliberative), the description of language as a social phenomenon, and the definition of the theory of speech acts. It also discusses the problem of the conflict between tradition and innovation. Its objective is to revive deliberative thought among Arab scholars and to strike a balance between Arab and Western rhetoric, which meet in intellectual necessity, through a sober reading that preserves the prestige of the Arabic language and its position in light of the growing linguistic sciences, given that we have inherited unique minds and a vast heritage capable of consolidating an Arab theory in linguistics.
Speech is the essential means of interaction between humans and between humans and machines. However, it is always contaminated with different types of environmental noise. Therefore, speech enhancement algorithms (SEAs) have emerged as a significant approach in the speech processing field to suppress background noise and recover the original speech signal. In this paper, a new efficient two-stage SEA with low distortion is proposed based on the minimum mean square error sense. The clean signal is estimated by taking advantage of Laplacian speech and noise modeling based on the distribution of orthogonal transform (Discrete Krawtchouk-Tchebichef transform) coefficients. The Discrete Kra
A three-stage learning algorithm for deep multilayer perceptron (DMLP) with effective weight initialisation based on sparse auto-encoder is proposed in this paper, which aims to overcome difficulties in training deep neural networks with limited training data in high-dimensional feature space. At the first stage, unsupervised learning is adopted using sparse auto-encoder to obtain the initial weights of the feature extraction layers of the DMLP. At the second stage, error back-propagation is used to train the DMLP by fixing the weights obtained at the first stage for its feature extraction layers. At the third stage, all the weights of the DMLP obtained at the second stage are refined by error back-propagation. Network structures an
In this work, a hybrid scheme for speaker verification from Arabic speech is presented. The scheme is hybrid in that it utilizes both traditional digital signal processing and a neural network. A Kohonen neural network is used as the recognizer for speaker verification after spectral features are extracted from the acoustic signal by the Fast Fourier Transform (FFT) algorithm. The system was implemented on a Pentium-compatible 1000 MHz processor running MS-DOS 6.2.
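The spectral feature extraction step in this abstract can be illustrated with a small radix-2 FFT that turns one frame of audio samples into a magnitude spectrum. This is a generic sketch, not the system described above: the frame length, the absence of windowing, and the names `fft` and `magnitude_spectrum` are assumptions.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(frame):
    """Magnitudes of the non-redundant bins (0 .. n/2) of a real frame."""
    spectrum = fft(frame)
    return [abs(value) for value in spectrum[: len(frame) // 2 + 1]]


# A 32-sample frame holding a pure tone that falls exactly on bin 4.
frame = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
features = magnitude_spectrum(frame)
peak_bin = max(range(len(features)), key=features.__getitem__)
```

A feature vector like `features` is the kind of input that could then be passed to a recognizer such as a Kohonen network.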
Most recent studies have focused on using modern intelligent techniques, especially those developed for the Intruder Detection Module (IDS). Such techniques are built on modern artificial intelligence-based modules that act like a human brain; thus, they should be able to learn and to recognize what they have learned. The importance of developing such systems arose from the requests of customers and establishments to protect their properties and avoid intruders’ damage. This can be provided by an intelligent module that ensures a correct alarm. Thus, an interior visual intruder detection module depending on Multi-Connect Architecture Associative Memory (MCA)
Natural gas and oil are among the mainstays of the global economy. However, many issues surround the pipelines that transport these resources, including aging infrastructure, environmental impacts, and vulnerability to sabotage operations. Such issues can result in leakages in these pipelines, requiring significant effort to detect and pinpoint their locations. The objective of this project is to develop and implement a method for detecting oil spills caused by leaking oil pipelines using aerial images captured by a drone equipped with a Raspberry Pi 4. Using the message queuing telemetry transport Internet of Things (MQTT IoT) protocol, the acquired images and the global positioning system (GPS) coordinates of the images' acquisition are
The emergence of SARS-CoV-2, the virus responsible for the COVID-19 pandemic, has resulted in a global health crisis leading to widespread illness, death, and daily life disruptions. Having a vaccine for COVID-19 is crucial to controlling the spread of the virus, which will help to end the pandemic and restore normalcy to society. Messenger RNA (mRNA) vaccines have led the way as the fastest vaccine candidates for COVID-19, but they face key potential limitations, including spontaneous degradation. To address mRNA degradation issues, Stanford University academics and the Eterna community sponsored a Kaggle competition. This study aims to build a deep learning (DL) model which will predict degradation rates at each base of the mRNA
Speech recognition is an important application of information processing. In this paper, two hybrid techniques are presented. The first is a 3-level hybrid of the Stationary Wavelet Transform (S) and the Discrete Wavelet Transform (W), and the second is a 3-level hybrid of the Discrete Wavelet Transform (W) and Multi-wavelet Transforms (M). To choose the best 3-level hybrid in each technique, a comparison according to five factors was carried out, and the best results are WWS, WWW, and MWM. Speech recognition is performed on WWS, WWW, and MWM using Euclidean distance (Ecl) and Dynamic Time Warping (DTW). The match performance is 98% using DTW with MWM, while with WWS and WWW it is 74% and 78%, respectively, but when using (
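The two matching measures named in the speech recognition abstract above, Euclidean distance and Dynamic Time Warping (DTW), can be contrasted with a short sketch. It operates on plain 1-D sequences rather than wavelet coefficients, and the helper `classify`, the template labels, and the toy values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance; requires sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic DTW with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b
                cost[i][j - 1],      # b[j-1] also matched an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical templates; the query is a time-stretched copy of "rising".
templates = {"rising": [1.0, 2.0, 3.0], "falling": [3.0, 2.0, 1.0]}
query = [1.0, 1.0, 2.0, 3.0]
label = classify(query, templates)
```

Unlike Euclidean distance, DTW can compare sequences of different lengths and tolerates differences in speaking rate, which is one plausible reason for it to match better on speech.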