Voice Activity Detection (VAD) is an important pre-processing step in speech processing systems such as speech enhancement, speech recognition, and gender and age identification. VAD reduces the time required to process speech data and improves final system accuracy by focusing the work on the voiced part of the speech. This paper presents an automatic VAD technique based on a Fuzzy-Neuro approach (FN-AVAD). The aim of this work is to alleviate the problem of choosing the best threshold value in traditional VAD methods and to achieve automaticity by combining fuzzy clustering and machine learning techniques. Four features are extracted from each speech segment: short-term energy, zero-crossing rate, autocorrelation, and log energy. A modified version of fuzzy C-Means is then used to group the speech segments into three clusters: two for voice and one for unvoiced. After that, three feed-forward neural networks are trained, with each network representing one cluster. To make the final decision on the class of a given speech segment, the membership degrees of the segment in all clusters, along with the neural networks' decisions, are passed to a defuzzification step, which outputs the class of that segment. The proposed FN-AVAD was tested on the public multimodal emotion database Surrey Audio-Visual Expressed Emotion (SAVEE), and the error rate was 2.08%. The achieved results are comparable to those of currently published works in the literature.
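The four per-segment features named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frame_features`, the use of lag-1 normalized autocorrelation, the base-10 logarithm, and the small constant `eps` are all assumptions, since the abstract does not state these details.

```python
import math


def frame_features(frame, eps=1e-10):
    """Return (short-term energy, zero-crossing rate,
    lag-1 autocorrelation, log energy) for one speech segment.

    `frame` is a sequence of at least two audio samples.
    The lag, log base, and eps are illustrative assumptions.
    """
    n = len(frame)
    sum_sq = sum(x * x for x in frame)

    # Short-term energy: mean squared amplitude of the segment.
    energy = sum_sq / n

    # Zero-crossing rate: fraction of consecutive sample pairs
    # whose signs differ.
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (n - 1)

    # Autocorrelation at lag 1, normalized by the segment energy sum.
    autocorr = sum(a * b for a, b in zip(frame, frame[1:])) / (sum_sq + eps)

    # Log energy; eps avoids log(0) on silent segments.
    log_energy = math.log10(energy + eps)

    return energy, zcr, autocorr, log_energy
```

A slowly varying segment gives a low zero-crossing rate and a lag-1 autocorrelation near 1, while a rapidly alternating segment gives a high zero-crossing rate and a negative autocorrelation, which is why such features are useful inputs to a clustering step.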
Background: Sprite coding is a very effective technique for clarifying the background video object. Sprite generation remains an open issue because foreground objects impair the precision of camera motion estimation and blur the created sprite. Objective: In this paper, a quick and basic static method for sprite area detection in video data is presented. Two statistical measures are applied, the mean and the standard deviation of every pixel (over the whole group of video frames), to determine whether the pixel is part of the selected static sprite range or not. A binary map array is built in which sprite pixels are marked as 1 and non-sprite pixels as 0. Likewise, a hole and gap filling strategy was utilized to re
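The per-pixel statistics and binary map described in the abstract above might look like the following sketch. The decision rule (a pixel is treated as static sprite when its standard deviation over the frames is at or below `std_threshold`), the threshold value, and the function name `static_sprite_map` are assumptions made for illustration; the abstract does not say how the mean and standard deviation are combined.

```python
import statistics


def static_sprite_map(frames, std_threshold=5.0):
    """Build a binary sprite map from a group of grayscale frames.

    `frames` is a list of equally sized 2-D lists of pixel intensities.
    Returns (binary_map, mean_map): binary_map holds 1 for pixels
    assumed static (low temporal deviation) and 0 otherwise;
    mean_map holds each pixel's mean intensity over the frames.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    binary_map = [[0] * cols for _ in range(rows)]
    mean_map = [[0.0] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            # Collect this pixel's intensity across all frames.
            values = [frame[r][c] for frame in frames]
            mean_map[r][c] = statistics.fmean(values)
            # Population standard deviation over the frame group.
            std = statistics.pstdev(values)
            # Hypothetical rule: low deviation means static sprite.
            binary_map[r][c] = 1 if std <= std_threshold else 0

    return binary_map, mean_map
```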
A strong sign language recognition system can break down the barriers that separate hearing and speaking members of society from those who cannot speak. A novel fast recognition system with low computational cost for digital American Sign Language (ASL) is introduced in this research. Different image processing techniques are used to optimize and extract the shape of the fingers in each sign. The feature extraction stage includes determining the optimal threshold on a statistical basis, then recognizing the gap area in the zero sign and calculating the height of each finger in the other digits. The classification stage depends on the gap area in the zero sign and the number of opened fingers in the other signs as well as
Anomaly detection is still a difficult task. To address this problem, we propose to strengthen the DBSCAN algorithm by converting all data to the graph concept frame (CFG). As is well known, DBSCAN groups data points of the same kind into clusters, while points that fall outside a cluster are treated as noise or anomalies. DBSCAN can therefore detect abnormal points that lie farther than a set threshold from a cluster (extreme values). However, anomalies are not limited to such cases of unusual points far from a specific group: there is also a type of data that does not occur repeatedly but is considered abnormal relative to the known group. The analysis showed DBSCAN using the
As performers in a social world, we communicate with other people by sharing information on many different levels. Each utterance includes linguistic information and conveys much information about the speaker’s identity. Variation in voice quality indexes information about the speaker and marks the speaker’s identity as a unique individual. The present study aims to validate the belief that each individual has an inalienable voice print that can’t be imitated. The study verifies that, however similar two individuals are in personality or however close their positions in society, they still differ in voice quality. The acoustic analysis is performed by analysing the following acoustic parameters: the fundamental frequency, amplitude, inten
The purpose of this research is to combine fuzzy and probabilistic constraints. It also aims to discuss the most common cases of fuzzy programming problems, namely when the fuzzy problem follows the membership function in one case and the triangular membership function in another, through practical and experimental application. In addition, fuzzy linear programming is employed to address production planning and scheduling problems at the Iraq Furniture Manufacturing Company, and quantitative methods were used to forecast demand and adopt it
The concept of fuzzy orbit open sets under the mapping
In this research, we introduce fibrewise fuzzy versions of the most important separation axioms in ordinary fuzzy topology, namely fibrewise fuzzy T 0 spaces, T 1 spaces, R 0 spaces, Hausdorff spaces, functionally Hausdorff spaces, regular spaces, completely regular spaces, and normal spaces. We also give numerous results about them.
In this research, a multi-objective transportation problem with mixed constraints was formulated in order to find the optimal solution. A fuzzy approach to the multi-objective transportation problem (MOTP) was applied. There are three objectives, each to be minimized: the transportation cost, the administrative cost, and the cost of the goods. Three membership functions were used: the linear, the exponential, and the hyperbolic membership function. The proposed model was applied at the General Company for the manufacture of grain to minimize the transportation cost and to find the best plan for transferring the product subject to the restrictions imposed on the model.
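The three membership functions named above can be sketched for a single cost objective to be minimized. The forms below are the ones commonly used in the fuzzy transportation literature and are assumptions here, since the abstract does not give its formulas; `lower` and `upper` stand for the best and worst attainable values of the objective, and the shape parameter `s` and the factor 6 in the hyperbolic form are likewise illustrative choices.

```python
import math


def linear_membership(z, lower, upper):
    """Linear membership: 1 at or below `lower`, 0 at or above `upper`."""
    if z <= lower:
        return 1.0
    if z >= upper:
        return 0.0
    return (upper - z) / (upper - lower)


def exponential_membership(z, lower, upper, s=1.0):
    """Exponential membership with nonzero shape parameter `s` (assumed)."""
    if z <= lower:
        return 1.0
    if z >= upper:
        return 0.0
    # Normalized distance of z from the best value, in (0, 1).
    psi = (z - lower) / (upper - lower)
    return (math.exp(-s * psi) - math.exp(-s)) / (1.0 - math.exp(-s))


def hyperbolic_membership(z, lower, upper):
    """Hyperbolic membership centered on the midpoint of [lower, upper]."""
    alpha = 6.0 / (upper - lower)
    midpoint = (upper + lower) / 2.0
    return 0.5 * math.tanh((midpoint - z) * alpha) + 0.5
```

For example, with `lower=100` and `upper=200`, a cost of 150 has linear and hyperbolic membership 0.5, while the exponential form with `s=1.0` gives a lower degree, reflecting a stricter preference for costs near the best value.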