Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis
Author:
Debnath, Saswati; Roy, Pinki
Date:
12/2021
Journal / publisher:
International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)
Item type:
article
URL:
https://www.ijimai.org/journal/bibcite/reference/3012
Abstract:
Audio-Visual Automatic Speech Recognition (AV-ASR) has become a promising research area for situations in which the audio signal is corrupted by noise. The main objective of this paper is to select important and discriminative audio and visual features for recognizing audio-visual speech. The paper proposes Pseudo Zernike Moments (PZM) and a feature selection method for audio-visual speech recognition. Visual information is captured from the lip contour, and moments are computed from it for lip reading. From the audio, 19th-order Mel Frequency Cepstral Coefficients (MFCC) are extracted as speech features. Because the 19 speech features are not all equally important, feature selection algorithms are used to select the most effective ones. Statistical tests, namely Analysis of Variance (ANOVA), the Kruskal-Wallis test, and the Friedman test, are used to assess the statistical significance of the speech features, after which the Incremental Feature Selection (IFS) technique selects the feature subset. Multiclass Support Vector Machine (SVM), Artificial Neural Network (ANN), and Naive Bayes (NB) classifiers are then used to recognize speech in both the audio and visual modalities. A combined decision is taken from the two individual recognition systems based on their recognition rates. The paper compares the results of the proposed model with those of existing models for both audio and visual speech recognition. Zernike Moments (ZM) are compared with PZM, and the comparison shows that the proposed PZM-based model extracts more discriminative features for visual speech recognition. The study also shows that audio feature selection using statistical analysis outperforms methods that use no feature selection.
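The audio feature selection step described in the abstract (rank the 19 features by a statistical test, then grow the subset incrementally and keep the best-scoring prefix) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: the data are synthetic stand-ins for MFCC vectors, only one-way ANOVA is used for ranking, and a simple nearest-centroid classifier replaces the SVM, ANN, and NB classifiers named in the abstract. All function names (`anova_f`, `centroid_accuracy`, `incremental_feature_selection`, `make_data`) are hypothetical.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def centroid_accuracy(train, valid, cols):
    """Accuracy of a nearest-centroid classifier restricted to columns `cols`.

    Assumption: this classifier is only a lightweight stand-in for the
    SVM / ANN / NB classifiers mentioned in the abstract.
    """
    X_tr, y_tr = train
    X_va, y_va = valid
    centroids = {}
    for c in set(y_tr):
        rows = [x for x, y in zip(X_tr, y_tr) if y == c]
        centroids[c] = [mean(r[j] for r in rows) for j in cols]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )

    return sum(predict(x) == y for x, y in zip(X_va, y_va)) / len(y_va)


def incremental_feature_selection(train, valid):
    """Rank features by ANOVA F, then keep the best-scoring ranked prefix."""
    X_tr, y_tr = train
    n_feat = len(X_tr[0])
    scores = [anova_f([x[j] for x in X_tr], y_tr) for j in range(n_feat)]
    ranking = sorted(range(n_feat), key=lambda j: scores[j], reverse=True)
    best_acc, best_subset = -1.0, []
    for k in range(1, n_feat + 1):
        acc = centroid_accuracy(train, valid, ranking[:k])
        if acc > best_acc:  # strict improvement keeps the smallest best subset
            best_acc, best_subset = acc, ranking[:k]
    return ranking, best_subset, best_acc


def make_data(n_per_class, rng):
    """Synthetic stand-in for 19 MFCC features; only features 0-2 are informative."""
    X, y = [], []
    for c in range(3):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(19)]
            for j in range(3):
                row[j] += 3.0 * c  # shift the class mean on informative features
            X.append(row)
            y.append(c)
    return X, y


rng = random.Random(0)
train, valid = make_data(40, rng), make_data(20, rng)
ranking, subset, acc = incremental_feature_selection(train, valid)
print("selected features:", subset, "validation accuracy:", round(acc, 3))
```

In this sketch the subset is chosen on a separate validation split. The Kruskal-Wallis or Friedman test could replace the F statistic as the ranking criterion without changing the incremental loop.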
Usage statistics
Year | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
Views | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 72 | 168 | 149 |
Downloads | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 89 | 165 | 245 |
Related items
Showing items related by title, author, or subject.
- PCHET: An efficient programmable cellular automata based hybrid encryption technique for multi-chat client-server applications
Roy, Satyabrata; Gupta, Rohit Kumar; Rawat, Umashankar; Dey, Nilanjan; González-Crespo, Rubén (Journal of Information Security and Applications, 12/2020)
This paper demonstrates an efficient programmable Cellular Automata (CA) based hybrid encryption technique (PCHET) for chatting applications involving multiple clients who can chat simultaneously with each other. The ...
- The Impact of COVID-19 Management Policies Tailored to Airborne SARS-CoV-2 Transmission: Policy Analysis
Telles, Charles Roberto; Roy, Archisman; Ajmal, Mohammad Rehan; Mustafa, Syed Khalid; Ahmad, Mohammad Ayaz; De la Serna Tuya, Juan Moisés (JMIR Public Health and Surveillance, 2021)
Background: Daily new COVID-19 cases from January to April 2020 demonstrate varying patterns of SARS-CoV-2 transmission across different geographical regions. Constant infection rates were observed in some countries, whereas ...
- The TELE-DD project on treatment nonadherence in the population with type 2 diabetes and comorbid depression
Roy, Juan Francisco; Lozano del Hoyo, Maria Luisa; Urcola-Pardo, Fernando; Monreal-Bartolome, Alicia; Gracia Ruiz, Diana Cecilia; Gómez Borao, María Mercedes; Artigas Alcazar, Ana Belen; Martinez Casbas, Jose Pedro; Aceituno Casas, Alexandra; Andaluz Funcia, María Teresa; Garcia-Campayo, Javier; Fernandez Rodrigo, María Teresa (Scientific Reports, 2021)
Diabetic patients have increased depression rates, diminished quality of life, and higher death rates due to depression comorbidity or diabetes complications. Treatment adherence (TA) and the maintenance of an adequate and ...