Deep Neural Networks for Speech Enhancement in Complex-Noisy Environments
Author: Saleem, Nasir; Khattak, Muhammad Irfan
Date: 03/2020
Journal / publisher: International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)
Item type: article
Web address: https://www.ijimai.org/journal/bibcite/reference/2725
Abstract:
This paper addresses speech enhancement under realistic conditions in which several complex noise sources simultaneously degrade the quality and intelligibility of the target speech. The existing speech enhancement literature focuses principally on mixtures containing a single noise source. In real-world situations, however, the target speech is generally mixed with several complex stationary and nonstationary noise sources at once. We use deep neural networks (DNNs) for speech enhancement in such complex-noisy environments, treating estimation of the ideal binary mask (IBM) as a binary classification problem. The IBM serves as the target function during training, and the trained DNNs estimate the IBM during the enhancement stage. The estimated mask is then applied to the complex-noisy mixture to recover the target speech. The mean square error (MSE) is used as the objective cost function over the training epochs. Experimental results at different input signal-to-noise ratios (SNRs) show that DNN-based complex-noisy speech enhancement outperforms the competing methods in speech quality, as measured by perceptual evaluation of speech quality (PESQ), segmental signal-to-noise ratio (SNRSeg), log-likelihood ratio (LLR), and weighted spectral slope (WSS). Short-time objective intelligibility (STOI) scores likewise indicate better speech intelligibility.
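The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the time-frequency representation, the mask threshold, or the network design, so the code assumes magnitude spectrograms, a 0 dB local criterion (`lc_db`), and a hand-written array standing in for a DNN output. All function names are hypothetical.

```python
import numpy as np

def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0, eps=1e-12):
    """Return 1.0 where the local SNR (dB) exceeds lc_db, else 0.0.

    Assumption: inputs are magnitude spectrograms of the same shape and
    lc_db = 0 dB is used as the local criterion.
    """
    local_snr_db = 20.0 * np.log10((speech_mag + eps) / (noise_mag + eps))
    return (local_snr_db > lc_db).astype(np.float32)

def apply_mask(mask, mixture_mag):
    """Keep speech-dominated time-frequency units, zero out the rest."""
    return mask * mixture_mag

def mse(estimated, target):
    """Mean square error between an estimated mask and the target mask."""
    return float(np.mean((estimated - target) ** 2))

# Toy 2x2 "spectrograms" (rows: frequency bins, columns: frames).
speech = np.array([[4.0, 0.5], [1.0, 3.0]])
noise = np.array([[1.0, 2.0], [2.0, 1.0]])
mixture = speech + noise  # simplification: magnitudes are added directly

ibm = ideal_binary_mask(speech, noise)   # training target
enhanced = apply_mask(ibm, mixture)      # enhancement with the ideal mask

# Stand-in for a DNN's mask estimate (hypothetical values).
dnn_output = np.array([[0.9, 0.2], [0.1, 0.7]])
cost = mse(dnn_output, ibm)
```

During training, the cost would be minimized with respect to the network weights; at enhancement time, only the mixture is available and the network's estimate replaces `ibm`.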
Usage statistics
Year | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024
Views | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 124 | 236 | 243
Downloads | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 182 | 228 | 167
Related items
Showing items related by title, author, or subject.
- On improvement of speech intelligibility and quality: a survey of unsupervised single channel speech enhancement algorithms
  Saleem, Nasir; Khattak, Muhammad Irfan; Verdú, Elena (International Journal of Interactive Multimedia and Artificial Intelligence, 06/2020)
  Many forms of human communication exist; for instance, text and nonverbal based. Speech is, however, the most powerful and dexterous form for the humans. Speech signals enable humans to communicate and this usefulness of ...
- On Improvement of Speech Intelligibility and Quality: A Survey of Unsupervised Single Channel Speech Enhancement Algorithms
  Saleem, Nasir; Khattak, Muhammad Irfan; Verdú, Elena (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2020)
  Many forms of human communication exist; for instance, text and nonverbal based. Speech is, however, the most powerful and dexterous form for the humans. Speech signals enable humans to communicate and this usefulness of ...
- Automated Detection of COVID-19 using Chest X-Ray Images and CT Scans through Multilayer-Spatial Convolutional Neural Networks
  Khattak, Muhammad Irfan; Al-Hasan, Mu'ath; Jan, Atif; Saleem, Nasir; Verdú, Elena; Khurshid, Numan (International Journal of Interactive Multimedia and Artificial Intelligence, 2021)
  The novel coronavirus-2019 (Covid-19), a contagious disease became a pandemic and has caused overwhelming effects on the human lives and world economy. The detection of the contagious disease is vital to avert further ...