ERBM-SE: Extended Restricted Boltzmann Machine for Multi-Objective Single-Channel Speech Enhancement
Author:
Khattak, Muhammad Irfan; Saleem, Nasir; Nawaz, Aamir; Ahmed Almani, Aftab; Umer, Farhana; Verdú, Elena
Date:
06/2022
Journal / publisher:
International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)
Item type:
article
Web address:
https://www.ijimai.org/journal/bibcite/reference/3118
Abstract:
Supervised, machine learning-based single-channel speech enhancement has attracted considerably more research interest than conventional approaches. In this paper, an extended Restricted Boltzmann Machine (RBM) is proposed for spectral masking-based enhancement of noisy speech. In a conventional RBM, the acoustic features for the speech enhancement task are extracted layer by layer, and the feature compression may cause vital information to be lost during network training. To exploit the important information in the raw data, an extended RBM is proposed for acoustic feature representation and speech enhancement. In the proposed RBM, the acoustic features are progressively extracted by multiple stacked RBMs during the pre-training phase. The hidden acoustic features from the previous RBM are combined with the raw input data, and the combination serves as the new input to the present RBM. By adding the raw data to the RBMs, the layer-wise features related to the raw data are progressively extracted, which helps to mine valuable information in the raw data. Results on the TIMIT database show that the proposed method successfully attenuates the noise and improves speech quality and intelligibility. STOI, PESQ, and SDR improve by 16.86%, 25.01%, and 3.84 dB, respectively, over the unprocessed noisy speech.
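The input wiring described in the abstract, where each RBM after the first receives the previous RBM's hidden features together with the raw input, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the layer sizes, the concatenation order (hidden features first, raw data second), the random untrained weights, and all function names are assumptions made for the example, and the contrastive-divergence pre-training and the spectral-mask estimation are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_rbm(n_visible, n_hidden, rng):
    """Return small random weights and zero hidden biases (untrained, for illustration)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
               for _ in range(n_visible)]
    hidden_bias = [0.0] * n_hidden
    return weights, hidden_bias


def hidden_probs(visible, weights, hidden_bias):
    """Standard RBM hidden activation: P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W_ij)."""
    return [
        sigmoid(hidden_bias[j]
                + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


def build_extended_stack(n_raw, hidden_sizes, rng):
    """Size each RBM so that, after the first, it sees hidden features plus raw input."""
    stack = []
    n_visible = n_raw
    for n_hidden in hidden_sizes:
        stack.append(init_rbm(n_visible, n_hidden, rng))
        n_visible = n_hidden + n_raw  # assumed wiring: [hidden features, raw data]
    return stack


def extract_features(raw, stack):
    """Propagate one raw feature frame through the stack, re-injecting raw data each layer."""
    visible = list(raw)
    hidden = []
    for weights, hidden_bias in stack:
        hidden = hidden_probs(visible, weights, hidden_bias)
        visible = hidden + list(raw)  # input to the next RBM
    return hidden


rng = random.Random(0)
raw_frame = [rng.random() for _ in range(8)]  # hypothetical 8-dimensional acoustic frame
stack = build_extended_stack(n_raw=8, hidden_sizes=[6, 4, 3], rng=rng)
features = extract_features(raw_frame, stack)
print(len(features))  # 3
```

With these assumed sizes, the three RBMs have 8, 14, and 12 visible units, which makes the difference from a conventional stack visible: there, the second and third RBMs would see only the 6 and 4 hidden features of the layer before them.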
Related items
Showing items related by title, author, or subject.
- On Improvement of Speech Intelligibility and Quality: A Survey of Unsupervised Single Channel Speech Enhancement Algorithms
  Saleem, Nasir; Khattak, Muhammad Irfan; Verdú, Elena (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2020)
  Many forms of human communication exist; for instance, text and nonverbal based. Speech is, however, the most powerful and dexterous form for the humans. Speech signals enable humans to communicate and this usefulness of ...
- Automated Detection of COVID-19 using Chest X-Ray Images and CT Scans through Multilayer-Spatial Convolutional Neural Networks
  Khattak, Muhammad Irfan; Al-Hasan, Mu'ath; Jan, Atif; Saleem, Nasir; Verdú, Elena; Khurshid, Numan (International Journal of Interactive Multimedia and Artificial Intelligence, 2021)
  The novel coronavirus-2019 (Covid-19), a contagious disease became a pandemic and has caused overwhelming effects on the human lives and world economy. The detection of the contagious disease is vital to avert further ...