
    Modeling Sub-Band Information Through Discrete Wavelet Transform to Improve Intelligibility Assessment of Dysarthric Speech

    Authors: 
    Sahu, Laxmi Priya; Pradhan, Gayadhar; Singh, Jyoti Prakash
    Date: 
    12/2022
    Keywords: 
    approximation coefficient; cepstral coefficients; detail coefficient; dysarthria; signal; discrete wavelet transforms; IJIMAI
    Journal / publisher: 
    International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI)
    Item type: 
    article
    URI: 
    https://reunir.unir.net/handle/123456789/13933
    DOI: 
    https://doi.org/10.9781/ijimai.2022.10.003
    Web address: 
    https://ijimai.org/journal/bibcite/reference/3196
    Open Access
    Abstract:
    The speech signal within a sub-band varies at a fine level depending on the type and level of dysarthria. The Mel-frequency filterbank used to compute cepstral coefficients smooths out this fine-level information in the higher-frequency regions because the filters there have larger bandwidths. To capture the sub-band information, this paper first performs a four-level discrete wavelet transform (DWT) decomposition that splits the input speech signal into approximation and detail coefficients at each level. For a given input speech signal, five speech signals representing different sub-bands are then reconstructed using the inverse DWT (IDWT). The log filterbank energies are computed by analyzing the short-term discrete Fourier transform magnitude spectra of each reconstructed speech signal using a 30-channel Mel-filterbank. For each analysis frame, the log filterbank energies obtained across all reconstructed speech signals are pooled together, and a discrete cosine transform is applied to produce the cepstral feature, here termed the discrete wavelet transform reconstructed (DWTR) Mel-frequency cepstral coefficient (MFCC), or DWTR-MFCC. An i-vector based dysarthric level assessment system developed on the Universal Access speech corpus shows that the proposed DWTR-MFCC feature outperforms the conventional MFCC and several other cepstral features reported for a similar task. Using DWTR-MFCC improves the detection accuracy rate (DAR) of the dysarthric level assessment system in the text- and speaker-independent test case from the 56.646 % MFCC baseline to 60.094 %. Further analysis of the confusion matrices shows that the confusion among dysarthric classes differs considerably between the MFCC and DWTR-MFCC features. Motivated by this observation, a two-stage classification approach that exploits the discriminating power of both kinds of features is proposed to improve the overall performance of the developed dysarthric level assessment system. The two-stage classification scheme further improves the DAR to 65.813 % in the text- and speaker-independent test case.
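
The decomposition and reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the wavelet family, so the Haar wavelet is assumed here purely for simplicity, and the function names (`haar_dwt`, `haar_idwt`, `decompose`, `reconstruct_subbands`) are hypothetical. The sketch also assumes that the five reconstructed signals correspond to the level-4 approximation coefficients plus the four sets of detail coefficients, each passed through the inverse transform with all other coefficients set to zero.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One-level Haar DWT: returns (approximation, detail), each half as long as x."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds each pair of samples and interleaves them."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def decompose(x, levels=4):
    """Return [A_L, D_L, ..., D_1]; len(x) must be divisible by 2**levels."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


def reconstruct_subbands(x, levels=4):
    """Rebuild one full-length signal per coefficient set (assumed scheme).

    For each coefficient set, every other set is zeroed before the
    inverse transform is applied level by level.
    """
    coeffs = decompose(x, levels)
    subbands = []
    for keep in range(len(coeffs)):
        kept = [c if i == keep else [0.0] * len(c) for i, c in enumerate(coeffs)]
        signal = kept[0]
        for detail in kept[1:]:
            signal = haar_idwt(signal, detail)
        subbands.append(signal)
    return subbands


# Synthetic two-tone signal standing in for a speech waveform.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(64)]
subbands = reconstruct_subbands(signal, levels=4)
print(len(subbands), len(subbands[0]))
```

Because the transform is linear and perfectly invertible, the five reconstructed signals add up to the original signal, so no information is lost by the split. In the pipeline the abstract describes, each reconstructed signal would then be framed and passed through the 30-channel Mel-filterbank before the log energies are pooled and transformed with a discrete cosine transform.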
    Files in this item
    Name: ijimai7_7_6.pdf
    Size: 346.7 KB
    Format: application/pdf
    This item appears in the following collection(s)
    • vol. 7, no. 7, December 2022

    Usage statistics

    Year    Views   Downloads
    2012    0       0
    2013    0       0
    2014    0       0
    2015    0       0
    2016    0       0
    2017    0       0
    2018    0       0
    2019    0       0
    2020    0       0
    2021    0       0
    2022    13      9
    2023    73      71
    2024    69      55
    2025    98      23

    Related items

    Showing items related by title, author, or subject.

    • Infected Fruit Part Detection using K-Means Clustering Segmentation Technique 

      Dubey, Shiv Ram; Dixit, Pushkar; Singh, Nishant; Gupta, Jay Prakash (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2013)
      Nowadays, overseas commerce has increased drastically in many countries. Plenty fruits are imported from the other nations such as oranges, apples etc. Manual identification of defected fruit is very time consuming. This ...
    • Robust Lossless Semi Fragile Information Protection in Images 

      Dixit, Pushkar; Singh, Nishant; Prakash Gupta, Jay (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2014)
      Internet security finds it difficult to keep the information secure and to maintain the integrity of the data. Sending messages over the internet secretly is one of the major tasks as it is widely used for passing the ...
    • Analysis of Gait Pattern to Recognize the Human Activities 

      Prakash Gupta, Jay; Dixit, Pushkar; Singh, Nishant; Bhaskar Aemwal, Vijay (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 09/2014)
      Human activity recognition based on the computer vision is the process of labelling image sequences with action labels. Accurate systems for this problem are applied in areas such as visual surveillance, human computer ...
