Robust Features in Deep Neural Networks for Transcoded Speech Recognition DSR and AMR NB

EasyChair Preprint 13167

6 pages • Date: May 2, 2024

Lallouani Bouchakour, Mohamed Debyeche and Ahmed Krobba

Abstract

Automatic speech recognition (ASR) performance in mobile communications degrades significantly when the environment contains many sources of variability, for example when the test environment differs from the training environment, or when the acoustic environment contains disturbances such as noise, channel distortion, speaker differences, and mobile codecs. In this work, we use two mobile network speech recognition architectures. The first is Distributed Speech Recognition (DSR), based on the DSR codec, and the second is based on the Adaptive Multi-Rate Narrow-Band (AMR-NB) codec. We propose a novel robust feature extraction (front-end) technique to improve speech recognition performance in noisy mobile communications. This technique uses Gabor features, Power Normalized Spectrum Gabor filter (PNS-Gabor) features, and Power Normalized Cepstral Coefficients (PNCC). These features account for psychoacoustic effects such as temporal masking and use different filter-bank distributions and filter shapes to better model human perception. In the back end, we investigate speech classification systems using Continuous Hidden Markov Models (CHMM) and Deep Neural Networks (DNN). In noisy mobile communications, the proposed PNS-Gabor and PNCC features show significant improvements over conventional acoustic features such as Mel-frequency cepstral coefficients (MFCC).
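One difference between MFCC-style and PNCC-style front ends is the compression applied to the filter-bank energies before the cepstral transform: MFCC takes a logarithm, while the standard PNCC formulation uses a power-law nonlinearity (commonly with exponent 1/15). The following is a minimal sketch of that single step only, not the authors' pipeline. It assumes the band energies are already computed and omits the gammatone filter bank, noise suppression, and temporal masking stages. The function names `dct_ii` and `cepstra` are hypothetical.

```python
import math


def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II of a sequence, keeping the first n_coeffs terms."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * m + 1) / (2 * n))
            for m, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def cepstra(band_energies, compression="log", n_coeffs=13, exponent=1.0 / 15.0):
    """Cepstral coefficients for one frame of filter-bank energies.

    compression="log"   : MFCC-style logarithmic compression
    compression="power" : PNCC-style power-law compression (assumed exponent 1/15)
    """
    # Floor the energies so the logarithm is defined for silent bands.
    floored = [max(e, 1e-10) for e in band_energies]
    if compression == "log":
        compressed = [math.log(e) for e in floored]
    elif compression == "power":
        compressed = [e ** exponent for e in floored]
    else:
        raise ValueError("compression must be 'log' or 'power'")
    return dct_ii(compressed, n_coeffs)


if __name__ == "__main__":
    # Illustrative (made-up) energies for a 26-band filter bank.
    frame = [1.0 + 0.5 * math.sin(0.3 * b) for b in range(26)]
    print(cepstra(frame, "log")[:3])
    print(cepstra(frame, "power")[:3])
```

The power-law output stays bounded as a band energy approaches zero, whereas the logarithm grows without bound in magnitude. This is one commonly cited reason for using power-law compression in noisy conditions.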

Keyphrases: AMR-NB, ASR, DNN, DSR, HMM, MFCC, PN-Gabor, PNCC

Links:

https://easychair.org/publications/preprint/frSK

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:13167,
  author    = {Lallouani Bouchakour and Mohamed Debyeche and Ahmed Krobba},
  title     = {Robust Features in Deep Neural Networks for Transcoded Speech Recognition DSR and AMR NB},
  howpublished = {EasyChair Preprint 13167},
  year      = {EasyChair, 2024}}
