DOI: 10.3397/in_2023_0224 ISSN: 0736-2935

Estimation of Japanese word intelligibility by automatic speech recognition with noise adaptation

Masaki Hattori, Kazuhiro Kondo

The purpose of this study is to estimate subjective word intelligibility using DNN-based speech recognition. Evaluating intelligibility is important for guaranteeing high speech quality under ambient noise and reverberation. The Japanese Diagnostic Rhyme Test (JDRT) is one such subjective evaluation method. In this study, we estimated Japanese word intelligibility by simulating the JDRT with cloud-based speech recognition systems. Recent DNNs can improve recognition accuracy through their robust classification ability and efficient use of high-dimensional features. We built a DNN-based speech recognition system using cloud-based ASR APIs and adapted its model to several noise conditions through training to improve the accuracy of its intelligibility estimates. In this paper, we report comparisons with other objective and subjective evaluations and discuss ideas for a more generic objective evaluation method.
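As a rough illustration of what simulating a rhyme test with a recognizer could look like, the sketch below scores a set of two-alternative trials. It is not the authors' implementation. It assumes that each trial offers a target word and one rhyming alternative, that the recognizer's output is mapped to whichever of the two it resembles more closely, and that the score uses the chance-adjusted form commonly used for diagnostic rhyme tests, 100 × (right − wrong) / total. The function names, the string-similarity rule, and the romanized word pairs are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for any phonetic distance."""
    return SequenceMatcher(None, a, b).ratio()


def is_correct(target: str, alternative: str, recognized: str) -> bool:
    """Force a two-way choice from free-form recognizer output.

    Assumption: the trial counts as correct only when the recognized
    text is strictly closer to the target than to the alternative,
    so ties are scored as wrong.
    """
    return similarity(recognized, target) > similarity(recognized, alternative)


def chance_adjusted_score(trials) -> float:
    """Return 100 * (right - wrong) / total for (target, alternative, recognized) trials."""
    total = len(trials)
    if total == 0:
        raise ValueError("at least one trial is required")
    right = sum(1 for t, alt, rec in trials if is_correct(t, alt, rec))
    wrong = total - right
    return 100.0 * (right - wrong) / total


if __name__ == "__main__":
    # Hypothetical romanized word pairs differing in the initial consonant.
    example_trials = [
        ("zai", "sai", "zai"),  # recognized as the target
        ("sai", "zai", "zai"),  # recognized as the alternative
        ("kai", "gai", "kai"),
        ("gai", "kai", "gai"),
    ]
    print(chance_adjusted_score(example_trials))
```

With a two-way choice, a listener or recognizer that guesses at random gets about half the trials right, so the chance adjustment maps pure guessing to a score near 0 and perfect recognition to 100.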
