Interspeech 2017

At the Interspeech 2017 conference, held August 20-24, 2017 in Stockholm, Sweden, the MALORCA team presented its first research outcomes on semi-supervised learning. The paper, titled “Semi-supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control”, was given as an oral presentation during the Acoustic Model Adaptation session. A copy of the paper and the presentation can be viewed and downloaded from the Idiap publication website.

As mentioned above, the scientific work performed during the first three months of 2017 concentrated on preliminary R&D toward semi-supervised learning for Automatic Speech Recognition (ASR) in the Air Traffic Control (ATC) context. The paper first identifies the challenges in building ASR systems for specific ATC areas and proposes using out-of-domain data (i.e. large, manually transcribed speech corpora recorded in different environments) to build baseline ASR models. It further proposes exploiting the large amounts of untranscribed speech data that are usually available in the ATC domain: because the ATC service is continuous, the amount of such data keeps growing.
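One common way to make use of untranscribed speech is to let the baseline recognizer transcribe it automatically and keep only the utterances it is confident about as additional adaptation data. The Python sketch below illustrates that general idea under stated assumptions; it is not the selection method from the paper. The `Hypothesis` type, the `select_for_adaptation` function, the utterance-level confidence score, and the 0.9 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """Output of a (hypothetical) baseline ASR system for one utterance."""
    utt_id: str
    text: str          # automatic transcript, not manually verified
    confidence: float  # assumed utterance-level confidence in [0, 1]


def select_for_adaptation(hyps: List[Hypothesis],
                          threshold: float = 0.9) -> List[Hypothesis]:
    """Keep utterances whose confidence is at or above the threshold.

    The selected automatic transcripts would then be treated as if they
    were reference transcripts when adapting the baseline models.
    """
    return [h for h in hyps if h.confidence >= threshold]


# Made-up example data, for illustration only.
decoded = [
    Hypothesis("utt1", "lufthansa one two three descend flight level eight zero", 0.95),
    Hypothesis("utt2", "austrian four five six turn left heading", 0.62),
    Hypothesis("utt3", "speedbird seven reduce speed two two zero knots", 0.91),
]
selected = select_for_adaptation(decoded)
selected_ids = [h.utt_id for h in selected]
```

The threshold trades off the quantity of adaptation data against the risk of training on recognition errors, which is why the choice of selection criterion matters in this setting.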

The paper explores different methods of data selection for adapting the baseline acoustic and language models by exploiting the continuously increasing untranscribed data. Finally, a basic approach is developed that can exploit semantic representations of ATC commands. The work is evaluated on a limited amount of transcribed data available from Vienna ATC. Significant improvements in both word error rate (23.5% relative) and concept error rate (7% relative) are achieved when adapting ASR models to different ATC conditions in a semi-supervised manner.
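For readers unfamiliar with the metric, the sketch below shows how word error rate and a relative reduction are conventionally computed. The function names are hypothetical, and the absolute error rates in the example (10.0% and 7.65%) are made-up values chosen only so that the relative reduction comes out at 23.5%; they are not figures from the paper.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def relative_reduction(baseline: float, adapted: float) -> float:
    """Relative improvement of an error rate over the baseline."""
    return (baseline - adapted) / baseline


# One deleted word out of five reference words.
wer = word_error_rate("descend flight level eight zero",
                      "descend level eight zero")

# Illustrative error rates only, not results from the paper.
rel = relative_reduction(baseline=10.0, adapted=7.65)
```

Concept error rate is typically computed in an analogous way over the extracted semantic concepts (for example, ATC commands) rather than over individual words.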

Ongoing work focuses on using additional untranscribed data for utterance selection, which is assumed to improve performance. Further significant improvements are expected from integrating additional semantic information and other modalities, such as radar data, to develop improved data selection methods for semi-supervised learning.