Assistant Based Speech Recognition (ABSR)

Assistant Based Speech Recognition (ABSR) combines an assistant system (e.g. an Arrival Manager) with a speech recognition system. Validation trials in AcListant® showed that command recognition rates of 91.6% were achieved with ABSR. Without ABSR, recognition rates of only 58% to 83% were observed. The following figure illustrates this combination (taken from H. Helmke et al., “Assistant-Based Speech Recognition for ATM Applications”, 11th FAA/Eurocontrol ATM Seminar, Lisbon, Portugal, 23-26 June 2015).

Components of an Assistant Based Speech Recognition system

The output of the assistant system, i.e. the context (e.g. aircraft sequences, distance-to-go, minimal separation, aircraft state vectors), is used by the “Hypotheses Generator” component. The “Hypotheses Generator” does not know exactly which commands the controller will give next, but it knows which of the possible commands are more probable than others in the current situation.
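To make the idea of context-dependent hypotheses concrete, here is a minimal sketch in Python. It is not the AcListant® implementation: the `AircraftState` fields, the single "arrivals descend" rule, the altitude limits, and the weighting formula are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftState:
    # Hypothetical subset of an aircraft state vector.
    callsign: str
    altitude_ft: int


def generate_hypotheses(states, step_ft=1000, min_altitude_ft=2000):
    """Return {(callsign, command_type, value): weight} for plausible commands.

    Assumed toy rule: an arriving aircraft is only expected to descend, and
    small altitude changes are weighted higher than large ones.
    """
    hypotheses = {}
    for state in states:
        # Candidate target altitudes strictly below the current altitude.
        for target in range(min_altitude_ft, state.altitude_ft, step_ft):
            # Number of whole altitude steps between current and target altitude.
            steps_away = (state.altitude_ft - target) // step_ft
            weight = 1.0 / (1 + steps_away)
            hypotheses[(state.callsign, "DESCEND", target)] = weight
    return hypotheses


hypotheses = generate_hypotheses([AircraftState("DLH496", 5000)])
```

With this toy rule, a descent of DLH496 to 3000 ft is among the hypotheses, while a climb to 6000 ft is not. A real generator would draw on the full context listed above (sequence, distance-to-go, separation) rather than altitude alone.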

These hypotheses are passed to the “Automatic Speech Recognition” block, which itself consists of the following components: the “Speech Recorder”, the “Lattice Generator”, the “Speech Recognizer”, and the “Command Extractor”. The output of the “Speech Recognizer” component might be, for example, “Lufthansa four nine six thank you normal speed however maintain one seven zero knots or greater to six miles final descend altitude tree thousand…” We are not interested in every single word of this utterance. We need only the relevant concepts: here the callsign, the speed instruction, and the descent instruction. Extracting them is the task of the “Command Extractor”, which turns the above example into the command sequence “DLH496 SPEED_OR_ABOVE 170, DLH496 DESCEND 3000 ALT”. The “Command Extractor” also assigns a plausibility value to each extracted command.
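The extraction step can be illustrated with a small keyword-based sketch in Python. This is an assumption-laden toy, not the project's extractor: the word tables, the two hard-coded phrase patterns, and the function names are hypothetical, and the plausibility values mentioned above are left out.

```python
# Spoken digit words, including the radiotelephony variants "tree" and "niner".
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "tree": "3",
    "four": "4", "five": "5", "fife": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9", "niner": "9",
}
# Hypothetical lookup from spoken airline name to ICAO designator.
AIRLINES = {"lufthansa": "DLH"}


def read_digits(tokens, start):
    """Consume consecutive digit words; return (digit string, next index)."""
    digits = []
    i = start
    while i < len(tokens) and tokens[i] in DIGITS:
        digits.append(DIGITS[tokens[i]])
        i += 1
    return "".join(digits), i


def extract_commands(utterance):
    """Extract commands from a recognized word sequence, ignoring filler words."""
    tokens = utterance.lower().split()
    callsign = None
    commands = []
    i = 0
    while i < len(tokens):
        word = tokens[i]
        if word in AIRLINES:
            number, nxt = read_digits(tokens, i + 1)
            if number:
                callsign = AIRLINES[word] + number
            i = nxt
            continue
        if callsign and word == "maintain":
            value, nxt = read_digits(tokens, i + 1)
            if value and tokens[nxt:nxt + 3] == ["knots", "or", "greater"]:
                commands.append(f"{callsign} SPEED_OR_ABOVE {value}")
                i = nxt + 3
                continue
        if callsign and word == "descend" and tokens[i + 1:i + 2] == ["altitude"]:
            value, nxt = read_digits(tokens, i + 2)
            if value and tokens[nxt:nxt + 1] == ["thousand"]:
                commands.append(f"{callsign} DESCEND {int(value) * 1000} ALT")
                i = nxt + 1
                continue
        i += 1
    return commands


utterance = ("Lufthansa four nine six thank you normal speed however maintain "
             "one seven zero knots or greater to six miles final "
             "descend altitude tree thousand")
commands = extract_commands(utterance)
```

For the example utterance, `commands` is `["DLH496 SPEED_OR_ABOVE 170", "DLH496 DESCEND 3000 ALT"]`. Filler such as “thank you” and the distance “six miles final” is skipped because neither matches a pattern.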

The extracted commands are sent back to the assistant system, namely to the “Plausibility Checker” component, which uses, among other things, the context knowledge, the plausibility values, and the command hypotheses to reject recognized commands. Commands which do not fit the current context are checked further against data from other sensors. Contradictory commands, such as a turn left and a turn right command for the same aircraft in the same utterance, are rejected immediately. The “Command Monitor” then tries to verify or falsify the recognized commands using subsequent radar data.
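The two checks described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tuple representation of commands, the command type names `TURN_LEFT` and `TURN_RIGHT`, and the three-way result are hypothetical, and the plausibility values and sensor checks are not modelled.

```python
TURN_TYPES = {"TURN_LEFT", "TURN_RIGHT"}  # hypothetical command type names


def check_plausibility(commands, hypotheses):
    """Classify the commands of one utterance.

    commands:   list of (callsign, command_type, value) tuples
    hypotheses: set of such tuples expected in the current context
    Returns (accepted, needs_further_check, rejected).
    """
    # Collect which turn directions were recognized for each aircraft.
    turns_by_callsign = {}
    for callsign, command_type, _ in commands:
        if command_type in TURN_TYPES:
            turns_by_callsign.setdefault(callsign, set()).add(command_type)
    # An aircraft told to turn both left and right has contradictory commands.
    conflicting = {cs for cs, turns in turns_by_callsign.items() if len(turns) == 2}

    accepted, needs_further_check, rejected = [], [], []
    for command in commands:
        callsign, command_type, _ = command
        if callsign in conflicting and command_type in TURN_TYPES:
            rejected.append(command)
        elif command in hypotheses:
            accepted.append(command)
        else:
            # Not expected in the current context: defer to other sensor data.
            needs_further_check.append(command)
    return accepted, needs_further_check, rejected


recognized = [
    ("DLH496", "TURN_LEFT", 90),
    ("DLH496", "TURN_RIGHT", 180),
    ("DLH496", "DESCEND", 3000),
    ("BAW12", "DESCEND", 9000),
]
expected = {("DLH496", "DESCEND", 3000), ("DLH496", "TURN_LEFT", 90)}
accepted, needs_further_check, rejected = check_plausibility(recognized, expected)
```

Here both turn commands for DLH496 are rejected even though one of them was among the hypotheses, the descent of DLH496 is accepted, and the BAW12 command is set aside for further checking.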

MALORCA Project

Machine Learning of Speech Recognition Models for Controller Assistance
