Tags: android, text-to-speech

Huh? Why doesn't playEarcon() produce onUtteranceCompleted()?


An Android book I have states that using TextToSpeech.playEarcon() is preferable to playing audio files (using MediaPlayer) because:

Instead of having to determine the opportune moment to play an audible cue and relying on callbacks to get the timing right, we can instead queue up our earcons among the text we send to the TTS engine. We then know that our earcons will be played at the appropriate time, and we can use the same pathway to get our sounds to the user, including the onUtteranceCompleted() callbacks to let us know where we are.
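
(For context: an earcon name has to be registered with the TTS engine before it can be queued. A minimal sketch for the experiment below, where R.raw.fancyring is a hypothetical sound resource bundled with the app:)

// Map the earcon name used below to an audio resource; R.raw.fancyring is
// a hypothetical raw resource, and getPackageName() is the app's own package.
tts.addEarcon("[fancyring]", getPackageName(), R.raw.fancyring);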

But my short and simple experiment with this shows that this isn't the case:

HashMap<String, String> params = new HashMap<String, String>();
params.put(TextToSpeech.Engine.KEY_PARAM_STREAM, String.valueOf(AudioManager.STREAM_MUSIC));

// Utterance 1: spoken text; its ID shows up in onUtteranceCompleted()
String utteranceId = String.valueOf(utteranceNum++);
params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, utteranceId);
tts.speak("FIRST part of sentence", TextToSpeech.QUEUE_ADD, params);

// Utterance 2: an earcon queued between the two speak() calls;
// it plays in order, but its ID never reaches onUtteranceCompleted()
utteranceId = String.valueOf(utteranceNum++);
params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, utteranceId);
tts.playEarcon("[fancyring]", TextToSpeech.QUEUE_ADD, params);

// Utterance 3: spoken text; its ID shows up again
utteranceId = String.valueOf(utteranceNum++);
params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, utteranceId);
tts.speak("SECOND part of sentence", TextToSpeech.QUEUE_ADD, params);

When I examine the logs from onUtteranceCompleted(), I see only the utteranceIds of the utterances played by tts.speak(), not that of the one played by tts.playEarcon().
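
For reference, this is the shape of the listener producing those logs (the log tag is arbitrary):

tts.setOnUtteranceCompletedListener(new TextToSpeech.OnUtteranceCompletedListener() {
    @Override
    public void onUtteranceCompleted(String utteranceId) {
        // Logs the IDs of the speak() utterances; the earcon's ID never shows up here
        Log.d("TTS", "completed utteranceId: " + utteranceId);
    }
});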

Why this discrepancy? Is there a workaround for it?

P.S. At the risk of stating the obvious: all three utterances play fine and in the right order. It is only onUtteranceCompleted() that, for some reason, isn't called for tts.playEarcon().


Solution

  • Answering myself. The incredibly long and very detailed documentation for TextToSpeech.OnUtteranceCompletedListener reads (emphasis mine):

    Called when an utterance has been *synthesized*.

    An earcon is never the result of synthesis, so of course onUtteranceCompleted() will never be called for it. This is by design. (A possible workaround is sketched after this answer.)

    Which brings us to a new question: if there is no advantage to earcons over playing .mp3 files (using MediaPlayer), why use earcons at all?
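
    As for the workaround asked about in the question: since speak() reliably fires the callback, one option is to queue a marker utterance directly behind the earcon and treat its completion as the earcon's. A sketch under that assumption, reusing the params map from above ("EARCON_DONE" is a made-up ID; a single space is spoken so the engine has something to synthesize):

    // Queue the earcon (no callback is expected for this one)...
    tts.playEarcon("[fancyring]", TextToSpeech.QUEUE_ADD, params);

    // ...then a near-silent marker utterance right behind it. When
    // onUtteranceCompleted() reports "EARCON_DONE", the earcon has finished.
    params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "EARCON_DONE");
    tts.speak(" ", TextToSpeech.QUEUE_ADD, params);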