actions-on-google google-home

Actions SDK: Handling unsupported utterance whilst playing media


Question

Is it possible to resume the content of a media object once a user has tried to interact with a Google Home device during the playing of that media?

The problem

Say you have started playing an mp3 file using conv.ask. Your call to conv.ask will look something along the lines of this:

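const { MediaObject, Suggestions } = require('actions-on-google');

// Inside an intent handler: play a short SSML clip, then the media object,
// and offer suggestion chips.
conv.ask(`<speak><audio src="${someUrl1}"><desc>${someDescription}</desc></audio></speak>`)
    .add(new MediaObject({
        url: someUrl2
    }))
    .add(new Suggestions(['suggestion1', 'suggestion2']));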

This plays fine. But suppose the user then says something like 'Ok Google, Gobbledygoop'. You might want to tell the user that their request was nonsensical, and then continue playing the media in the media object from where it left off.

What I have tried already

app.fallback(): This does not seem to be compatible with the Actions SDK. I could not find any circumstance in which the callback provided to app.fallback gets called.

Providing conv.ask with null/empty string responses: This was a desperate attempt to see what would happen if you provided nothing to conv.ask. The hope was that it would see the empty response and just keep playing the media.


Solution

  • There is nothing that is part of Actions on Google itself that will do this for you.

    The best you can probably do requires a lot of effort on your part:

    • When you reply to the user with the Media response, you can store the time of that reply as part of the session state (or in a context).

    • If you get another request before the Media Status event, you can take the difference between that stored time and the time of the new request. This will roughly be how long the audio had been playing.

    • You can then return a URL for audio that starts at this point in the audio. However, the Assistant does not handle the audio offset for you, so the server that hosts the audio has to support it as well.
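    The steps above can be sketched roughly as follows. This is a minimal illustration, not a tested implementation: the helper names, the `mediaStartedAt` and `mediaOffset` fields, and the `start` query parameter are all hypothetical, and the `data` argument stands in for session storage such as `conv.data`. Whether a `start` parameter works depends entirely on what your audio server supports.

```javascript
// Hypothetical helpers; `data` stands in for session storage (e.g. conv.data).

// Record when the media response was sent, and which offset (in seconds)
// the audio was started from. Call this when you reply with the media.
function markMediaStarted(data, nowMs, offsetSeconds = 0) {
  data.mediaStartedAt = nowMs;
  data.mediaOffset = offsetSeconds;
  return data;
}

// Rough playback position: the starting offset plus the time elapsed since
// the media response was sent. Returns 0 if nothing was recorded.
function elapsedSeconds(data, nowMs) {
  if (typeof data.mediaStartedAt !== 'number') return 0;
  const sinceStart = Math.floor((nowMs - data.mediaStartedAt) / 1000);
  return (data.mediaOffset || 0) + Math.max(0, sinceStart);
}

// ASSUMPTION: the audio server honours a `start` query parameter (seconds).
// The Assistant itself does not seek; the server must serve trimmed audio.
function buildResumeUrl(baseUrl, offsetSeconds) {
  const url = new URL(baseUrl);
  url.searchParams.set('start', String(offsetSeconds));
  return url.toString();
}
```

    In a handler you would call `markMediaStarted(conv.data, Date.now())` when sending the media, and on the next request pass `elapsedSeconds(conv.data, Date.now())` to `buildResumeUrl`. The estimate is only approximate, since it ignores the time taken by any speech that precedes the media and by the interruption itself.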

    As for the two things you attempted: app.fallback() should have worked to handle any intent that you had set to go to your webhook and that didn't have any other handler defined, but it still wouldn't be able to just "resume" the audio. conv.ask() requires you to ask something, so null replies aren't allowed.

    In this case, you at least want to tell the user that what they said made no sense... before resuming the audio.
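    A fallback handler along those lines might look something like the sketch below. It is only an outline: `makeFallbackHandler` and `buildMedia` are hypothetical names, and `buildMedia` is assumed to return whatever media response you want to resume with. The handler only relies on `conv.ask()`, so it can be tried out with a stub object.

```javascript
// Sketch only. `buildMedia` is a hypothetical factory that returns the media
// response to resume with, e.g. (conv) => new MediaObject({ url: resumeUrl }).
function makeFallbackHandler(buildMedia) {
  return (conv) => {
    // conv.ask() needs real content, so tell the user what happened first.
    conv.ask("Sorry, I didn't understand that. Resuming your audio.");
    // Then send the media again (ideally from a resume URL).
    conv.ask(buildMedia(conv));
  };
}

// With the actions-on-google library this could be registered as:
//   app.fallback(makeFallbackHandler(
//     (conv) => new MediaObject({ url: someUrl2 })));
// Suggestion chips, as in the question's snippet, will likely be needed too.
```

    Keeping the media construction in a separate function means the "apologise, then resume" logic stays the same whether the audio restarts from the beginning or from an offset.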