amazon-web-services amazon-s3 alexa-skills-kit

Alexa skill Invalid SSML Output Speech when using presigned S3 url


I'm having what looks like a common problem. I'm using the sample Python code that Amazon generates when you create an Alexa-hosted skill to create a presigned URL for an audio file in S3, like this:

url = utils.create_presigned_url("Media/file.mp3")

speak_output = f'<audio src="{url}"/>'

The generated URL is valid for 10 minutes and looks like this:

https://s3.eu-west-1.amazonaws.com/88516b63-75bc-4c6a-a6cf-d12860d50b4e-eu-west-1/Media/file.mp3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAYFCHQ4OU2ZBBLLU4%2F20200926%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20200926T082211Z&X-Amz-Expires=600&X-Amz-SignedHeaders=host&X-Amz-Security-Token=IQoJb3JpZ2luX2VjENH%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCWV1LXdlc3QtMSJIMEYCIQDfV53BV52momhRGAQkPJVym%2FN%2FrzPMYmCT8SOBke6CKgIhAI4AGnWd9gpzMeQb%2B1nhQ4VusLLTzs2ZNZFcMpdxaSRPKuwBCOn%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEQABoMNTYwNjQzMjM2Nzc3IgzP4Hvvs9jrXQGNqCMqwAEysuGmMv8JVDGN4gUip2JPlmM7JxiEURGyhQxB9KGycK9y5hQQ6U%2BNd85W%2FZpzFksrzGagC8jJXPgJGSl0eTN1EF%2BOq%2Bk504K%2FfaSKAQr06G7Lb87CKrDFSssoUZGcztQ3oixft9Y3FMCPzd5%2FoO9AG%2BLoK2HzG2YEQyEq05aUqsNDQZG13znFpv80uLELBQc35Oy888k9PYC2tJBvMnuFpMOWr%2FZiGvZKQm8OnpYElmyHjAvY81%2Fl6CxXcuTFwRowq%2FG7%2BwU63wHmb0KR%2BPtPngH3cWgdzkBobNewKpXvczWurUPR9e73SJEjoHPS7VIkGGx6DkWXyUZIB8eOMuy0rZdnsR7MUPc6%2FMjRiqqbboLxqN%2B7vmls0tycCSe0INH%2Bo1Kkb2VZwU5kWb%2B3ZAHtzp4xERw2NQXvgVkK5gSoWbt7SGMJfjkp5cSbkxlI1aJxisgdWKWokZiU3GyqrlqVuNpRqom7ZZsuLTNOWTzUqQSx92F3ZNefJE17Jr%2Fyy6PSqA7DHRd%2Bv6Jiyuk35GSDdEeJClCofkByxnbu3da%2B8Qtnm%2FtDsUMV&X-Amz-Signature=16fa135dadc7cfa83f9bce10ffcad0e0aff7617414d88fae4b068af5cc2ebd73

So the final SSML returned by the function looks like:

{
    "body": {
        "version": "1.0",
        "sessionAttributes": {},
        "userAgent": "ask-python/1.11.0 Python/3.7.9",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                "ssml": "<speak><audio src=\"superlongurl\"/></speak>"
            }
        }
    }
}

This looks correct to me, but Alexa responds with:

"request": {
        "type": "SessionEndedRequest",
        "requestId": "amzn1.echo-api.request.effc325d-f969-4f3b-aa0e-d339c65ec349",
        "timestamp": "2020-09-26T08:22:12Z",
        "locale": "es-ES",
        "reason": "ERROR",
        "error": {
            "type": "INVALID_RESPONSE",
            "message": "Invalid SSML Output Speech for requestId amzn1.echo-api.request.43ec04ea-4a0c-4971-bb28-71de95141fd6. Error: Fatal error occurred when processing SSML content. This usually happens when the SSML is not well formed. Error: Unexpected character '=' (code 61); expected a semi-colon after the reference for entity 'X-Amz-Credential'\n at [row,col {unknown-source}]: [1,168]"
        }
    }

I can't figure out why. I already read this question, but its solution is to make the object public in your own bucket instead of using the bucket provided by the Alexa-hosted skill, which doesn't look right to me.

I also tried this one, which escapes the characters with cgi (I used the html package instead), but then the response I get is:

"request": {
        "type": "SessionEndedRequest",
        "requestId": "amzn1.echo-api.request.b25a36d3-be89-4952-b688-ae2dc9b7710c",
        "timestamp": "2020-09-26T08:28:23Z",
        "locale": "es-ES",
        "reason": "ERROR",
        "error": {
            "type": "INVALID_RESPONSE",
            "message": "Invalid Audio Content for requestId amzn1.echo-api.request.5fac838c-0bd1-4704-bdaf-3cbf42d9c766. Error: The audio is not of a supported MPEG version"
        }
    }

This doesn't make sense, as the first URL works fine. If I try to open the escaped URL myself, access is refused with a complaint about missing fields (the query parameters).
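
For context on the two errors: the first message complains about the raw `&` characters in the query string, which an XML parser reads as the start of an entity reference. Escaping them as `&amp;` makes the SSML well formed, and the escaped form is only meant to live inside the XML attribute, so it is not expected to work when opened directly. A minimal sketch of the round trip, using only the Python standard library; the URL is a shortened hypothetical stand-in and both helper names are made up for illustration. It shows what a generic XML parser recovers, which is presumably what Alexa's SSML parser does as well:

```python
import html
import xml.etree.ElementTree as ET


def build_audio_ssml(url: str) -> str:
    """Return an SSML <audio> tag with the URL escaped for an XML attribute."""
    # html.escape turns & into &amp; (and also escapes <, > and quotes).
    return f'<audio src="{html.escape(url, quote=True)}"/>'


def src_seen_by_parser(audio_tag: str) -> str:
    """Wrap the tag in <speak>, parse it, and return the src the parser recovers."""
    root = ET.fromstring(f"<speak>{audio_tag}</speak>")
    return root.find("audio").attrib["src"]


# Hypothetical shortened URL; a real presigned URL carries more query parameters.
raw_url = (
    "https://example.com/Media/file.mp3"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=abc"
)

speak_output = build_audio_ssml(raw_url)
print(speak_output)
# The parser decodes &amp; back to &, so it sees the original URL again.
print(src_seen_by_parser(speak_output) == raw_url)
```

If the escaped SSML parses cleanly, a follow-up "Invalid Audio Content" error is about the file itself rather than the URL.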

Does anyone have a real solution for this?

Thank you very much!


Solution

  • After talking with support several times and checking a lot of things, the problem turned out to be the bitrate and the sample rate of the MP3 file. All the requirements for audio files are listed here:

    https://developer.amazon.com/en-US/docs/alexa/custom-skills/speech-synthesis-markup-language-ssml-reference.html#audio
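
    One way to re-encode a file so that it meets those requirements is ffmpeg. The sketch below assumes ffmpeg is installed and on the PATH, and it assumes the target values of 48 kbps and 24000 Hz, which is how I read the linked reference; check the page for the currently accepted values. The function names and file names are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src, dst, bitrate="48k", sample_rate=24000):
    """Build an ffmpeg command line that re-encodes src as an MP3 at dst."""
    return [
        "ffmpeg", "-y",            # overwrite dst if it already exists
        "-i", src,                 # input file
        "-ac", "2",                # two audio channels
        "-codec:a", "libmp3lame",  # MP3 encoder
        "-b:a", bitrate,           # assumed target bitrate (48 kbps)
        "-ar", str(sample_rate),   # assumed target sample rate (24000 Hz)
        "-write_xing", "0",        # do not write a Xing header
        dst,
    ]


def convert_for_alexa(src, dst):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


# Example with hypothetical file names:
# convert_for_alexa("file_original.mp3", "file.mp3")
```

    The converted file then replaces the original in the Media folder, and the presigned URL code stays unchanged.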