
Does AWS Lambda/Firehose support Base64 URL decoding?


My pipeline is as follows:

Firehose -> Lambda (AWS' Java SDK) -> (S3 & Redshift)

An un-encoded (raw) JSON record is submitted to Firehose. It then triggers a Lambda function which transforms it slightly. Firehose then puts the transformed record into an S3 bucket and into Redshift.

Firehose requires the data returned by the Lambda function to be Base64-encoded; it decodes the data itself before writing it to S3.
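As a rough illustration of that encoding step, here is a minimal sketch using only the JDK. It assumes the record data reaches the function as a Base64 string and must be returned as one; the class name `RecordTransformSketch`, the `process` method, and the `transform` placeholder are hypothetical, not part of any AWS API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class RecordTransformSketch {

    // Hypothetical stand-in for the "slight" transformation; here it only trims whitespace.
    static String transform(String json) {
        return json.trim();
    }

    // Decode the incoming Base64 payload, transform the JSON text,
    // and re-encode it with the standard (non-URL) Base64 encoder.
    static String process(String base64Data) {
        byte[] raw = Base64.getDecoder().decode(base64Data);
        String json = new String(raw, StandardCharsets.UTF_8);
        String transformed = transform(json);
        return Base64.getEncoder()
                .encodeToString(transformed.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String incoming = Base64.getEncoder()
                .encodeToString(" {\"k\":\"a=b\"} ".getBytes(StandardCharsets.UTF_8));
        String outgoing = process(incoming);
        // Decode again to inspect what would be written downstream.
        System.out.println(new String(Base64.getDecoder().decode(outgoing), StandardCharsets.UTF_8));
    }
}
```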

However, the data contains a URL, and in the decoded output its = characters are replaced with the equivalent Unicode escape (\u003d). I assume this happens because = is the character that Amazon's Base64 decoder uses as padding. For example:

https://www.[snipped].com/...?returnurl\u003dnull\u0026referrer\u003dnull

How can I retain those = characters within the decoded data?

Note: I've tried using Base64.getUrlEncoder(), but AWS only seems to support Base64.getEncoder().


Solution

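  • It turns out that the JSON library (Gson) that I was using to (de)serialize my Lambda record had HTML escaping enabled, which is Gson's default. That escaping, not the Base64 step, is what rewrote = and & as \u003d and \u0026. To fix it, I just had to disable HTML escaping: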

    // By default Gson writes =, &, <, > and ' as \uXXXX escapes; this turns that off.
    Gson gson = new GsonBuilder().disableHtmlEscaping().create();
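To check that Base64 itself leaves = and & untouched, a small JDK-only round trip can help. This is a sketch, not part of the original fix: the class name `Base64RoundTrip` and the example.com URL are placeholders chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64RoundTrip {

    // Encode text with the standard Base64 encoder, then decode it again.
    static String roundTrip(String text) {
        String encoded = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder URL containing the characters in question.
        String url = "https://www.example.com/?returnurl=null&referrer=null";
        System.out.println(roundTrip(url).equals(url)); // prints true
    }
}
```

Because the round trip returns the input unchanged, any \u003d in the output has to come from the JSON serialization step rather than from the encoder or decoder.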