Kinesis Lambda Source Transformation: data attribute not base64 encoded


I have a log stream in Kinesis that I am trying to transform before it's sent to our log monitoring service. The data being sent is a JSON array; I want to pull the single object out of the array and send that object to the log monitoring service.

The documentation says that the log data is base64 encoded, but my efforts to decode it have failed.

The code below is what I'm using to transform the data.

const parseData = (data) => {
 
  // Attempt to base64 decode the data

  // This doesn't decode the string into recognisable characters.
  const dataString = Buffer.from(data, 'base64').toString('ascii')
  
  // Attempt to parse the base64 decoded data as JSON
  const parsedData = JSON.parse(dataString)[0]
  
  return Buffer.from(JSON.stringify(parsedData)).toString('base64')
}

exports.handler = async (event, context) => {
    const output = event.records.map((record) => ({
        recordId: record.recordId,
        result: 'Ok',
        data: parseData(record.data),
    }));
    console.log(`Processing completed.  Successful records ${output.length}.`);
    return { records: output };
};

An example record from event.records:

{ 
  recordId: '...', 
  approximateArrivalTimestamp: 123, 
  data: 'H4sIAAAAAAAAAG1STY/TMBD9K5GvxKmdtkuaVSW+ynJgOZUDQqhy7WljNrGN7UQpK/4746Yte0DKYfLm4715nu/PZKeNgpHUREgJIdCSlYyyis45ycmbqDsIUXSO1Hy5qqpFtcL8apGTxoaIXdpRzvDjlN8xWs4ZdgnnWi1F1NZghWvFiQqnMQFm0N6aDkxqVTAgNoAPUyEvWIGj+GtOoWJCLdVBCkY7ESJ4rOwgNlZh4cNmi7+9bzGeCRl7Ea2fNSDa2GDiVw/+REP02hyxAhHcIPaB1CVjKKIVLoCiaTXaIcpxVkCG2vRtm5P9KUKg4ayRL3PiobMRqFDKoz9JJytw34Ivq4LfJZf8uEvmuUjBSKsm2uNv7fJM2s6lNlBTnbTGgLw4I1sbYMIvbl5H37ECraxX+BhTPgmk4jgZt/n8jn46b/u+AfkEflYWyfc47kZ68AL3si5xJLUfNl++TTkpZAMUFURvk3XG0jOUZxgFtBCjToyJZs0w7EOkHgbRaiUiXOenfpRB48m9pDE2GH04XJj+tyXizotjJ15QTzCMTqNJiN+2GPESnUffr1P4fdZZBet9a+XTVHVWVZNHa/JszrO3/TFLx5uxRb3g9aLMHh63Nz03zenS/93nbDCqCC6dCt1bG4vrORVD+epnsOZeNsIHiOuv24+0In9+/AUluUgIMwMAAA==', 
  kinesisRecordMetadata: { 
    sequenceNumber: '...', 
    subsequenceNumber: 0, 
    partitionKey: '...', 
    shardId: '...', 
    approximateArrivalTimestamp: 123 
  } 
}

When I attempt to base64 decode the above data value with the code below, I get garbled output:

const data = 'H4sIAAAAAAAAAG1STY/TMBD9K5GvxKmdtkuaVSW+ynJgOZUDQqhy7WljNrGN7UQpK/4746Yte0DKYfLm4715nu/PZKeNgpHUREgJIdCSlYyyis45ycmbqDsIUXSO1Hy5qqpFtcL8apGTxoaIXdpRzvDjlN8xWs4ZdgnnWi1F1NZghWvFiQqnMQFm0N6aDkxqVTAgNoAPUyEvWIGj+GtOoWJCLdVBCkY7ESJ4rOwgNlZh4cNmi7+9bzGeCRl7Ea2fNSDa2GDiVw/+REP02hyxAhHcIPaB1CVjKKIVLoCiaTXaIcpxVkCG2vRtm5P9KUKg4ayRL3PiobMRqFDKoz9JJytw34Ivq4LfJZf8uEvmuUjBSKsm2uNv7fJM2s6lNlBTnbTGgLw4I1sbYMIvbl5H37ECraxX+BhTPgmk4jgZt/n8jn46b/u+AfkEflYWyfc47kZ68AL3si5xJLUfNl++TTkpZAMUFURvk3XG0jOUZxgFtBCjToyJZs0w7EOkHgbRaiUiXOenfpRB48m9pDE2GH04XJj+tyXizotjJ15QTzCMTqNJiN+2GPESnUffr1P4fdZZBet9a+XTVHVWVZNHa/JszrO3/TFLx5uxRb3g9aLMHh63Nz03zenS/93nbDCqCC6dCt1bG4vrORVD+epnsOZeNsIHiOuv24+0In9+/AUluUgIMwMAAA=='

console.log(Buffer.from(data, 'base64').toString('ascii'))

Output:

TDH+~;c&!P@Jarfc=yoOd'mic61
          2
N9IQtT|9**E5B|j]ZQNpc_1ZNv      gZ-ETV`kE
'1fP^LjU0 6S!/X#xkN!bB-UA
F;"x,l 6VaaCf
             ?=o1       {-5 ZX`bW~DCtZ1\ vT%c("."i5Z!JqV@Ztm)B a,/sb!3(PJ#?I'+p_/+_%|8Kf9HAH+&ZcomrLZN%6PS4F<8#[B/n^G_1-,WxS>   $b87y|~:o{>y~VIw8nFzpw2.q$56_>M9)dDouFR3g4#N
                                        fM0lC$Qj%"\g'~AcI=$16}8\~7%bN
                                                                     c'^PO0
                                                                           N#_6qG_/Sx}VYk}keSTuVUGkrlN37}1KGE=`u"L77=7MiR]gl0.
][

Question:

What am I doing wrong here? Is there another way I'm supposed to be decoding these values?


Solution

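  • The data is gzip-compressed as well as base64 encoded: the leading 'H4sI' of the value is the base64 form of the gzip magic bytes 1f 8b 08. Base64 decode it first, then decompress the result. A working example: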

    const zlib = require('zlib');
    const data = 'H4s... etc';
    const compressed = Buffer.from(data, 'base64');
    const decompressed = zlib.unzipSync(compressed);
    console.log(decompressed.toString());
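
Applied to the handler from the question, the same decode-then-decompress step might look like the sketch below. It assumes, as the question describes, that each decompressed payload is a JSON array whose first element is the object to forward, and that the downstream service accepts uncompressed, base64-encoded JSON. The names isGzip and transformData are hypothetical helpers. Returning 'ProcessingFailed' for records that cannot be decoded is one possible choice, not something the original answer specifies.

```javascript
const zlib = require('zlib');

// Hypothetical helper: true when the buffer starts with the gzip magic bytes 1f 8b.
const isGzip = (buf) => buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b;

// Hypothetical helper: base64 decode -> gunzip (if compressed) -> parse JSON
// -> take the first array element -> re-encode as base64 JSON.
// Assumption: the payload is a JSON array, as stated in the question.
const transformData = (data) => {
  const raw = Buffer.from(data, 'base64');
  const bytes = isGzip(raw) ? zlib.gunzipSync(raw) : raw;
  const first = JSON.parse(bytes.toString('utf8'))[0];
  return Buffer.from(JSON.stringify(first), 'utf8').toString('base64');
};

const handler = async (event) => {
  const records = event.records.map((record) => {
    try {
      return {
        recordId: record.recordId,
        result: 'Ok',
        data: transformData(record.data),
      };
    } catch (err) {
      // Assumption: hand undecodable records back unchanged and flag them
      // instead of failing the whole batch.
      console.error(`Record ${record.recordId} could not be transformed: ${err.message}`);
      return {
        recordId: record.recordId,
        result: 'ProcessingFailed',
        data: record.data,
      };
    }
  });
  console.log(`Processing completed. Records handled: ${records.length}.`);
  return { records };
};

exports.handler = handler;
```

Checking the magic bytes before decompressing lets the same function also handle records that arrive uncompressed. If every record is known to be gzip-compressed, the check can be dropped and gunzipSync called directly.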