I want to write three parts of a Pub/Sub message (data, attributes and publish time) to BigQuery, flattened so that all elements are written to a single row, for example:
data[0] | data[1] | attr[0] | attr[1] | key | publishTime |
---|---|---|---|---|---|
data | data | attr | attr | key | publishTime |
I'm currently using the following code to decode and parse the message, but it only handles the data part of the Pub/Sub message:
import json

class decodeMessage:
    def decode_base64(self, element):
        """Decode the data bytes as UTF-8 and serialize the text as a JSON string."""
        return json.dumps(element.data.decode("utf-8"))

class parseMessage:
    def parseJsonMessage(self, element):
        """Parse a JSON string back into a Python object."""
        return json.loads(element)
I've also tried merging the two parts after dumping them from JSON objects to JSON strings, but it didn't go as planned. My ultimate goal is to bring all columns into a single JSON object with the schema retained.
I hope my question is clear. Thanks!
The solution to this problem is to build a new Python dictionary and add each part of the message to it.
Example:
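import json

# Wrapper name is illustrative; element is the received Pub/Sub message.
def build_payload(element):
    payload = dict()
    # json.dumps stores the decoded text as a JSON string literal.
    data = json.dumps(element.data.decode('utf-8'))
    # Attributes are a dict of strings; serialize them to one JSON string.
    attributes = json.dumps(element.attributes)
    messageKey = element.message_id
    # Publish time as epoch milliseconds.
    publish_time = element.publish_time.timestamp() * 1000
    payload['et'] = publish_time
    payload['data'] = data
    payload['attributes'] = attributes
    payload['key'] = messageKey
    return payload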
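If you want one column per field, as in the table in the question, rather than `data` and `attributes` stored as JSON strings, the same dictionary approach can be extended. The following is only a minimal sketch under some assumptions: the payload is a UTF-8 encoded JSON object, and the message exposes `data`, `attributes`, `message_id` and `publish_time` as in the snippet above. The function name `flatten_message`, the `attr_` prefix and the `SimpleNamespace` stand-in message are hypothetical and only there for illustration.

```python
import json
from datetime import datetime, timezone
from types import SimpleNamespace


def flatten_message(element):
    """Merge payload fields, attributes and metadata into one flat dict."""
    # Assumption: the payload is a UTF-8 encoded JSON object.
    row = dict(json.loads(element.data.decode("utf-8")))
    # Prefix attribute names so they cannot collide with payload fields.
    for name, value in (element.attributes or {}).items():
        row["attr_" + name] = value
    row["key"] = element.message_id
    # Epoch milliseconds, matching the 'et' field in the answer.
    row["publishTime"] = int(element.publish_time.timestamp() * 1000)
    return row


# Hypothetical stand-in for a received message, for illustration only.
message = SimpleNamespace(
    data=b'{"id": 7, "status": "ok"}',
    attributes={"source": "sensor-a"},
    message_id="1234",
    publish_time=datetime(2021, 1, 1, tzinfo=timezone.utc),
)
row = flatten_message(message)
print(row)
```

Each key of the returned dictionary would then need a matching column in the BigQuery table schema, so this only works as-is when the set of payload fields and attribute names is known in advance.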