I have a setup where I am publishing messages to the Google Cloud PubSub service. I wish to get the size of each individual message that I am publishing to PubSub. For this, I identified the following approaches (note: I am using the Python clients for publishing and subscribing, following the line-by-line implementation presented in their documentation):
- message.size in the callback function for the messages that are being pulled from the requested topic.
- sys.getsizeof()
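For illustration, this is a minimal sketch of the sys.getsizeof() idea applied to an example payload before publishing (note that sys.getsizeof() reports the in-memory size of the Python object, including object overhead, not the size of the serialized message):

import sys

data = b'Test_message'

# In-memory size of the Python bytes object, including object overhead.
print(sys.getsizeof(data))

# Number of payload bytes that would actually be sent as the message data.
print(len(data))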
For a sample message like the following, which I published using a Python publisher client (a rough sketch of the publish call itself is shown further below):
{
    "data": 'Test_message',
    "attributes": {
        'dummyField1': 'dummyFieldValue1',
        'dummyField2': 'dummyFieldValue2'
    }
}
I get a size of 101 as the message.size output from the following callback function in the subscription client:
def callback(message):
    print(f"Received {message.data}.")
    if message.attributes:
        print("Attributes:")
        for key in message.attributes:
            value = message.attributes.get(key)
            print(f"{key}: {value}")
    print(message.size)
    message.ack()
However, the size displayed in Cloud Console Monitoring is around 79 B.
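For reference, this is roughly how such a message might be published with the Python publisher client (the project and topic names below are placeholders):

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Placeholder project and topic names.
topic_path = publisher.topic_path('my-project', 'my-topic')

# The payload must be a bytestring; attributes are passed as keyword arguments.
future = publisher.publish(
    topic_path,
    data=b'Test_message',
    dummyField1='dummyFieldValue1',
    dummyField2='dummyFieldValue2',
)
print(future.result())  # message ID assigned by PubSub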
So my question is: is the output of message.size in bytes?

In order to further contribute to the community, I am summarising our discussion as an answer.
Regarding message.size, it is an attribute of a message in the subscriber client. In addition, according to the documentation, its definition is: "Returns the size of the underlying message, in bytes". Thus, you would not be able to use it before publishing.
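If a rough client-side estimate before publishing is enough, one option is to sum the payload and attribute bytes yourself. This is only a sketch (the helper name is illustrative), and it ignores any protocol-level overhead, so it will not match message.size or the monitoring metrics exactly:

def estimate_message_size(data: bytes, attributes: dict) -> int:
    # Rough pre-publish estimate: payload bytes plus attribute key/value bytes.
    # It does not account for protocol-level overhead.
    size = len(data)
    for key, value in attributes.items():
        size += len(key.encode('utf-8')) + len(value.encode('utf-8'))
    return size

print(estimate_message_size(
    b'Test_message',
    {'dummyField1': 'dummyFieldValue1', 'dummyField2': 'dummyFieldValue2'},
))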
On the other hand, message_size is a metric in Google Cloud Metrics and it is used by Cloud Monitoring, here.

Finally, the last topic discussed was that your aim is to monitor your quota expenditure so you can stay within the free tier. For this reason, the best option would be to use Cloud Monitoring and set up alerts based on metrics such as pubsub.googleapis.com/topic/byte_cost. Here are some links where you can find out more: Quota utilisation, Alert event based, Alert Policies.
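As an illustration, reading that metric programmatically could look like the following sketch, assuming the google-cloud-monitoring client library is installed and using a placeholder project ID:

import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = 'projects/my-project'  # placeholder project ID

# Query the last hour of the topic/byte_cost metric.
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        'start_time': {'seconds': now - 3600},
        'end_time': {'seconds': now},
    }
)

results = client.list_time_series(
    request={
        'name': project_name,
        'filter': 'metric.type = "pubsub.googleapis.com/topic/byte_cost"',
        'interval': interval,
        'view': monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    topic = series.resource.labels.get('topic_id', '')
    for point in series.points:
        print(topic, point.value.int64_value)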