protocol-buffers google-cloud-pubsub

Pub/Sub Get schema from topic


I want to use schemas to validate data before publishing to Google Cloud Pub/Sub from Python. I associated a schema with a topic (gcloud pubsub topics create with the --schema flag). In the publisher or subscriber code, is it possible to retrieve the schema class from the topic, similar to how the encoding can be retrieved via topic.schema_settings.encoding? Or do we need to copy the .proto file (or otherwise make it accessible) and generate the classes in every repo that needs to publish or parse messages?

Note: I was following this tutorial

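from google.cloud.pubsub import PublisherClient

# Module generated by protoc from `us-states.proto`; adjust the import path to your layout.
import us_states_pb2

# project_id and topic_id are assumed to be defined elsewhere.
publisher_client = PublisherClient()
topic_path = publisher_client.topic_path(project_id, topic_id)

# Get the topic encoding type.
topic = publisher_client.get_topic(request={"topic": topic_path})
encoding = topic.schema_settings.encoding

# Instantiate a protoc-generated class defined in `us-states.proto`.
state = us_states_pb2.StateProto()
state.name = "Alaska"
state.post_abbr = "AK"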

Solution

  • You can retrieve schemas with gcloud, e.g. gcloud pubsub schemas [list|describe] ...

    As with all gcloud commands, these are backed by an API. Add --log-http to see the underlying calls:

    gcloud pubsub schemas describe ... --log-http

    The underlying method, projects.schemas.get, can also be found in APIs Explorer.

    Because schemas are part of the service's API, they are accessible through the SDKs too:

    from google.cloud.pubsub import SchemaServiceClient

    schema_service_client = SchemaServiceClient()

    project = "..."
    schema = "..."

    # Fully qualified schema resource name.
    name = f"projects/{project}/schemas/{schema}"
    s = schema_service_client.get_schema(name=name)
    print(s.definition)
    

    s is of type Schema, and s.definition holds the schema's source text as a string.

    But, you probably don't want to do this!

    It would require building the message types programmatically at runtime from that definition, which is possible but ill-advised.

    Instead, consider publishing your protobufs (and perhaps the generated language stubs) centrally, so that your developers can import them wherever they need to publish or parse messages.
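To tie this back to the question: next to schema_settings.encoding, the topic also carries the resource name of its schema in schema_settings.schema, so the lookup does not need a hard-coded schema ID. The following is a minimal sketch under a few assumptions, not a definitive implementation: fetch_topic_schema and demo are hypothetical helper names, the clients are passed in as parameters, and the "_deleted-schema_" placeholder is what I understand Pub/Sub to report once a schema has been deleted. It only returns the Schema resource (with its definition text); it does not build message classes.

```python
# Placeholder that (as an assumption based on the API docs) the topic reports
# in schema_settings.schema after the attached schema has been deleted.
DELETED_SCHEMA = "_deleted-schema_"


def fetch_topic_schema(publisher_client, schema_client, topic_path):
    """Return the Schema resource attached to a topic, or None if there is none.

    publisher_client needs a get_topic(request=...) method and schema_client a
    get_schema(request=...) method, as PublisherClient and SchemaServiceClient do.
    """
    topic = publisher_client.get_topic(request={"topic": topic_path})
    # Full resource name, e.g. projects/<project>/schemas/<schema>.
    schema_name = topic.schema_settings.schema
    if not schema_name or schema_name == DELETED_SCHEMA:
        return None
    return schema_client.get_schema(request={"name": schema_name})


def demo(project_id, topic_id):
    """Hypothetical usage against the real service (needs credentials)."""
    # Imported here so the helper above can be exercised without the library.
    from google.cloud.pubsub import PublisherClient, SchemaServiceClient

    publisher_client = PublisherClient()
    schema_client = SchemaServiceClient()
    topic_path = publisher_client.topic_path(project_id, topic_id)

    schema = fetch_topic_schema(publisher_client, schema_client, topic_path)
    if schema is None:
        print("Topic has no usable schema attached.")
    else:
        print(schema.name)
        print(schema.definition)
```

Calling demo with your own project and topic IDs prints the schema's resource name and its definition text. The caveat above still applies: this gives you the schema's source, not generated classes, so sharing the .proto files or stubs centrally remains the more practical route for publishing and parsing.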