Tags: python, api, google-cloud-platform, google-data-catalog

Error Python API GCP Data Catalog - Google Cloud Platform


I'm getting an error when trying to use the code from this link: Data Catalog Example. I copy-pasted all the code provided, authenticated into my GCP project, and tested it. The failure happens at step 4.

Everything is OK until it starts creating the tag template fields:

tag_template = datacatalog_v1.types.TagTemplate()
tag_template.display_name = 'On-premises Tag Template'

tag_template.fields['source'].display_name = 'Source of the data asset'
tag_template.fields['source'].type.primitive_type = \
    datacatalog_v1.FieldType.PrimitiveType.STRING.value

It always crashes with the same error:


tag_template <proto.marshal.collections.maps.MapComposite object at 0x10fe23310>
Traceback (most recent call last):
  File "/Users/ac/Documents/DataCatalog/python_datacatalog/application/sample.py", line 149, in <module>
    tag_template.fields['source'].display_name = 'Source of the data asset'
  File "/Users/ac/Documents/DataCatalog/python_datacatalog/venv/lib/python3.8/site-packages/proto/marshal/collections/maps.py", line 56, in __getitem__
    raise KeyError(key)
KeyError: 'source'

Can someone help by sharing an alternative way to do this?


Solution

  • The sample code in the Data Catalog Example is outdated. I made a few changes to the code, starting at step 4 (where you are currently stuck). Besides the KeyError, I also hit a second error on the next line, where the primitive type is set.

    # -------------------------------
    # 4. Create a Tag Template.
    # -------------------------------
    tag_template = datacatalog_v1.types.TagTemplate()
    tag_template.display_name = 'On-premises Tag Template'
    
    # Create the 'source' key first; reading a missing key raises KeyError.
    tag_template.fields['source'] = datacatalog_v1.types.TagTemplateField()
    tag_template.fields['source'].display_name = 'Source of the data asset'
    # The attribute is now 'type_' (was 'type'); the enum member is assigned directly.
    tag_template.fields['source'].type_.primitive_type = \
        datacatalog_v1.types.FieldType.PrimitiveType.STRING
    
    • The fix for the KeyError is to create the key 'source' before using it, by adding tag_template.fields['source'] = datacatalog_v1.types.TagTemplateField()
    • The field's "type" attribute is now accessed as "type_", and the primitive type is assigned as the enum member PrimitiveType.STRING rather than its ".value"
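To illustrate why creating the key first matters, here is a minimal sketch that uses only the standard library. `FakeField` is a hypothetical stand-in, not the real `TagTemplateField`, and the two maps only imitate the behaviours involved: a map that creates missing entries on access (which the outdated sample presumably relied on) versus one that raises `KeyError` on a missing key (as in the traceback).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FakeField:
    """Hypothetical stand-in for a tag template field; not the real class."""
    display_name: str = ""


# Map that creates a missing entry on first access (assumed old behaviour).
auto_fields = defaultdict(FakeField)
auto_fields["source"].display_name = "Source of the data asset"

# Map that raises KeyError when a missing key is read (as in the traceback).
strict_fields = {}
try:
    strict_fields["source"].display_name = "Source of the data asset"
    raised = False
except KeyError:
    raised = True

# The fix: create the key explicitly, then set attributes on the entry.
strict_fields["source"] = FakeField()
strict_fields["source"].display_name = "Source of the data asset"
```

The same pattern applies to any map-valued field on these messages: assign an empty message to the key once, then fill in its attributes.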

    When you proceed to step 5, you may hit the same KeyError: 'source' again. If you do not, nothing more is needed; if you do, here is the code that fixes it.

    # -------------------------------
    # 5. Attach a Tag to the custom Entry.
    # -------------------------------
    tag = datacatalog_v1.types.Tag()
    tag.template = tag_template.name
    tag.fields['source'] = datacatalog_v1.types.TagField() #creates key 'source'
    tag.fields['source'].string_value = 'On-premises system name'
    
    tag = datacatalog.create_tag(parent=entry.name, tag=tag)
    print('Created tag: {}'.format(tag.name))
    
    • The fix is the same as in step 4: create the key 'source' first, this time for a TagField, by adding tag.fields['source'] = datacatalog_v1.types.TagField()

    I ran the whole script from step 1 to 5. The original post included screenshots of the script output, the created tag template, and the created tag.
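Since the sample is outdated relative to the installed client library, it can help to check which version you actually have before comparing your code to a tutorial. A minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, and `google-cloud-datacatalog` is assumed to be the distribution name you installed with pip.

```python
from importlib import metadata  # part of the standard library since Python 3.8


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when the package is not installed.
print(installed_version("google-cloud-datacatalog"))
```

If the version differs from the one the tutorial was written for, the library's upgrade notes are the place to look for renamed attributes such as type_.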