Search code examples
pythonpyyaml

Python YAML parsing raising KeyError


I am trying to build an audit report of which users will have access to which topics in our kafka environment. I wrote a piece of python code that does this, but for some reason it stopped working, and I have no idea why.

I have the following code:

import yaml
import glob

yaml_file_names = glob.glob('/tmp/kafka-topics/topics/*.yaml')
file1 = open("Access_Audit_Report_testenv.txt", "w")
file1.close()
for each_yaml_file in yaml_file_names:
    with open(each_yaml_file) as f:
        document = yaml.safe_load(f)
        current_context = document["context"]
        for each_project in document['projects']:
            file1 = open("Access_Audit_Report_testenv.txt", "a")
            for each_topic in each_project['topics']:
                topic_name = '.'.join([current_context, each_project['name'], each_topic['name']])
                file1.write(topic_name+"\n")
                file1.write('Consumers:\n')


                for each_entry in each_topic['consumers']:
                    file1.write(str(each_entry)+"\n")

                file1.write(''+"\n")
                file1.write('Producers:\n')

                for each_entry in each_topic['producers']:
                    file1.write(str(each_entry)+"\n")

                file1.write(''+"\n")
file1.close()

A typical yaml file in the above-mentioned location looks like this:

---
context: "building_administration"
projects:
  - name: "school"
    topics:
      - name: "publish"
        plan: "default"
        consumers:
          - principal: "User:consumer_planning_test"
        producers:
          - principal: "User:producer_maintenance_test"

  - name: "iot"
    topics:
      - name: "metrics.publish"
        plan: "default"
        consumers:
          - principal: "User:consumer_maintenance_test"
          - principal: "User:consumer_planning_test"
          
        producers:
          - principal: "User:producer_planner_test"
...
      

The desired output should look something like this:

building_administration.school.publish
Consumers:
{'principal': 'User:consumer_planning_test'}

Producers:
{'principal': 'User:producer_maintenance_test'}

iot.metrics.publish
Consumers:
{'principal': 'User:consumer_maintenance_test'}
{'principal': 'User:consumer_planning_test'}

Producers:
{'principal': 'User:producer_planner_test'}

The error I am getting is:

Traceback (most recent call last):
  File "/tmp/kafka-audit-builder/build-audit-testenv.py", line 13, in <module>
    for each_topic in each_project['topics']:
KeyError: 'topics'

So I understand what a KeyError is, I just don't understand why I am seeing it? Especially since this code had been working before. Any ideas or help will really be greatly appreciated.

Thank you!


Solution

  • Thanks to @franklinsijo in the comments, I was able to solve my problem by adapting my code to the following:

    import yaml
    import glob
    
    # yaml_file_names = glob.glob('/builds/devops1/kafka-topics/topics/*.yaml')
    yaml_file_names = glob.glob('/tmp/kafka-topics/topics/*.yaml')
    file1 = open("Access_Audit_Report_testenv.txt", "w")
    file1.close()
    for each_yaml_file in yaml_file_names:
        with open(each_yaml_file) as f:
            document = yaml.safe_load(f)
            current_context = document["context"]
            print('###################')
            print(current_context)
            for each_project in document['projects']:
                file1 = open("Access_Audit_Report_testenv.txt", "a")
                if 'topics' in each_project:
                    for each_topic in each_project['topics']:
                        topic_name = '.'.join([current_context, each_project['name'], each_topic['name']])
                        print(topic_name)
                        file1.write(topic_name+"\n")
                        file1.write('Consumers:\n')
    
                        if 'consumers' in each_topic:
                            for each_entry in each_topic['consumers']:
                                file1.write(str(each_entry)+"\n")
                                print('Consumer:', each_entry)
    
                        file1.write('Producers:\n')
    
                        if 'producers' in each_topic:
                            for each_entry in each_topic['producers']:
                                file1.write(str(each_entry)+"\n")
                                print('Producer:', each_entry)
                        file1.write(''+"\n")
                else:
                    continue
    file1.close()