Tags: python, python-2.7, dictionary, iterator, pyyaml

PyYAML 3.11: pass dictionary to iterator?


I use the following YAML data:

Document:
 InPath: /home/me
 OutPath: /home/me
 XLOutFile: TestFile1.xlsx

Sheets: 
  - Sheet: Test123
    InFile: Test123.MQSC
    Server: Testsystem1
  - Sheet: Test345
    InFile: Test345.MQSC
    Server: Testsystem2

Title:
    A: "Server Name"
    B: "MQ Version"
    C: "Broker Version"

Fields:
    A: ServerName
    B: MQVersion
    C: BrokerVersion

and the following code:

import yaml

class cfgReader():
    def __init__(self):
        self.stream = ""
        self.ymldata = ""
        self.ymlkey = ""
        self.ymld = ""

    def read(self,infilename):
        self.stream = self.stream = file(infilename, 'r') #Read the yamlfile
        self.ymldata = yaml.load(self.stream)    #Instanciate yaml object and parse the input "stream".

    def docu(self):
        print self.ymldata
        print self.ymldata['Sheets']
        for self.ymlkey in self.ymldata['Document']: #passes String to iterator
            print self.ymlkey
        for sheets in self.ymldata['Sheets']:  #passes Dictionary to iterator
            print sheets['Sheet']
        for title in self.ymldata['Title']:
            print title
        for fields in self.ymldata['Fields']:
            print fields

The print output is:

{'Fields': {'A': 'ServerName', 'C': 'BrokerVersion', 'B': 'MQVersion'}, 'Document': {'XLOutFile': 'TestFile1.xlsx', 'InPath': '/home/me', 'OutPath': '/home/me'}, 'Sheets': [{'Sheet': 'Test123', 'InFile': 'Test123.MQSC', 'Server': 'Testsystem1'}, {'Sheet': 'Test345', 'InFile': 'Test345.MQSC', 'Server': 'Testsystem2'}], 'Title': {'A': 'Server Name', 'C': 'Broker Version', 'B': 'MQ Version'}}
[{'Sheet': 'Test123', 'InFile': 'Test123.MQSC', 'Server': 'Testsystem1'}, {'Sheet': 'Test345', 'InFile': 'Test345.MQSC', 'Server': 'Testsystem2'}]
X
I
O
Test123
Test345
A
C
B
A
C
B

I could not find out how to control the way data is passed to the iterator. What I want is to get dictionaries so that I can access each value through its key. This works for "Sheets", but I don't understand why. The documentation does not describe this clearly: http://pyyaml.org/wiki/PyYAMLDocumentation


Solution

  • In your code, self.ymldata['Sheets'] is a list of dictionaries because the YAML source for it:

      - Sheet: Test123
        InFile: Test123.MQSC
        Server: Testsystem1
      - Sheet: Test345
        InFile: Test345.MQSC
        Server: Testsystem2
    

    is a sequence of mappings (and this is the value for the key Sheets of the top-level mapping in your YAML file).

    The values for the other top-level keys are all mappings (not sequences of mappings), which get loaded as Python dicts. If you iterate over a dict as you do, you get its keys, not its values.
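The difference between the two shapes can be sketched with plain Python objects. The literals below are stand-ins assumed to match what the YAML loader returns for the Title and Sheets keys; no YAML library is needed to see the behaviour.

```python
# Stand-ins for the loaded values (assumed shape, copied from the question's data).
title = {"A": "Server Name", "B": "MQ Version", "C": "Broker Version"}
sheets = [
    {"Sheet": "Test123", "InFile": "Test123.MQSC", "Server": "Testsystem1"},
    {"Sheet": "Test345", "InFile": "Test345.MQSC", "Server": "Testsystem2"},
]

# Iterating over a dict yields only its keys.
title_keys = sorted(title)

# .items() yields (key, value) pairs, so no second lookup is needed.
title_pairs = sorted(title.items())

# Iterating over a list yields its elements; here each element is a dict,
# which is why sheet['Sheet'] works inside the loop.
sheet_names = [sheet["Sheet"] for sheet in sheets]

print(title_keys)
print(title_pairs)
print(sheet_names)
```

So for a mapping such as Title, looping over title.items() (or looking up title[key] inside the loop) gives access to the values, while Sheets already hands out one dict per iteration.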

    If you don't want to iterate over these dictionaries, then you should not start a for loop over them. You might want to test what the value for each top-level key is and then act accordingly, e.g. to print out all dictionaries loaded from the YAML file except for the top-level mapping, do:

    import ruamel.yaml as yaml
    
    class CfgReader():
        def __init__(self):
            self.stream = ""
            self.ymldata = ""
            self.ymlkey = ""
            self.ymld = ""
    
        def read(self, infilename):
            self.stream = open(infilename, 'r')  # Open the YAML file for reading
            self.ymldata = yaml.load(self.stream)  # Parse the input stream into Python objects
    
        def docu(self):
            for k in self.ymldata:
                v = self.ymldata[k]
                if isinstance(v, list):
                    for elem in v:
                        print(elem)
                else:
                    print(v)
    
    cfg_reader = CfgReader()
    cfg_reader.read('in.yaml')
    cfg_reader.docu()
    

    which prints:

    {'InFile': 'Test123.MQSC', 'Sheet': 'Test123', 'Server': 'Testsystem1'}
    {'InFile': 'Test345.MQSC', 'Sheet': 'Test345', 'Server': 'Testsystem2'}
    {'B': 'MQVersion', 'A': 'ServerName', 'C': 'BrokerVersion'}
    {'B': 'MQ Version', 'A': 'Server Name', 'C': 'Broker Version'}
    {'XLOutFile': 'TestFile1.xlsx', 'InPath': '/home/me', 'OutPath': '/home/me'}
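If the goal is to look up values by key regardless of whether a top-level value is one mapping or a sequence of mappings, the same isinstance test can be wrapped in a small helper. This is only a sketch: as_mappings is a hypothetical name, and the ymldata literal is a stand-in for the loaded file.

```python
def as_mappings(value):
    """Return the dicts held by value: the dict itself, or the dicts in a list."""
    if isinstance(value, dict):
        return [value]
    if isinstance(value, list):
        return [item for item in value if isinstance(item, dict)]
    return []  # scalars carry no key/value pairs


# Stand-in for the loaded YAML data (two of the question's top-level keys).
ymldata = {
    "Sheets": [
        {"Sheet": "Test123", "InFile": "Test123.MQSC", "Server": "Testsystem1"},
        {"Sheet": "Test345", "InFile": "Test345.MQSC", "Server": "Testsystem2"},
    ],
    "Title": {"A": "Server Name", "B": "MQ Version", "C": "Broker Version"},
}

# Every element handed back is a dict, so access by key works in both cases.
sheet_names = [m["Sheet"] for m in as_mappings(ymldata["Sheets"])]
title_a = [m["A"] for m in as_mappings(ymldata["Title"])]

print(sheet_names)
print(title_a)
```

The loop body then never needs to know which of the two shapes it was given.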
    

    Please also note some general things you should be aware of:

    • I use ruamel.yaml (disclaimer: I am the author of that package), which supports YAML 1.2 (PyYAML supports the 1.1 standard from 2005). For your purposes they act the same.
    • don't use file(); it is not available in Python 3. Use open() instead
    • assigning the same value twice to the same attribute makes no sense (self.stream = self.stream = ...)
    • your opened file/stream never gets closed; you might want to consider using

      with open(infilename) as self.stream:
          self.ymldata = yaml.load(self.stream)
      
    • class names, by convention, should start with an upper case character.