Tags: python-3.x, pyyaml

pyyaml yaml.load returns str instead of dict


Using PyYAML with Python 3, the following in-memory YAML string loads without error, but yaml.full_load() returns a str rather than a dict. Is this expected behavior? Thanks.

main:
  orgs:
    &org1 org1:
      categories:
        - one
        - two
        - three
    &org2 org2:
      categories:
        - one
        - two
  people:
    - userid: user1
      org: *org1
      roles:
        - roleA
        - roleB
    - userid: user2
      org: *org2
      roles:
        - roleA
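
Here is the code that reads it from the MAIN_YAML environment variable:

import os

import yaml

print("MAIN_YAML = " + os.getenv("MAIN_YAML"))

try:
    MAIN_YAML = yaml.full_load(os.getenv("MAIN_YAML"))
    print("PARSED SUCCESSFULLY")
    print(isinstance(MAIN_YAML, dict))
    print(type(MAIN_YAML))
except yaml.YAMLError as e:
    # yaml.YAMLError is the base class for PyYAML's load errors
    print(e)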

Output:

MAIN_YAML = main:orgs:&org1org1:categories:-one-two-three&org2org2:categories:-one-twopeople:-userid:user1org:*org1roles:-roleA-roleB-userid:user2org:*org2roles:-roleA
PARSED SUCCESSFULLY
False
<class 'str'>

Here's the shell script that creates the one-liner:

tr -d '\n\t' < main.yaml > temp.yaml
tr -d ' ' < temp.yaml > main_squeezed.yaml
MAIN_YAML=$(cat main_squeezed.yaml)

Solution

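  • You are loading the YAML from an environment variable that no longer holds the file as written: the shell script strips every newline, tab, and space. YAML block syntax depends on that whitespace. Indentation defines nesting, and a ":" or "-" only acts as a mapping or sequence indicator when it is followed by whitespace. With all of it removed, the entire text is one plain scalar, so yaml.full_load() correctly returns a str.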

    It works when the string has the newline characters included:

    >>> s = """main:
    ...   orgs:
    ...     &org1 org1:
    ...       categories:
    ...         - one
    ...         - two
    ...         - three
    ...     &org2 org2:
    ...       categories:
    ...         - one
    ...         - two
    ...   people:
    ...     - userid: user1
    ...       org: *org1
    ...       roles:
    ...         - roleA
    ...         - roleB
    ...     - userid: user2
    ...       org: *org2
    ...       roles:
    ...         - roleA"""
    >>>
    >>> import yaml
    >>> yaml.full_load(s)
    {'main': {'orgs': {'org1': {'categories': ['one', 'two', 'three']}, 'org2': {'categories': ['one', 'two']}}, 'people': [{'userid': 'user1', 'org': 'org1', 'roles': ['roleA', 'roleB']}, {'userid': 'user2', 'org': 'org2', 'roles': ['roleA']}]}}
    

    It doesn't work when the string is collapsed to one line with the spaces removed:

    >>> t = s.replace('\n', '').replace(' ', '')  # same thing, but one line
    >>> yaml.full_load(t)
    'main:orgs:&org1org1:categories:-one-two-three&org2org2:categories:-one-twopeople:-userid:user1org:*org1roles:-roleA-roleB-userid:user2org:*org2roles:-roleA'
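
Two possible ways around this are sketched below, using a shortened version of the document from the question. The first keeps the newlines in the environment variable. The second, for cases where a single line really is required, re-serializes the parsed data as JSON, which PyYAML can read back as flow-style YAML. The variable name MAIN_YAML_ONE_LINE is made up for this illustration, and the sketch assumes PyYAML 5.1 or later (where yaml.full_load exists).

```python
import json
import os

import yaml  # PyYAML (third-party)

# Shortened version of the YAML from the question, newlines intact.
DOC = """\
main:
  orgs:
    &org1 org1:
      categories:
        - one
        - two
  people:
    - userid: user1
      org: *org1
"""

# Option 1: keep the newlines. An environment variable can hold a
# multi-line value, so the YAML can be stored unchanged.
os.environ["MAIN_YAML"] = DOC
from_env = yaml.full_load(os.environ["MAIN_YAML"])
print(type(from_env))

# Option 2: if one line is required, serialize the parsed data as JSON.
# json.dumps writes a single line by default, and the result is also
# valid flow-style YAML. MAIN_YAML_ONE_LINE is a hypothetical name.
one_line = json.dumps(from_env)
os.environ["MAIN_YAML_ONE_LINE"] = one_line
from_one_line = yaml.full_load(os.environ["MAIN_YAML_ONE_LINE"])
print(from_one_line == from_env)
```

On the shell side, MAIN_YAML=$(cat main.yaml) followed by export MAIN_YAML should be enough for the first option: command substitution keeps the internal newlines and only drops trailing ones. Quote the variable ("$MAIN_YAML") wherever it is expanded so the shell does not split it on whitespace.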