How to save the output of a recursive function to a list using yield and generator functions


I have the following XML file from this link as a sample:

I have the following recursive function which prints output:

import xml.etree.ElementTree as ET

def perf_func(elem, func, level=0):
    func(elem,level)
    for child in elem.getchildren():
        perf_func(child, func, level+1)

def print_level(elem,level):
    print('-'*level+elem.tag)

elemList = ['description', 'episodes', 'movie', 'collection', 'stars', 'rating', 'year', 'type', 'format']

xmlTree = ET.parse('XML_file.xml')

The line below prints the result:

perf_func(xmlTree.getroot(), print_level)

Output:

collection
-movie
--type
--format
--year
--rating
--stars
--description
-movie
--type
--format
--year
--rating
--stars
--description
-movie
--type

I need to save the output to a list of items like below.

hierarchy = [collection, -movie, --format, --year, --rating, ... ]

So I tried the modification below, but I am unable to get the result as a list.

import xml.etree.ElementTree as ET

def perf_func(elem, func, level=0):
    func(elem,level)
    for child in elem.getchildren():
        yield from perf_func(child, func, level+1)

def print_level(elem,level):
    print ('-'*level+elem.tag)

I am trying to modify the print_level() function to return its output instead of printing it, but I don't know how to do it.

perf_func(xmlTree.getroot(), print_level)

<generator object perf_func at 0x000001F6432BD2C8>

Converting the generator to a list gives me the same output:

list(perf_func(xmlTree.getroot(), print_level))

I checked similar questions in other links, but couldn't understand them well.


Solution

  • There's no point in a function that uses yield from but never yields any value. The generator needs to be populated with data at some point for it to do anything.

    def perf_func(elem):
        yield elem
    
        for child in elem.getchildren():
            yield from perf_func(child)
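
To make the point about never yielding concrete, here is a minimal sketch of why calling list() on the question's generator comes back empty. The inline XML string and the names SAMPLE_XML, no_yield, and seen are assumptions made for illustration, not part of the original post. The loop iterates over the element directly so the sketch also runs on Python 3.9+.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the asker's XML file.
SAMPLE_XML = (
    "<collection>"
    "<movie><type/><format/></movie>"
    "<movie><type/></movie>"
    "</collection>"
)

seen = []  # records the side effects of the callback


def no_yield(elem, func, level=0):
    # Same shape as the question's second attempt: func runs for its
    # side effect, but no value is ever yielded.
    func(elem, level)
    for child in elem:
        yield from no_yield(child, func, level + 1)


root = ET.fromstring(SAMPLE_XML)
gen = no_yield(root, lambda e, lvl: seen.append("-" * lvl + e.tag))

calls_before = list(seen)  # generator bodies run lazily, so nothing has run yet
result = list(gen)         # draining the generator finally runs the callback

print(calls_before)  # []
print(result)        # [] because nothing was yielded
print(seen)          # the tags, produced only as a side effect
```

This is consistent with what the question reports: the bare call only creates a generator object, and list() triggers the prints but has nothing to collect.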
    

    You could use yield func(elem, level), but passing a function into a generator is a somewhat odd pattern that inverts responsibility. The typical pattern for generators is to emit data lazily and let the caller apply arbitrary processing on each item inline, for example:

    def traverse(elem, level=0):
        yield elem, level
    
        for child in elem.getchildren():
            yield from traverse(child, level + 1)
    
    for elem, level in traverse(xmlTree.getroot()):
        print("-" * level + elem.tag) # or whatever else you want to do
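
For comparison, the yield func(elem, level) variant mentioned above might look like the sketch below. It assumes the callback returns a string instead of printing it. The names format_level and SAMPLE_XML are hypothetical, and the loop iterates over the element directly rather than calling getchildren().

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the asker's XML file.
SAMPLE_XML = "<collection><movie><type/><format/></movie></collection>"


def format_level(elem, level):
    # Return the formatted line instead of printing it.
    return "-" * level + elem.tag


def perf_func(elem, func, level=0):
    # Yield whatever the callback returns for this node, then recurse.
    yield func(elem, level)
    for child in elem:
        yield from perf_func(child, func, level + 1)


hierarchy = list(perf_func(ET.fromstring(SAMPLE_XML), format_level))
print(hierarchy)  # ['collection', '-movie', '--type', '--format']
```

This works, but the generator is now tied to a callback signature, which is the inversion of responsibility described above.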
    

    Element.getchildren() was removed in Python 3.9. Iterating over the element directly yields its children, so here's the code that worked for me:

    import xml.etree.ElementTree as ET
    
    def traverse(elem, level=0):
        yield elem, level
    
        for child in elem:
            yield from traverse(child, level + 1)
    
    for elem, level in traverse(ET.parse("country_data.xml").getroot()):
        print("  " * level + elem.tag) # or whatever else you want to do
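
To get the list the question asks for, the (elem, level) pairs can be collected with a list comprehension instead of printed. This is a minimal sketch: the inline XML in SAMPLE_XML is an assumed stand-in for the asker's file, and the dash prefix follows the format shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the asker's XML file.
SAMPLE_XML = (
    "<collection>"
    "<movie><type/><format/><year/></movie>"
    "<movie><type/></movie>"
    "</collection>"
)


def traverse(elem, level=0):
    # Emit each element with its depth, parents before children.
    yield elem, level
    for child in elem:
        yield from traverse(child, level + 1)


root = ET.fromstring(SAMPLE_XML)

# The caller decides how to format each item; the generator only emits data.
hierarchy = ["-" * level + elem.tag for elem, level in traverse(root)]
print(hierarchy)
# ['collection', '-movie', '--type', '--format', '--year', '-movie', '--type']
```

Because traverse only emits data, the same generator can feed a list, a filter, or a print loop without being modified.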