
Is there a better way to get at nested data with glom?


I have a particularly nasty stats object from a system that I need to retrieve data from (two of the many stats entries shown for brevity).

{'https://localhost/mgmt/tm/sys/performance/all-stats/TMM%20Memory%20Used': {'nestedStats': {'entries': {'Average': {'description': '5'},
                                                                                                         'Current': {'description': '5'},
                                                                                                         'Max(since 2019_11_12T02:47:10Z)': {'description': '5'},
                                                                                                         'Memory Used': {'description': 'TMM '
                                                                                                                                        'Memory '
                                                                                                                                        'Used'}}}},
 'https://localhost/mgmt/tm/sys/performance/all-stats/Utilization': {'nestedStats': {'entries': {'Average': {'description': '9'},
                                                                                                 'Current': {'description': '10'},
                                                                                                 'Max(since 2019_11_12T02:47:10Z)': {'description': '53'},
                                                                                                 'System CPU Usage': {'description': 'Utilization'}}}}}

Currently I chain the .get method several times to walk the nested levels, but I was listening to the author of the glom module on Talk Python this weekend and thought it might be a far cleaner solution for me. And it is: the code below gives me all the data in a loop without crazy layers of .get calls (I'm working on the first entry shown above tonight). The outer key is the long URL; the inner keys are the average/current/max/description entries.

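from glom import glom

# b is defined elsewhere (not shown here).
stats = b.tm.sys.performances.all_stats.load()
for url in stats.entries:
    print('\n')
    # Walk from the outer URL key down to that entry's inner stats dict.
    v_stats = glom(stats, f'entries.{url}.nestedStats.entries')
    for name in v_stats:
        # Each inner stat keeps its value under 'description'.
        stat_val = glom(v_stats, f'{name}.description')
        print(f'{name}: {stat_val}')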

Which results in the data I need:

Average: 5
Current: 5
Max(since 2019_11_12T02:47:10Z): 5
Memory Used: TMM Memory Used

That said, I don't really have control of the data at this point; I'm just printing it. I don't think I'm grokking the power of glom just yet, and I was curious whether someone could point me to an example that'll help my understanding. The end goal is to flatten all this data into a single list of four-item dictionaries.


Solution

  • First, before you try this, make sure glom is updated to version 19.11.0 or later.

    What you are asking for is called Data-Driven Assignment in glom's docs, and it is not one of glom's strengths.

    See the Data-Driven Assignment section of the glom docs.

    To get it to work, you may need lambdas and/or regular Python code.
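
    For comparison, here is a minimal sketch of the "regular Python code" route with no glom at all. The function name flatten_stats and the constants BASE and MAX_KEY are made up for illustration; the sample dict only reproduces the two entries shown in the question.

```python
# Sample data shaped like the question's stats object (two entries only).
# BASE, MAX_KEY and flatten_stats are hypothetical names used for this sketch.
BASE = "https://localhost/mgmt/tm/sys/performance/all-stats/"
MAX_KEY = "Max(since 2019_11_12T02:47:10Z)"

d = {
    BASE + "TMM%20Memory%20Used": {"nestedStats": {"entries": {
        "Average": {"description": "5"},
        "Current": {"description": "5"},
        MAX_KEY: {"description": "5"},
        "Memory Used": {"description": "TMM Memory Used"},
    }}},
    BASE + "Utilization": {"nestedStats": {"entries": {
        "Average": {"description": "9"},
        "Current": {"description": "10"},
        MAX_KEY: {"description": "53"},
        "System CPU Usage": {"description": "Utilization"},
    }}},
}


def flatten_stats(stats):
    """Return one {stat name: description} dict per outer URL key."""
    return [
        {
            name: entry.get("description")
            for name, entry in outer["nestedStats"]["entries"].items()
        }
        for outer in stats.values()
    ]


flat = flatten_stats(d)
print(flat)
```

    This drops the outer URL keys entirely, which matches the stated goal of a flat list of four-item dictionaries; if the URLs matter, iterate over stats.items() instead and keep the key.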

    Below is my working attempt; copy your example lines into the variable d.

    from glom import glom, Call, T, Iter
    
    d = { ... }  # put your example lines into this dictionary.
    
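    def get_desc(subdict):
        # subdict is a (url, value) tuple from d.items(); keep only each stat's description.
        return {k: v.get('description', None)
                for k, v in subdict[1]['nestedStats']['entries'].items()}
    
    # Turn the outer dict into a list of (key, value) tuples, then map get_desc over it.
    spec = (Call(list, args=(T.items(),)), Iter().map(get_desc).all())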
    
    result = glom(d, spec)
    
    print(result)
    

    This results in:

    [
    {'Average': '5', 'Current': '5', 'Max(since 2019_11_12T02:47:10Z)': '5', 'Memory Used': 'TMM Memory Used'}, 
    {'Average': '9', 'Current': '10', 'Max(since 2019_11_12T02:47:10Z)': '53', 'System CPU Usage': 'Utilization'}
    ]
    

    UPDATE

    The version below gets the same result, but avoids the need for a helper function.

    What the spec does:

    • Call turns the outer dict into a list of (key, value) tuples.
    • Iter loops over the list. For each item:
      1. Take the second element of the tuple.
      2. Get nestedStats.entries (which is another dict).
      3. Call turns this dict into a list of (key, value) tuples.
      4. Turn this list into a list of single-key dicts mapping each key to its description.
      5. Merge the list of dicts into one dict.
    • .all() collects all the results from the iteration into a list.

    I recommend trying this and then removing parts of the spec to see what happens.

    from glom import glom, Call, T, Iter, merge
    
    # d = { ... }  # put your example lines into this dictionary.
    
    spec = (
        Call(list, args=(T.items(),)),
        Iter(
            (
                T[1],
                "nestedStats.entries",
                Call(list, args=(T.items(),)),
                [{T[0]: (T[1], "description")}],
                merge,
            )
        ).all(),
    )
    
    result = glom(d, spec)
    
    print(result)
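
    To make it easier to see what each stage of the spec contributes, the steps listed above can be traced in plain Python, one assignment per stage. This is only an illustrative mirror of the spec, not how glom evaluates it internally; the name trace_steps and the shortened sample dict are invented for this sketch.

```python
def trace_steps(stats):
    """Plain-Python mirror of the spec's stages (hypothetical helper)."""
    outer_pairs = list(stats.items())              # Call(list, args=(T.items(),))
    results = []
    for pair in outer_pairs:                       # Iter(...)
        value = pair[1]                            # T[1]
        entries = value["nestedStats"]["entries"]  # "nestedStats.entries"
        inner_pairs = list(entries.items())        # Call(list, args=(T.items(),))
        # [{T[0]: (T[1], "description")}] -> one single-key dict per stat
        one_key_dicts = [{p[0]: p[1]["description"]} for p in inner_pairs]
        merged = {}
        for small in one_key_dicts:                # merge
            merged.update(small)
        results.append(merged)
    return results                                 # .all()


# Shortened, made-up sample with the same nesting as the question's data.
sample = {
    "url-a": {"nestedStats": {"entries": {
        "Average": {"description": "5"},
        "Current": {"description": "6"},
    }}},
    "url-b": {"nestedStats": {"entries": {
        "Average": {"description": "9"},
    }}},
}

traced = trace_steps(sample)
print(traced)
```

    Adding a print after any of the intermediate assignments gives roughly the same insight as removing the later parts of the spec.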