Tags: python, list, python-2.7, dictionary, recursion

Python: How to RECURSIVELY remove None values from a NESTED data structure (lists and dictionaries)?


Here is some nested data, that includes lists, tuples, and dictionaries:

data1 = ( 501, (None, 999), None, (None), 504 )
data2 = { 1:601, 2:None, None:603, 'four':'sixty' }
data3 = OrderedDict( [(None, 401), (12, 402), (13, None), (14, data2)] )
data = [ [None, 22, tuple([None]), (None,None), None], ( (None, 202), {None:301, 32:302, 33:data1}, data3 ) ]

Goal: Remove any keys or values (from "data") that are None. If a list, tuple, or dictionary contains a value that is itself a list, tuple, or dictionary, then RECURSE to remove NESTED Nones.

Desired output:

[[22, (), ()], ((202,), {32: 302, 33: (501, (999,), 504)}, OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})]))]

Or more readably, here is formatted output:

StripNones(data)= list:
. [22, (), ()]
. tuple:
. . (202,)
. . {32: 302, 33: (501, (999,), 504)}
. . OrderedDict([(12, 402), (14, {'four': 'sixty', 1: 601})])

I will propose a possible answer, as I have not found an existing solution to this. I appreciate any alternatives, or pointers to pre-existing solutions.

EDIT: I forgot to mention that this has to work in Python 2.7. I can't use Python 3 at this time.

That said, Python 3 solutions ARE worth posting for others, so please indicate which Python version you are answering for.


Solution

  • If you can assume that the __init__ methods of the various subclasses have the same signature as the typical base class:

    def remove_none(obj):
      if isinstance(obj, (list, tuple, set)):
        return type(obj)(remove_none(x) for x in obj if x is not None)
      elif isinstance(obj, dict):
        return type(obj)((remove_none(k), remove_none(v))
          for k, v in obj.items() if k is not None and v is not None)
      else:
        return obj
    
    from collections import OrderedDict
    data1 = ( 501, (None, 999), None, (None), 504 )
    data2 = { 1:601, 2:None, None:603, 'four':'sixty' }
    data3 = OrderedDict( [(None, 401), (12, 402), (13, None), (14, data2)] )
    data = [ [None, 22, tuple([None]), (None,None), None], ( (None, 202), {None:301, 32:302, 33:data1}, data3 ) ]
    print(remove_none(data))  # single-argument print(...) behaves the same in Python 2.7 and 3
    

    Note that this won't work with a defaultdict, for example, since defaultdict takes an additional argument to __init__ (the default_factory, which comes first). Making it work with defaultdict would require another special-case elif, placed before the one for regular dicts because defaultdict is a dict subclass.
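A minimal sketch of what that extra branch could look like, assuming the defaultdict (or subclass) accepts `default_factory` as its first positional argument followed by an iterable of key/value pairs, as the standard `collections.defaultdict` does. The rest of the function is unchanged from above; it runs under Python 2.7 and 3.

```python
from collections import defaultdict


def remove_none(obj):
    if isinstance(obj, (list, tuple, set)):
        return type(obj)(remove_none(x) for x in obj if x is not None)
    elif isinstance(obj, defaultdict):
        # Must come before the plain dict branch: defaultdict is a dict
        # subclass, but its first positional argument is default_factory.
        return type(obj)(obj.default_factory,
                         ((remove_none(k), remove_none(v))
                          for k, v in obj.items()
                          if k is not None and v is not None))
    elif isinstance(obj, dict):
        return type(obj)((remove_none(k), remove_none(v))
                         for k, v in obj.items()
                         if k is not None and v is not None)
    else:
        return obj


# Illustrative data, not from the question.
dd = defaultdict(list, {1: [None, 2], None: [3], 4: None})
cleaned = remove_none(dd)
```

The cleaned result keeps the original `default_factory`, so looking up a missing key still creates an empty list.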


    Also note that I've actually constructed new objects; I haven't modified the old ones. It would be possible to modify the old objects in place if you didn't need to support immutable containers such as tuple, which cannot be changed and have to be rebuilt.
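For comparison, here is a rough sketch of an in-place variant. The name `strip_none_in_place` is hypothetical, and the sketch assumes only lists and dicts need to be mutated: tuples (and anything else) are left untouched, including any None values inside them, and dict keys are not recursed into.

```python
def strip_none_in_place(obj):
    """Remove None entries from lists and dicts in place; return obj."""
    if isinstance(obj, list):
        # Slice assignment replaces the contents but keeps the same list object.
        obj[:] = [strip_none_in_place(x) for x in obj if x is not None]
    elif isinstance(obj, dict):
        # Copy the keys first so entries can be deleted while looping.
        for k in list(obj.keys()):
            if k is None or obj[k] is None:
                del obj[k]
            else:
                strip_none_in_place(obj[k])
    # Tuples and all other types are returned unchanged.
    return obj


# Illustrative data, not from the question.
inner = [None, 1]
d = {'a': inner, None: 2, 'b': None, 'c': (None, 3)}
sample = [None, d, 5]
result = strip_none_in_place(sample)
```

Because the containers are mutated rather than rebuilt, any other references to `sample`, `d`, or `inner` see the cleaned contents as well.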