
Python pretty print dictionary of lists, abbreviate long lists


I have a dictionary of lists, and the lists are quite long. How can I print it so that only the first few elements of each list show up? Obviously, I could write a custom function for that, but is there a built-in way or a library that does this? For example, pandas abbreviates large data frames nicely when printing them.

This example better illustrates what I mean:

obj = {'key_1': ['EG8XYD9FVN',
  'S2WARDCVAO',
  'J00YCU55DP',
  'R07BUIF2F7',
  'VGPS1JD0UM',
  'WL3TWSDP8E',
  'LD8QY7DMJ3',
  'J36U3Z9KOQ',
  'KU2FUGYB2U',
  'JF3RQ315BY'],
 'key_2': ['162LO154PM',
  '3ROAV881V2',
  'I4T79LP18J',
  'WBD36EM6QL',
  'DEIODVQU46',
  'KWSJA5WDKQ',
  'WX9SVRFO0G',
  '6UN63WU64G',
  '3Z89U7XM60',
  '167CYON6YN']}

Desired output (something like this):

{'key_1':
    ['EG8XYD9FVN', 'S2WARDCVAO', '...'],
 'key_2':
    ['162LO154PM', '3ROAV881V2', '...']
}

Solution

  • If it weren't for the pretty printing, the reprlib module would be the way to go: safe, elegant, and customizable handling of deeply nested and recursive (self-referencing) data structures is what it was made for.
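To illustrate what reprlib offers on its own, here is a minimal sketch that abbreviates each list to two elements. The shortened `obj` and the name `abbreviator` are only for illustration; `reprlib.Repr` and its `maxlist` attribute are part of the standard library.

```python
import reprlib

# Shortened sample data, for illustration only
obj = {
    'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7'],
    'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL'],
}

abbreviator = reprlib.Repr()
abbreviator.maxlist = 2  # show at most two elements per list

short = abbreviator.repr(obj)
print(short)
```

The result is a single-line string in which every list ends in a bare `...` after its first two elements. There are no line breaks or indentation, which is the missing pretty-printing part.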

    However, it turns out that combining the reprlib and pprint modules isn't trivial; at least I couldn't come up with a clean way that didn't break some of the pretty-printing aspects.

    So instead, here's a solution that just subclasses PrettyPrinter to crop / abbreviate lists as necessary:

    from pprint import PrettyPrinter
    
    
    obj = {
        'key_1': [
            'EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7', 'VGPS1JD0UM',
            'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ', 'KU2FUGYB2U', 'JF3RQ315BY',
        ],
        'key_2': [
            '162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL', 'DEIODVQU46',
            'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
        ],
        # Test case to make sure we didn't break handling of nested structures
        'key_3': [
            '162LO154PM', '3ROAV881V2', [1, 2, ['a', 'b', 'c'], 3, 4, 5, 6, 7],
            'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
        ]
    }
    
    
    class CroppingPrettyPrinter(PrettyPrinter):

        def __init__(self, *args, **kwargs):
            # Maximum number of list elements to show before cropping
            self.maxlist = kwargs.pop('maxlist', 6)
            super().__init__(*args, **kwargs)

        # Note: _format is a private PrettyPrinter method, so this override
        # relies on an implementation detail of the pprint module.
        def _format(self, obj, stream, indent, allowance, context, level):
            if isinstance(obj, list) and len(obj) > self.maxlist:
                # Crop a shallow copy of the list to self.maxlist elements
                # and append an ellipsis marker
                obj = obj[:self.maxlist] + ['...']

            # Let the original implementation handle the rest
            return super()._format(
                obj, stream, indent, allowance, context, level)

    
    p = CroppingPrettyPrinter(maxlist=3)
    p.pprint(obj)
    

    Output with maxlist=3:

    {'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', '...'],
     'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', '...'],
     'key_3': ['162LO154PM',
               '3ROAV881V2',
               [1, 2, ['a', 'b', 'c'], '...'],
               '...']}
    

    Output with maxlist=5 (triggers splitting the lists on separate lines):

    {'key_1': ['EG8XYD9FVN',
               'S2WARDCVAO',
               'J00YCU55DP',
               'R07BUIF2F7',
               'VGPS1JD0UM',
               '...'],
     'key_2': ['162LO154PM',
               '3ROAV881V2',
               'I4T79LP18J',
               'WBD36EM6QL',
               'DEIODVQU46',
               '...'],
     'key_3': ['162LO154PM',
               '3ROAV881V2',
               [1, 2, ['a', 'b', 'c'], 3, 4, '...'],
               'KWSJA5WDKQ',
               'WX9SVRFO0G',
               '...']}
    

    Notes:

    • This creates a cropped shallow copy of every list longer than maxlist. Each copy is small (at most maxlist + 1 elements), but the overhead can add up for data structures that contain a very large number of lists.
    • This only handles lists. Equivalent behavior would have to be implemented for dicts, tuples, sets, frozensets, and so on for this class to be of general use.
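One way to cover more container types without overriding PrettyPrinter internals is to build a cropped copy first and hand that to pprint. The following is a minimal sketch under that assumption, not part of the solution above: `crop`, `maxlen` and `marker` are hypothetical names, it only handles dicts, lists and tuples, and unlike pprint it does not guard against self-referencing structures.

```python
from pprint import pformat


def crop(obj, maxlen=3, marker='...'):
    """Return a cropped copy of obj, recursing into dicts, lists and tuples."""
    if isinstance(obj, dict):
        # Keep every key, but crop the values
        return {key: crop(value, maxlen, marker) for key, value in obj.items()}
    if type(obj) in (list, tuple):
        # Keep the first maxlen items (cropped recursively) plus a marker
        head = [crop(item, maxlen, marker) for item in obj[:maxlen]]
        if len(obj) > maxlen:
            head.append(marker)
        return type(obj)(head)
    # Anything else (strings, numbers, sets, ...) is returned unchanged
    return obj


sample = {'a': [1, 2, 3, 4, 5], 'b': (1, 2), 'c': [[1, 2, 3, 4], 5]}
print(pformat(crop(sample, maxlen=3)))
```

Because the cropping happens before printing, pprint computes its line widths from the already abbreviated data. The trade-off is that the whole structure is traversed and copied up front.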