Search code examples
pythonpython-3.xhexpprint

pprint with hex numbers


I work with a number of json-like dicts. pprint is handy for structuring them. Is there a way to cause all ints in a pprint output to be printed in hex rather than decimal?

For example, rather than:

{66: 'far',
 99: 'Bottles of the beer on the wall',
 '12': 4277009102,
 'boo': 21,
 'pprint': [16, 32, 48, 64, 80, 96, 112, 128]}

I'd rather see:

{0x42: 'far',
 0x63: 'Bottles of the beer on the wall',
 '12': 0xFEEDFACE,
 'boo': 0x15,
 'pprint': [0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80]}

I have tried customizing PrettyPrinter, but to no avail, was I able to cause the above, having PrettyPrinter.format() handle integers only seems to work for some of the integers:

class MyPrettyPrinter(PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        if isinstance(object, int):
            return '0x{:X}'.format(object), True, False
        return super().format(object, context, maxlevels, level)

the above class produces

{0x42: 'far',
 0x63: 'Bottles of the beer on the wall',
 '12': 0xFEEDFACE,
 'boo': 0x15,
 'pprint': [16, 32, 48, 64, 80, 96, 112, 128]}

The list contents are not correctly formatted.


Solution

  • You can alter the output of pprint, but you need to re-implement the saferepr() function, not just subclass the pprint.PrettyPrinter() class.

    What happens is that (an internal version of) the saferepr() function is used for all objects, and that function itself then recursively handles turning objects into representations (using only itself, not the PrettyPrinter() instance), so any customisation has to happen there. Only when the result of saferepr() becomes too large (too wide for the configured width) will the PrettyPrinter class start breaking up container output into components to put on separate lines; the process of calling saferepr() is then repeated for the component elements.

    So PrettyPrinter.format() is only responsible for handling the top-level object, and every recursive object that is a) inside a supported container type (dict, list, tuple, string and the standard library subclasses of these) and b) where the representation of the parent container produced by .format() exceeded the display width.

    To be able to override the implementation, we need to understand how the .format() method and the saferepr() implementation interact, what arguments they take and what they need to return.

    PrettyPrinter.format() is passed additional arguments, context, maxlevels and level:

    • context is used to detect recursion (the default implementation returns the result of _recursion(object) if id(object) in context is true.
    • when maxlevels is set and level >= maxlevels is true, the default implementation returns ... as the contents of a container.

    The method is also supposed to return a tuple of 3 values; the representation string and two flags. You can safely ignore the meaning of those flags, they are actually never used in the current implementation. They are meant to signal if the produced representation is 'readable' (uses Python syntax that can be passed to eval()) or was recursive (the object contained circular references). But the PrettyPrinter.isreadable() and PrettyPrinter.isrecursive() methodsactually completely bypass .format(); these return values seem to be a hold-over from a refactoring that broke the relationship between .format() and those two methods. So just return a representation string and whatever two boolean values you like.

    .format() really just delegates to an internal implementation of saferepr() that then does several things

    • handle recursion detection with context, and depth handling for maxlevels and level
    • recurse over dictionaries, lists and tuples (and their subclasses, as long as their __repr__ method is still the default implementation)
    • for dictionaries, sort the key-value pairs. This is trickier than it appears in Python 3, but this is solved with a custom _safe_tuple sorting key that approximates Python 2's sort everything behaviour. We can re-use this.

    To implement a recursive replacement, I prefer to use @functools.singledispatch() to delegate handling of different types. Ignoring custom __repr__ methods, handling depth issues, recursion, and empty objects, can also be handled by a decorator:

    import pprint
    from pprint import PrettyPrinter
    from functools import singledispatch, wraps
    from typing import get_type_hints
    
    def common_container_checks(f):
        type_ = get_type_hints(f)['object']
        base_impl = type_.__repr__
        empty_repr = repr(type_())   # {}, [], ()
        too_deep_repr = f'{empty_repr[0]}...{empty_repr[-1]}'  # {...}, [...], (...)
        @wraps(f)
        def wrapper(object, context, maxlevels, level):
            if type(object).__repr__ is not base_impl:  # subclassed repr
                return repr(object)
            if not object:                              # empty, short-circuit
                return empty_repr
            if maxlevels and level >= maxlevels:        # exceeding the max depth
                return too_deep_repr
            oid = id(object)
            if oid in context:                          # self-reference
                return pprint._recursion(object)
            context[oid] = 1
            result = f(object, context, maxlevels, level)
            del context[oid]
            return result
        return wrapper
    
    @singledispatch
    def saferepr(object, context, maxlevels, level):
        return repr(object)
    
    @saferepr.register
    def _handle_int(object: int, *args):
        # uppercase hexadecimal representation with 0x prefix
        return f'0x{object:X}'
    
    @saferepr.register
    @common_container_checks
    def _handle_dict(object: dict, context, maxlevels, level):
        level += 1
        contents = [
            f'{saferepr(k, context, maxlevels, level)}: '
            f'{saferepr(v, context, maxlevels, level)}'
            for k, v in sorted(object.items(), key=pprint._safe_tuple)
        ]
        return f'{{{", ".join(contents)}}}'
    
    @saferepr.register
    @common_container_checks
    def _handle_list(object: list, context, maxlevels, level):
        level += 1
        contents = [
            f'{saferepr(v, context, maxlevels, level)}'
            for v in object
        ]
        return f'[{", ".join(contents)}]'
    
    @saferepr.register
    @common_container_checks
    def _handle_tuple(object: tuple, context, maxlevels, level):
        level += 1
        if len(object) == 1:
            return f'({saferepr(object[0], context, maxlevels, level)},)'
        contents = [
            f'{saferepr(v, context, maxlevels, level)}'
            for v in object
        ]
        return f'({", ".join(contents)})'
    
    class HexIntPrettyPrinter(PrettyPrinter):
        def format(self, *args):
            # it doesn't matter what the boolean values are here
            return saferepr(*args), True, False
    

    This hand-full can handle anything the base pprint implementation can, and it will produce hex integers in any supported container. Just create an instance of the HexIntPrettyPrinter() class and call .pprint() on that:

    >>> sample = {66: 'far',
    ...  99: 'Bottles of the beer on the wall',
    ...  '12': 4277009102,
    ...  'boo': 21,
    ...  'pprint': [16, 32, 48, 64, 80, 96, 112, 128]}
    >>> pprinter = HexIntPrettyPrinter()
    >>> pprinter.pprint(sample)
    {0x42: 'far',
     0x63: 'Bottles of the beer on the wall',
     '12': 0xFEEDFACE,
     'boo': 0x15,
     'pprint': [0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80]}
    

    Side note: if you are using Python 3.6 or older you'll have to replace the @saferepr.registration lines with @saferepr.registration(<type>) calls, where <type> duplicates the type annotation on the first argument of the registered function.