
Munging non-printable characters to dots using string.translate()


I've done this before, and it's a surprisingly ugly bit of code for such a seemingly simple task.

The goal is to translate any non-printable character into a . (dot). For my purposes "printable" excludes the last five characters of string.printable (tab, newline, carriage return, vertical tab and form feed); the plain space is kept. This is for printing things like the old MS-DOS debug "hex dump" format, or anything similar, where additional whitespace would mangle the intended dump layout.
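To make the slice concrete, here is a small sketch of what string.printable[:-5] keeps and drops. The names kept and dropped are only illustrative and are not part of the original code.

```python
import string

# string.printable ends with six whitespace characters: ' \t\n\r\x0b\x0c'.
# Slicing off the last five keeps the space and drops the other five.
kept = string.printable[:-5]
dropped = string.printable[-5:]

print(repr(dropped))  # the five whitespace control characters
print(len(kept))      # digits + letters + punctuation + space
```

The last character of kept is the space, which is why ordinary spaces survive in the dump while tabs and newlines become dots.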

I know I can use string.translate() and, to use that, I need a translation table. So I use string.maketrans() for that. Here's the best I could come up with:

import string  # Python 2: maketrans() and translate() live in the string module

filter = string.maketrans(
    string.translate(string.maketrans('', ''),
                     string.maketrans('', ''), string.printable[:-5]),
    '.' * len(string.translate(string.maketrans('', ''),
                               string.maketrans('', ''), string.printable[:-5])))

... which is an unreadable mess (though it does work).

From there you can use something like:

for each_line in sometext:
    print string.translate(each_line, filter)

... and be happy. (So long as you don't look under the hood).

Now it is more readable if I break that horrid expression into separate statements:

ascii = string.maketrans('', '')   # Identity table: all 256 byte values, not just ASCII
nonprintable = string.translate(ascii, ascii, string.printable[:-5])  # Third argument (deletechars) strips the printable characters
filter = string.maketrans(nonprintable, '.' * len(nonprintable))  # Map every remaining byte to '.'

And it's tempting to do that just for legibility.
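For readers on Python 3, where string.maketrans() and string.translate() no longer exist, the same three steps can be sketched roughly as follows. This assumes the data is handled as bytes; the names all_bytes and dot_table are illustrative, with dot_table chosen to avoid shadowing the built-in filter.

```python
import string

# Assumption: Python 3, with the input handled as bytes rather than str.
printable = string.printable[:-5].encode('ascii')

# Step 1: every byte value from 0 to 255.
all_bytes = bytes(range(256))

# Step 2: delete the printable bytes, leaving only the non-printable ones.
nonprintable = all_bytes.translate(None, printable)

# Step 3: build a table that maps each non-printable byte to b'.'.
dot_table = bytes.maketrans(nonprintable, b'.' * len(nonprintable))

print(b'Hi\tthere\x00'.translate(dot_table))
```

The structure mirrors the broken-out Python 2 version: an identity set, a deletion pass, then a maketrans call.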

However, I keep thinking there has to be a more elegant way to express this!


Solution

  • Here's another approach using a list comprehension:

    filter = ''.join([['.', chr(x)][chr(x) in string.printable[:-5]] for x in xrange(256)])
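The ['.', chr(x)][...] indexing trick works because True indexes element 1 and False indexes element 0. A minimal sketch of the same table written with a conditional expression, together with a hypothetical hexdump_line helper showing how it might be used for one row of a dump, could look like this. The helper, its layout and the names PRINTABLE and DOT_TABLE are assumptions for illustration, not part of the answer; the code is written to run with text strings on either Python 2 or Python 3.

```python
import string

# Characters shown as-is: string.printable minus its last five
# whitespace characters; the plain space is kept.
PRINTABLE = string.printable[:-5]

# 256-entry table: printable characters map to themselves, the rest to '.'.
DOT_TABLE = ''.join(chr(x) if chr(x) in PRINTABLE else '.' for x in range(256))

def hexdump_line(offset, chunk, width=16):
    """Format one row: offset, hex codes, then the dotted text column."""
    hex_part = ' '.join('%02x' % ord(c) for c in chunk)
    text_part = chunk.translate(DOT_TABLE)
    # Pad the hex column so the text column lines up on short rows.
    return '%08x  %-*s  %s' % (offset, width * 3 - 1, hex_part, text_part)

print(hexdump_line(0, 'Hi\tthere\n'))
```

Because the table is an ordinary 256-character string, it can be passed directly to translate() without going through maketrans() at all.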