What's an OSX command line tool or simple app to remove extended ASCII characters from source text files?


I've been cutting and pasting some code snippets from an Amazon Kindle eBook into a text editor (JetBrains PhpStorm), and apparently each time it comes with some extended (>127) ASCII characters.

Is there a simple command-line sed/awk/tr command, or a simple OSX app, to strip them out?


Solution

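  • Thanks to this blog post, here is a solution that worked well for me. It deletes every byte except tab (\11), newline (\12), carriage return (\15), and the printable ASCII range from space to tilde (\40-\176):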

    tr -cd '\11\12\15\40-\176' < infile > outfile
    

    Note that if you get this error: tr: Illegal byte sequence, this can be solved by setting LANG=C via:

    export LANG=C
    

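    (Setting LANG=C helps because, in a UTF-8 locale, tr on OSX decodes its input as multibyte characters and fails on bytes that do not form a valid sequence; in the C locale it treats every byte as a single character. If LC_ALL or LC_CTYPE is already set in your environment, it takes precedence over LANG, so set LC_ALL=C instead.)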
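
If you would rather not change the locale for the whole shell session, the variable can be set for the single tr invocation. The following is a minimal sketch, not part of the original answer: the scratch directory, the sample line, and the `leftover` check are all made up for illustration, and LC_ALL is used instead of LANG on the assumption that other locale variables may be set.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)

# Hypothetical sample: a line of code wrapped in UTF-8 curly quotes
# (bytes written in octal so this script itself stays pure ASCII).
printf 'echo \342\200\234hi\342\200\235;\tdone\n' > "$tmpdir/infile"

# Set the locale for this one command only; LC_ALL overrides LANG and LC_CTYPE.
LC_ALL=C tr -cd '\11\12\15\40-\176' < "$tmpdir/infile" > "$tmpdir/outfile"

# The curly quotes are gone; the tab and the newline survive.
cat "$tmpdir/outfile"

# Sanity check: count the bytes in outfile that fall outside the kept set.
leftover=$(LC_ALL=C tr -d '\11\12\15\40-\176' < "$tmpdir/outfile" | wc -c | tr -d ' ')
echo "bytes outside the kept set: $leftover"
```

Note that the characters are deleted, not replaced. If the pasted code used curly quotes around string literals, those quotes disappear entirely and the straight quotes have to be retyped by hand.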