
Linux command equivalent to tr for replacing a string without editing the file


tr replaces single characters, but I am looking for a command-line solution that replaces \001 with | in a log file to make each line more readable. I don't want to use a command that modifies the log file itself.

input: FIX.4.2\0019=64\00135=0\00134=363

output intended: FIX.4.2|9=64|35=0|34=363


Solution

  • Using sed is the obvious way to do the job. It doesn't overwrite the input file unless you tell it to do so.

    I suppose you can use awk; it will be (a little) harder to write than using sed. You can use Perl or Python too. But sed has the 'replace string without editing file' written into its job title — the overwrite stuff is a recent addition to non-standard versions of sed (recent meaning since about 1990).

    This command should do the job — and it is compact notation too!

    sed 's/\\001/|/g'       
    

    "but my understanding is it replaces the existing log file."

    No, sed does not replace the existing log file. By default, sed edits the stream of input (its name is an abbreviation of 'stream editor') and writes to standard output; you have to tell it to overwrite the input. Both GNU sed and BSD (macOS) sed have the -i flag to allow you to overwrite the input file(s), but the semantics differ: GNU sed takes an optional backup suffix attached directly to the flag (-i or -i.bak), whereas BSD sed requires a suffix argument, which may be empty (-i '').
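
The stream model described above can be mimicked in a few lines of Python. This is only a sketch of the idea, not how sed is implemented: the function name `stream_replace` and its parameters are made up for illustration, and it assumes the log contains the four literal characters backslash, 0, 0, 1, as shown in the question.

```python
def stream_replace(src, dst, old="\\001", new="|"):
    """Copy lines from src to dst, replacing old with new.

    src is only read from, never written to, so the original
    data stays exactly as it was (like sed without -i).
    """
    for line in src:
        dst.write(line.replace(old, new))
```

Calling it with `sys.stdin` and `sys.stdout` would make it behave as a filter, and passing a file opened in read mode as `src` leaves that file untouched.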

    The awk equivalent is longer than the sed:

    awk '{gsub(/\\001/, "|"); print}'
    

    which (if you prefer brevity to clarity) could be reduced to:

    awk '{gsub(/\\001/, "|")} 1'
    

    The Perl is pretty compact too:

    perl -pe 's/\\001/|/g'
    

    For completeness, a Python 2.7 script for this job could be:

    #!/usr/bin/env python
    
    import re
    import fileinput
    
    match = re.compile(r'\\001')
    
    for line in fileinput.input():
        line = match.sub('|', line)
        print line,
    

    (See fileinput, re and print for syntax details. The trailing comma does matter: it stops print from adding a second newline after the one already at the end of each line.)

    The equivalent in Python 3.x is very similar:

    #!/usr/bin/env python3
    
    import re
    import fileinput
    
    match = re.compile(r'\\001')
    
    for line in fileinput.input():
        line = match.sub('|', line)
        print(line, end='')
    

    (See fileinput, re and print for more information. The end='' argument keeps print() from adding a second newline to each line. Note that print() versus print is one of the biggest differences between Python 2 and Python 3.)

    There may be better ways to write the Python code. It seems to work, though.
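
One possible simplification, offered as a sketch rather than a definitive version: because the pattern is a fixed string, `str.replace` can stand in for the regular expression. The helper name `to_pipes` and the two constants are hypothetical. The second replacement is an added assumption that goes beyond the question: it covers logs that hold the real SOH control byte (0x01), the FIX field delimiter, instead of the four visible characters backslash, 0, 0, 1.

```python
LITERAL_SOH = "\\001"  # backslash, 0, 0, 1 as four visible characters
RAW_SOH = "\x01"       # the actual SOH control character (assumption: may appear in real logs)


def to_pipes(line):
    """Return line with either form of the delimiter replaced by '|'."""
    return line.replace(LITERAL_SOH, "|").replace(RAW_SOH, "|")
```

It can be dropped into the fileinput loop in place of the regular-expression substitution, for example `print(to_pipes(line), end='')`.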