
Confusion about the ASCII linefeed byte in an awk + xxd bash command


I am confused about a 0a byte (i.e. the ASCII NL byte) that shows up in the output of some bash commands. Consider the following:

$ echo | sha1sum $1 | awk '{print $1;}' | xxd -r -ps > test.bin
$ echo | sha1sum $1 | awk '{print $1;}' > test.hex
$ xxd test.bin
00000000: adc8 3b19 e793 491b 1c6e a0fd 8b46 cd9f  ..;...I..n...F..
00000010: 32e5 92fc                                2...
$ xxd test.hex
00000000: 6164 6338 3362 3139 6537 3933 3439 3162  adc83b19e793491b
00000010: 3163 3665 6130 6664 3862 3436 6364 3966  1c6ea0fd8b46cd9f
00000020: 3332 6535 3932 6663 0a                   32e592fc.

What is responsible for the 0a byte being present in test.hex but not in test.bin?

Note 1: this question came up while I was using the solution given here:

Dump a `sha` checksum output to disk in binary format instead of plaintext hex in bash

Note 2: I know how to suppress the 0a byte, so that is not the question. I am just curious why it is present in one case but not the other:

$ echo | sha1sum $1 | awk '{print $1;}' | head -c-1 > test_2.hex
$ xxd test_2.hex
00000000: 6164 6338 3362 3139 6537 3933 3439 3162  adc83b19e793491b
00000010: 3163 3665 6130 6664 3862 3436 6364 3966  1c6ea0fd8b46cd9f
00000020: 3332 6535 3932 6663                      32e592fc

Solution

  • The 0a that you are seeing comes from awk. print appends the output record separator (ORS) after each record, and ORS defaults to \n. You can remove it by setting ORS to an empty string (e.g. with a BEGIN {ORS=""}).

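    You lose it when you pipe through xxd -r -ps because of the -r option, which converts a hex dump back into binary. From the man page, on combining -r with -p: "Additional Whitespace and line-breaks are allowed anywhere." The trailing newline is therefore skipped as whitespace rather than written to test.bin.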
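
Both effects can be checked by counting bytes. The following is a minimal sketch, not a definitive test: it hardcodes the digest shown in the question's xxd output instead of calling sha1sum, and the variable names (hex, with_nl, without_nl, binary_len) are made up for illustration.

```shell
# Digest copied from the question's output (SHA-1 of a single newline).
hex=adc83b19e793491b1c6ea0fd8b46cd9f32e592fc

# Mimic a sha1sum output line ("<digest>  -") and let awk print field 1.
# print appends ORS (default "\n"): 40 hex characters + 1 newline.
with_nl=$(printf '%s  -\n' "$hex" | awk '{print $1}' | wc -c | tr -d ' ')

# With ORS set to the empty string, nothing is appended after the record.
without_nl=$(printf '%s  -\n' "$hex" | awk 'BEGIN {ORS=""} {print $1}' | wc -c | tr -d ' ')

# xxd -r -p turns each pair of hex digits into one byte and skips the
# trailing newline, so only the 20 digest bytes come out.
binary_len=$(printf '%s\n' "$hex" | xxd -r -p | wc -c | tr -d ' ')

echo "$with_nl $without_nl $binary_len"
```

The three counts correspond to test.hex (with the newline), test_2.hex (without it), and test.bin (decoded bytes only).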