
read -N and IFS


According to the description of "read -N" in the manual page:

-N nchars return only after reading exactly NCHARS characters, unless EOF is encountered or read times out, ignoring any delimiter

However, in the output of the following command:

$ echo 'a b' | while read -N1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>><<<
>>>b<<<
>>><<<

both the space and the newline have been turned into empty strings, while in this command:

$ echo 'a b' | while IFS= read -N1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>> <<<
>>>b<<<
>>>
<<<

the space and the newline are stored correctly in the variable.

So it seems that "read" (or "while") still applies some processing to delimiters, which I do not understand.

We can compare these results with those from "read -n", which the manual describes as:

-n nchars return after reading NCHARS characters rather than waiting for a newline, but honor a delimiter if fewer than NCHARS characters are read before the delimiter

$ echo 'a b' | while read -n1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>><<<
>>>b<<<
>>><<<

$ echo 'a b' | while IFS= read -n1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>> <<<
>>>b<<<
>>><<<

Solution

  • Using hexdump lets us see exactly which characters make up the output, so it may help to change your commands slightly:

    (1) With normal IFS and using -N option

    $ (echo 'a b' | while read -N1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
    00000000  61 3c 3c 62 3c 3c                                 |a<<b<<|
    00000006 
    

    In this first case, the read builtin returns the empty string for both the space and 0x0a, because both characters are in the default IFS, and IFS characters are stripped from the value assigned to the variable, for the reason explained in cdarke's answer.

    (2) With empty IFS and -N option

    $ (IFS=""; echo 'a b' | while read -N1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
    00000000  61 3c 20 3c 62 3c 0a 3c                              |a< <b<.<|
    00000008
    

    In this case, the read builtin returns each of the four characters that the echo command outputs, and both the space and 0x0a appear in the output, because with an empty IFS nothing is stripped before the character is assigned to the variable c.
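The stripping behind cases (1) and (2) can be reproduced without -N at all. The following is a minimal sketch, assuming only a POSIX-style shell; the helper names strip_demo and keep_demo are made up for this illustration. It shows that a plain read removes leading and trailing IFS whitespace before assigning to a single variable, and that an empty IFS turns that off.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Read one line with the default IFS (space, tab, newline) and show the result.
strip_demo() {
    printf '%s\n' "$1" | { read -r v; printf '[%s]\n' "$v"; }
}

# Read one line with an empty IFS, so no whitespace is stripped.
keep_demo() {
    printf '%s\n' "$1" | { IFS= read -r v; printf '[%s]\n' "$v"; }
}

strip_demo '  x  '   # prints [x]
strip_demo ' '       # prints []   (a lone space is pure IFS whitespace)
keep_demo  '  x  '   # prints [  x  ]
keep_demo  ' '       # prints [ ]
```

A single space read under the default IFS leaves nothing to store, which matches the empty strings in case (1); with IFS emptied for the read, the space survives, as in case (2).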

    (3) With normal IFS and -n option

    $ (echo 'a b' | while read -n1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
    00000000  61 3c 3c 62 3c 3c                                 |a<<b<<|
    00000006 
    

    This gives exactly the same output as case (1), although the semantics differ slightly: the read builtin returns the empty string for both the space and 0x0a, because (i) both characters are in the default IFS and (ii) with the -n option the newline still acts as the line delimiter, so read stops there and never passes the trailing 0x0a on.

    (4) With empty IFS and -n option

    $ (IFS=""; echo 'a b' | while read -n1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
    00000000  61 3c 20 3c 62 3c 3c                              |a< <b<<|
    00000007
    

    Here we observe a difference between the -n and -N options to read: with the -n option, the newline still acts as the delimiter, so read stops at it and drops it. Removing 0x0a from IFS therefore cannot cause it to be stored in the variable c.
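Putting the four cases together, a sketch of reading input one character at a time without losing whitespace might look like the following. It assumes bash 4.1 or later (where read -N is available); dump_chars is a hypothetical helper name, and printf '%q' is used only to make the space and newline visible. The second half contrasts -n and -N on the same input.

```shell
#!/usr/bin/env bash
# Sketch, assuming bash 4.1+; dump_chars is a made-up helper name.

# Read stdin one character at a time and print each one shell-quoted.
# IFS= prevents whitespace stripping, -r keeps backslashes literal,
# -N1 reads exactly one character and ignores the line delimiter.
dump_chars() {
    local c
    while IFS= read -r -N1 c; do
        printf '%q\n' "$c"
    done
}

printf 'a b\n' | dump_chars   # four lines: a, "\ ", b, $'\n'

# -n stops early at the newline delimiter; -N keeps reading until it
# has exactly 5 characters, newline included.
IFS= read -r -n5 short < <(printf 'ab\ncd\n')
IFS= read -r -N5 exact < <(printf 'ab\ncd\n')
printf '%q\n' "$short"   # ab
printf '%q\n' "$exact"   # $'ab\ncd'
```

Setting IFS= on the read itself keeps the change scoped to that one command, so the rest of the script still sees the default IFS. More recent versions of the bash manual also state that the result of read -N is not split on the characters in IFS, so case (1) may well print the space and newline on a newer bash; writing IFS= explicitly gives the same result either way.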