How exactly do you use the -z option for the "cut" command?


In the documentation it says that the -z option changes the default line delimiter, which is a newline character, to ASCII NUL. How exactly do you use that option? Take a look at this example please (the file is tab-separated):

$ cat data.tsv 
John Doe    28  New York
Bob Smith   37  Boston
Jane Doe    31  Boston
$
$ cut -f1,3 data.tsv 
John Doe    New York
Bob Smith   Boston
Jane Doe    Boston
$
$ # This is the output I get:
$ cut -f1,3 -z data.tsv 
John Doe    New York
Bob Smith$

I can't make sense of that output. What exactly is going on there?


Solution

  • -z is for when your input is NUL-delimited instead of \n-delimited. It changes how cut splits the input into records, and also makes it write records in the same format, terminated by \0 instead of \n.

    Your input data currently looks like this:

    John Doe    28  New York\nBob Smith   37  Boston\nJane Doe    31  Boston\n
    

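    Since it contains no NULs, cut -z sees the whole file as a single record, and the newlines are just ordinary characters inside the tab-separated fields. Field 3 of that one record is "New York\nBob Smith", which is why your output stops there, followed by a \0 rather than a newline.

    If it had NULs instead of newlines, you'd use -z: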

    John Doe    28  New York\0Bob Smith   37  Boston\0Jane Doe    31  Boston\0
    

    When would this be useful? It's not for files as much as it is for pipelines. For instance, you could use find -print0 to output file names with \0 after each name. That lets you process file names with embedded newlines safely. Such names are highly unusual, but newlines are legal characters in file names, while \0 never is.

    That is where cut -z would be useful: it can split each NUL-terminated name into fields without being confused by newlines inside the name.

    Similar flags in other commands include xargs -0, read -d '', and cpio -0.
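
To make this concrete, here is a minimal sketch, assuming GNU coreutils (cut -z is a GNU extension) and GNU or BSD find for -print0. The scratch directory and the file names are hypothetical and exist only for illustration. It reproduces the puzzling output from the question, shows the same data cut correctly once the records are NUL-terminated, and then runs cut -z on find -print0 output where one name contains a newline.

```shell
# Hypothetical scratch area; data.tsv mirrors the tab-separated file from the question.
tmpdir=$(mktemp -d)
printf 'John Doe\t28\tNew York\nBob Smith\t37\tBoston\nJane Doe\t31\tBoston\n' > "$tmpdir/data.tsv"

# 1) No NULs in the input: cut -z treats the whole file as ONE record,
#    so field 3 is "New York<newline>Bob Smith".
#    tr turns the trailing NUL into a newline so the result is printable.
single=$(cut -z -f1,3 "$tmpdir/data.tsv" | tr '\0' '\n')
printf '%s\n' "$single"

# 2) Convert the records to NUL-terminated form first, then cut -z
#    works per record; convert back to newlines for display.
result=$(tr '\n' '\0' < "$tmpdir/data.tsv" | cut -z -f1,3 | tr '\0' '\n')
printf '%s\n' "$result"

# 3) Pipeline use: one file name contains an embedded newline.
mkdir "$tmpdir/files"
odd_name=$(printf 'two\nlines.txt')
touch "$tmpdir/files/plain.txt" "$tmpdir/files/$odd_name"

# find prints "./name" followed by NUL; cut -z -d/ -f2 drops the "./" prefix
# per record. Counting the NUL terminators gives the number of records.
count=$(cd "$tmpdir/files" && find . -type f -print0 | cut -z -d/ -f2 | tr -cd '\0' | wc -c)
printf 'records: %s\n' "$count"

rm -rf "$tmpdir"
```

With newline-delimited processing, step 3 would see three lines for two files. With -z, the name containing a newline stays a single record.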