Shell: input for pdfinfo


On a fish shell I write

ls -1t|head -1 |xargs pdfinfo

which should give me the most recently modified file (which is a PDF) and then print the pdfinfo output for that file. But it fails with the error

Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

I get the same results with bash. Any ideas what I need to adapt to get the command running?


Solution

  • This issue is most likely down to xargs mangling its input.

    By default, it performs bash-like word splitting on its input, including the interpretation of quotes.

    So for instance

    echo "foo 'bar baz'" | xargs printf '<%s>\n'
    

    will print <foo> and <bar baz>, because xargs ends up executing

    printf '<%s>\n' foo 'bar baz'
    

    That means if you have a filename with a space or quote, it will execute the wrong thing. Assuming "ls -1t | head -n 1", which prints the newest file, comes up with "foo bar.pdf", then it would execute pdfinfo like

    pdfinfo foo bar.pdf
    

    handing it two separate filenames, "foo" and "bar.pdf".
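
    You can see the split without involving pdfinfo at all. This is only a small sketch: "foo bar.pdf" is a made-up name (no such file has to exist), and printf stands in for pdfinfo so that each argument xargs passes along is printed in its own pair of angle brackets.

    ```shell
    # Hypothetical filename with a space; printf stands in for pdfinfo.
    printf '%s\n' 'foo bar.pdf' | xargs printf '<%s>\n'
    # <foo>
    # <bar.pdf>
    ```

    Two lines of output mean two arguments, which is exactly what pdfinfo receives in the failing command.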

    In this case, the simplest solution is to just skip xargs entirely:

    pdfinfo (ls -1t | head -n 1)
    

    You can also skip the head by using fish's slicing:

    pdfinfo (ls -1t)[1]
    

    which takes only the first line of the output of ls -1t. Note that fish still splits command substitutions on newlines (unlike bash, it won't split on spaces or tabs), so if you want to handle filenames that contain newlines (which are perfectly legal on Unix!) you have to do it differently - the ls output is ambiguous if filenames can include newlines. Possibilities include find with -exec.
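
    As a rough sketch of a newline-safe variant: this is not the -exec form itself, but the same idea of letting find produce the names instead of ls. It assumes the GNU versions of find, sort, head, cut and xargs (-printf, -z and -r are GNU extensions), and it assumes that the files you care about end in .pdf and sit in the current directory.

    ```shell
    # Assumption: GNU find/sort/head/cut/xargs are installed.
    # find emits "<mtime> <path>" records terminated by NUL instead of newline,
    # sort puts the newest record first, head keeps only that record,
    # cut drops the timestamp, and xargs -0 hands the path to pdfinfo unsplit.
    # -r makes xargs do nothing if no PDF was found.
    find . -maxdepth 1 -type f -name '*.pdf' -printf '%T@ %p\0' \
        | sort -z -n -r \
        | head -z -n 1 \
        | cut -z -d ' ' -f 2- \
        | xargs -0 -r pdfinfo
    ```

    Since every stage works on NUL-terminated records, spaces, quotes and even newlines in the filename pass through untouched.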

    If you have to use xargs, you can use the -0 option that is available in some xargs implementations, together with fish's string join0 to add a NUL-byte:

    ls -1t | head -n 1 | string join0 | xargs -0 pdfinfo
    

    This tells xargs to skip its word splitting and instead read NUL-terminated arguments. That is unambiguous because UNIX filenames, and command-line arguments in general, can't contain NUL: the operating system passes them around as NUL-terminated strings.
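
    To check the effect of -0 in isolation, here is a minimal sketch. "foo bar.pdf" is again a made-up name, printf stands in for pdfinfo, and it assumes an xargs that supports -0 (GNU and BSD xargs do).

    ```shell
    # The first printf writes the name followed by a NUL byte;
    # xargs -0 then passes it on as one single argument.
    printf '%s\0' 'foo bar.pdf' | xargs -0 printf '<%s>\n'
    # <foo bar.pdf>
    ```

    A single line of output shows that the name arrived in one piece.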