Concatenate URL in bash with xargs


I'm trying to build URLs from output with one entry per line. I have tried this:

<stuff> | xargs -L1 -I {} echo "${url}&page=queryresults&j="{}

However, for some long lines (they contain no spaces but can contain dashes and underscores), I get a literal '{}' where I would expect the string generated by <stuff>. If I add a space between the final double quote and {}, it works, but then I have an extra space I don't want:

<stuff> | xargs -L1 -I {} echo "${url}&page=queryresults&j=" {}

Similarly, if I remove the &page=queryresults bit, it works. I have no idea why.

What am I missing here?

It works for this:

blajob_123abcd_1234567890x

But not for this:

SomeTask_some_long_project_name_with_cumulative_metrics_YYYYMMDD_2018_08_15T00_12345a67b8-scheduled-run-bla-bla-bla-yadda


Solution

  • There's no need for xargs here at all, and you're better off without it. The following is guaranteed to work correctly on all POSIX-compliant shells:

    while IFS= read -r line; do
      printf '%s&page=queryresults&j=%s\n' "$url" "$line"
    done
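To make the loop above concrete, here is a minimal sketch that wraps it in a function and feeds it sample input. The base URL and the function name `build_urls` are assumptions for illustration only; in practice the input would come from `<stuff>`.

```shell
# Hypothetical base URL, for illustration only.
url='https://example.com/view?app=demo'

build_urls() {
  # Read one identifier per line from stdin.
  # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
  while IFS= read -r line; do
    # The format string is constant; the URL and the line are passed as data.
    printf '%s&page=queryresults&j=%s\n' "$url" "$line"
  done
}

# Sample input standing in for the output of <stuff>.
printf '%s\n' \
  'blajob_123abcd_1234567890x' \
  'SomeTask_some_long_project_name_with_cumulative_metrics_YYYYMMDD_2018_08_15T00_12345a67b8-scheduled-run-bla-bla-bla-yadda' |
  build_urls
```

Because no argument is constructed by xargs, the length of each line does not matter here.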
    

    Why not stick with xargs -I {} echo "$url&...&j={}"?

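    • xargs -I's specification includes the following text: "Constructed arguments cannot grow larger than 255 bytes." If the constructed argument (your URL plus the substituted line) exceeds that limit, the substitution can fail or be truncated; on some implementations the placeholder is left as a literal {}, which appears to match the details described.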
    • xargs -I is only included in XSI extensions to POSIX; platforms which do not claim to implement those extensions are not required to provide it, or if they do, to have it behave in any particular way.
    • If you used xargs printf "$url..." (substituting the URL into the format string rather than through a placeholder), you would have bugs if your URL contained % signs.
    • If you used echo, you would have unspecified behavior if your URL contained literal backslashes (see the APPLICATION USAGE section of the POSIX specification for echo).
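The format-string hazard in the list above can be illustrated with a short sketch. The URL and the helper name `safe_url` are hypothetical; the point is only that a percent-encoded URL must be passed as data, not embedded in the format.

```shell
# Hypothetical URL containing a percent-encoded space (%20).
url='https://example.com/q?name=a%20b'

# Unsafe (left commented out): the URL becomes part of the format string,
# so "%20b" is parsed as a %b conversion with a field width of 20 and
# consumes the argument intended for the trailing %s.
# printf "$url&page=queryresults&j=%s\n" job1

# Safer: keep the format string constant and pass the URL as an argument.
safe_url() {
  printf '%s&page=queryresults&j=%s\n' "$url" "$1"
}

safe_url job1
```

With the constant format string, the `%20` in the URL is printed verbatim.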

    That said, if you really want to use xargs, consider (on GNU systems):

    xargs -d $'\n' printf "${url//%/%%}"'&page=queryresults&j=%s\n'
    

    ...or, on a platform with BSD tools:

    tr '\n' '\0' | xargs -0 printf "${url//%/%%}"'&page=queryresults&j=%s\n'
    

    Note:

    • Because we aren't using -I, the 255-byte limit doesn't apply at all. (Similarly, xargs can pass as many arguments to each invocation of /usr/bin/printf as will fit on its command line, rather than being limited to one argument per call; printf reuses the format string for each argument.)
    • In the URL, we replace any literal % with %% so that printf does not treat it as the start of a conversion specification. Backslashes in a format string are also interpreted by printf, but a correctly encoded URL shouldn't contain any (they should already have been replaced with %5C).
    • The GNU extension -d is being used to specify that only newlines should be treated as delimiters between words to be treated as arguments; this also prevents literal quotes from being parsed and consumed by xargs itself. On BSD platforms, converting newlines to NULs and using -0 serves as a substitute.
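The notes above can be pulled together in one sketch that uses the `tr` plus `xargs -0` form, which both GNU and BSD xargs accept. The URL and the function name `build_urls_xargs` are assumptions for illustration, and `sed` is swapped in for the bash-only `${url//%/%%}` expansion so the sketch also runs under a plain POSIX sh.

```shell
# Hypothetical base URL containing a percent-encoded character.
url='https://example.com/q?name=a%20b'

build_urls_xargs() {
  # Double every % so printf prints the URL text literally
  # when it appears inside the format string.
  fmt=$(printf '%s' "$url" | sed 's/%/%%/g')
  # Convert newlines to NULs so xargs does no quote or whitespace parsing;
  # printf then reuses the format once per argument it receives.
  tr '\n' '\0' | xargs -0 printf "$fmt"'&page=queryresults&j=%s\n'
}

# Sample input with a space and a quote, which plain xargs would mangle.
printf '%s\n' 'job one' "it's-job_2" | build_urls_xargs
```

One caveat, as far as I know: with empty input, GNU xargs still runs the command once unless `-r` is given, so an empty pipeline may print a single URL with an empty `j=` value.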