
How can I get past the file name length limit?


I am trying to parse 50+ files in a shell script in a single call, like the following:

for i in {0..49}
do
    _file_list="$_file_list $_srcdir01/${_date_a[$i]}.gz"
done
eval zcat "$_file_list" | awk '{sum += 1} END {print sum;}'

But when I do this, I get the 'file name too long' error with zcat.

The reason I am trying to do this in a single call is that, to my knowledge, awk cannot retain information from a previous call, and I have to process the entire list as a whole (e.g. to find a unique word across all of the files).

I also don't want to combine the files, because each of them is already large.

Is there a clever way to solve this, or do I need to split the call and write out intermediate results along the way?


Solution

  • You can pipe directly from a loop:

    for date in "${_date_a[@]}"
    do
      zcat "$_srcdir01/$date.gz"
    done | awk '{sum += 1} END {print sum;}'
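
    Another option, sketched here under assumptions, is to collect the paths in a bash array so that each path stays a separate argument and no eval is needed. The names demo_dir, dates, and files are hypothetical stand-ins for $_srcdir01 and ${_date_a[@]}, and the sample data is made up for the demo:

    ```shell
    # Hypothetical demo data: three small gzip files in a temp directory.
    demo_dir=$(mktemp -d)
    dates=(day1 day2 day3)

    files=()
    for d in "${dates[@]}"; do
      printf 'apple\nbanana\n' | gzip > "$demo_dir/$d.gz"   # two lines per file
      files+=("$demo_dir/$d.gz")                            # one array element per path
    done

    # One zcat call feeding one awk process: awk sees every line of every file.
    total=$(zcat "${files[@]}" | awk '{sum += 1} END {print sum}')

    # awk's state persists across the whole stream, so whole-list questions
    # such as counting distinct lines work as well.
    unique=$(zcat "${files[@]}" | awk '!seen[$0]++ {n++} END {print n}')

    echo "lines: $total, distinct: $unique"
    rm -rf "$demo_dir"
    ```

    Like the loop, this streams the decompressed data straight into awk, so nothing is combined on disk.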
    

    In any case, the code in your question shouldn't give that error as posted.

    Since your example is not complete or self-contained, I added some code that creates test data files:

    $ cat testscript
    _srcdir01="./././././././././././././././././././"
    _date_a=(foo{0001..0050})
    for file in "${_date_a[@]}"
    do
      echo "hello world" | gzip > "$file.gz"
    done
    
    for i in {0..49}
    do
        _file_list="$_file_list $_srcdir01/${_date_a[$i]}.gz"
    done
    eval zcat "$_file_list" | awk '{sum += 1} END {print sum;}'
    

    Running it generates a bunch of test data and correctly sums the number of lines:

    $ bash testscript
    50
    

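    I can reproduce your issue if I, for example, remove the eval. The quoted "$_file_list" is then passed to zcat as a single argument, so the whole list is treated as one overly long file name: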

    $ bash testscript
    (...)/foo0045.gz ./././././././././././././././././././/foo0046.gz ././././././.
    /././././././././././././/foo0047.gz ./././././././././././././././././././/foo0
    048.gz ./././././././././././././././././././/foo0049.gz ./././././././././././.
    /./././././././/foo0050.gz: file name too long
    

    So please double check that the code you post is the code you run, and not one of several other attempts you made while trying to solve it.
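
    The difference between the two runs comes down to how many arguments the command receives. A minimal sketch that counts them, using a hypothetical three-path list instead of the real files:

    ```shell
    # Hypothetical list, built as one space-separated string like $_file_list.
    list=" a/one.gz a/two.gz a/three.gz"

    # Quoted, no eval: the whole string is a single argument, so a command
    # would treat it as one (very long) file name.
    set -- "$list"
    quoted_count=$#

    # With eval the string is re-parsed by the shell and split into words.
    eval set -- "$list"
    eval_count=$#

    echo "quoted: $quoted_count argument(s), eval: $eval_count argument(s)"
    ```

    Note that eval re-parses the string as shell code, so paths containing spaces or shell metacharacters would not survive it intact.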