I am trying to parse 50+ files in a shell script in a single call like the following,
for i in {0..49}
do
_file_list="$_file_list $_srcdir01/${_date_a[$i]}.gz"
done
eval zcat "$_file_list" | awk '{sum += 1} END {print sum;}'
But when I do this, I get the 'file name too long' error with zcat.
The reason I am trying to do this in a single call is that, as far as I know, awk cannot retain information between calls, and I need to treat the entire list as a whole (e.g. to find a word that is unique across all of the files).
I also don't want to concatenate the files, because each of them is already large.
Is there a clever way to solve this, or do I need to split the call and write out intermediate results along the way?
You can pipe directly from a loop:
for date in "${_date_a[@]}"
do
zcat "$_srcdir01/$date.gz"
done | awk '{sum += 1} END {print sum;}'
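Since awk then sees one continuous stream, it can keep state across all of the files. For example, if the goal is to find the words that occur only once across the whole set, a minimal sketch (assuming whitespace-separated words) would be:
for date in "${_date_a[@]}"
do
zcat "$_srcdir01/$date.gz"
done | awk '
# count every whitespace-separated field across all files
{ for (i = 1; i <= NF; i++) count[$i]++ }
# print the words that appeared exactly once
END { for (w in count) if (count[w] == 1) print w }'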
In any case, that code shouldn't give that error as posted.
Since your example is not complete or self-contained, I added some code to initialize the data files for testing:
$ cat testscript
_srcdir01="./././././././././././././././././././"
_date_a=(foo{0001..0050})
for file in "${_date_a[@]}"
do
echo "hello world" | gzip > "$file.gz"
done
for i in {0..49}
do
_file_list="$_file_list $_srcdir01/${_date_a[$i]}.gz"
done
eval zcat "$_file_list" | awk '{sum += 1} END {print sum;}'
Running it generates a bunch of test data and correctly sums the number of lines:
$ bash testscript
50
I can reproduce your issue if I e.g. remove the eval:
$ bash testscript
(...)/foo0045.gz ./././././././././././././././././././/foo0046.gz ././././././.
/././././././././././././/foo0047.gz ./././././././././././././././././././/foo0
048.gz ./././././././././././././././././././/foo0049.gz ./././././././././././.
/./././././././/foo0050.gz: file name too long
So please double check that the code you post is the code you run, and not one of several other attempts you made while trying to solve it.
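For completeness: if you want to avoid the eval altogether, you can collect the names in a bash array and expand it with "${files[@]}", which keeps each file name as a separate argument. A minimal sketch, reusing the test data above:
files=()
for date in "${_date_a[@]}"
do
files+=("$_srcdir01/$date.gz")
done
# each array element expands to its own argument, so no eval is needed
zcat "${files[@]}" | awk '{sum += 1} END {print sum;}'
With 50 files this stays well under the kernel's argument-length limit; for many thousands of files, piping from the loop as shown above is the safer pattern.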