I have a bunch of record-wise formatted .csv files. The first field is an integer, or it may be empty. This is true for all the files. I want to count the number of records whose first field is empty in each file, and then plot those counts across all the files.
File format of filename.csv:
123456,few,other,fields
,few,other,fields
234567,few,other,fields
I want something like
awk -F, '$1==""' `ls` | (for each file separately wc -l) | gnugraph ( y axis as output of wc -l command and x axis as simply 1 to n where n is number of csv files)
The problem I am facing is that wc -l gets executed only once for all the files together. I want to run wc -l for each file, count the number of records that have an empty first field, and pass this sequence of counts to the gnugraph command.
Once I have the required count for each file I am almost done, since
seq 10 | gnuplot -p -e "plot '<cat'"
works fine.
You could use awk to keep track of the count for each file in an array, then print the contents of the array at the end:
awk -F, '$1==""{a[FILENAME]+=1} END{for(file in a) { print file, a[file] }}' `ls`
This way you don't have to tangle with wc and can send the output straight to gnuplot.
Example in use:
$> cat file1
,test
2,test
3,
$> cat file2
,test
2,test
3,
,test
$> awk -F"," '$1==""{a[FILENAME]+=1} END{for(file in a) { print file, a[file] }}' `ls`
file1 1
file2 2
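Two caveats with the array approach: awk does not guarantee the order of for (file in a), and a file with no empty first field never gets an entry, so it is missing from the output rather than shown as 0. If you need one count per file in a fixed order, a minimal sketch is to run awk once per file from a shell loop. The function name count_empty_first is just made up for illustration, and I am assuming your files match *.csv:

```shell
# Print one count per file, in the order the files are given.
# "n + 0" forces a numeric 0 when no record matched (n is unset then).
count_empty_first() {
    for f in "$@"; do
        awk -F, '$1 == "" { n++ } END { print n + 0 }' "$f"
    done
}

# Possible usage (assumes gnuplot is installed and files match *.csv):
# count_empty_first *.csv | gnuplot -p -e "plot '<cat' with linespoints"
```

This emits a single column of counts, which is the same shape as the seq 10 output from the question, so gnuplot uses the row index as the x axis.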