I have this file.csv
"201707"|"51976551"|1|0|1|"20170702"
"201707"|"51955194"|1|0|0|"20170702"
"201707"|"51923555"|1|0|1|"20170702"
"201707"|"51976551"|1|0|1|"20170703"
"201707"|"51955194"|1|0|0|"20170703"
"201707"|"51923555"|1|0|1|"20170703"
"201707"|"51960597"|1|0|0|"20170703"
And my hope result is group by the number and sum the column 3, 4 and 5
"201707"|"51976551"|2|0|2
"201707"|"51955194"|2|0|0
"201707"|"51923555"|2|0|2
"201707"|"51960597"|1|0|0
I've tried with:
cat file.csv | awk -F"|" '
{ a[$2] += $3 }
END {
for (i in a) {
printf "%s|%s\n", i, a[i];
}
}
'
And the result is:
"51976551"|2
"51955194"|2
"51923555"|2
"51960597"|1
Only shows the sum of third column, but I need 2 columns more. what should I do in this case?
Try:
$ awk -F"|" '{ a[$1 OFS $2]+=$3; b[$1 OFS $2]+=$4; c[$1 OFS $2]+=$5 }
END {
for (i in a) {
print i, a[i], b[i], c[i];
}
}
' OFS=\| file.csv
"201707"|"51976551"|2|0|2
"201707"|"51960597"|1|0|0
"201707"|"51923555"|2|0|2
"201707"|"51955194"|2|0|0
-F"|"
This sets the field separator on input to |
.
a[$1 OFS $2]+=$3; b[$1 OFS $2]+=$4; c[$1 OFS $2]+=$5
This keeps track of the totals of the third, fourth, and fifth columns.
END {
for (i in a) {
print i, a[i], b[i], c[i];
}
}
This prints out the results.
OFS=\|
This tells awk to use |
as the field separator on output.