Need some awk-fu! I have the following data structure:
ybcxl,05,06,07,08,09,10,11
yxxu,01
yxxu,03,05,06,07,08,09,10,11
ybban,01,03,04,05,06,07,08
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12
Each line should have 13 columns. Starting at Col 2, I need to asses the value. The value should always be the column number minus 1. In other words, Column 2 should have a value of 1. I need to create columns (by inserting commas) so that each line has the appropriate number of columns with the existing values ordered correctly (col number minus 1 with an offset of the 1st record).
Example of corrected data:
yxxu,01,,,,,,,,,,
yxxu,,,03,,05,06,07,08,09,10,11,
$ cat af.txt
ybcxl,05,06,07,08,09,10,11
yxxu,01
yxxu,03,05,06,07,08,09,10,11
ybban,01,03,04,05,06,07,08
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12
$ cat af.awk
BEGIN { FS="," }
{
printf "%s", $1
for (i=2; i<=NF; ++i) a[$i+0] = $i
for (i=1; i<=12; ++i) printf ",%s", (i in a ? a[i] : "")
print ""
delete a
}
$ awk -f af.awk af.txt
ybcxl,,,,,05,06,07,08,09,10,11,
yxxu,01,,,,,,,,,,,
yxxu,,,03,,05,06,07,08,09,10,11,
ybban,01,,03,04,05,06,07,08,,,,
zxvhu,01,02,03,04,05,06,07,08,09,10,11,12
Test to make sure that all the rows have 13 columns:
$ awk -f af.awk af.txt | awk -F, '{ print NF }'
13
13
13
13
13
Test to make sure column numbers are one more than their corresponding field value (when present):
$ awk -f af.awk af.txt | awk -F, '{ error = 0; for(i=2;i<=13;++i) { if ($i && i != ($i + 1)) error = 1 } print (error ? "ERROR!" : "GOOD!")}'
GOOD!
GOOD!
GOOD!
GOOD!
GOOD!
Explanation:
BEGIN { FS="," }
Set field separator to comma
printf "%s", $1
Output the first field (using printf
to avoid a newline)
for (i=2; i<=NF; ++i) a[$i+0] = $i
Store all the numbers present in the record into an array, using their numeric
equivalents as the indices ($i+0
converts string into number)
for (i=1; i<=12; ++i) printf ",%s", (i in a ? a[i] : "")
For each field we want to output, if the number is in the array print it,
otherwise, just print the comma
print ""
Print a newline to finish the output record
delete a
Clear the array before reading the next input record