bash awk sed multiple-columns

Bash turning single comma-separated column into multi-line string


In my input file, columns are tab-separated, and the values inside each column are comma-separated.

I want to print the first column alongside each comma-separated value from the second.

Mary,Tom,David   cat,dog
Kevin   bird,rabbit
John    cat,bird
...

For each record in the second column (e.g. cat,dog), I want to split the record into an array [cat, dog] and print each element against the first column, giving this output (just for this line):

Mary,Tom,David   cat
Mary,Tom,David   dog

The output for the whole file should be:

Mary,Tom,David   cat
Mary,Tom,David   dog
Kevin   bird
Kevin   rabbit
John    cat
John    bird
...

Any suggestions if I want to use awk or sed?


Solution

  • With awk

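    awk -F'\t' '{n=split($2,a,",");for(i=1;i<=n;i++)print $1"\t"a[i]}' file

    Splits the second column on commas and then, for each value, prints the first column, a tab, and that value. The counted loop (rather than for(i in a), whose traversal order is unspecified) keeps the values in input order, and -F'\t' makes awk split fields on tabs only, so values containing spaces stay intact.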

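  • With sed (GNU sed assumed, for \t and for \n in the replacement)

    sed ':1;s/\(\([^\n]*\t\)[^\n]*\),\{1,\}/\1\n\2/;t1' file

    Each pass replaces the last run of commas in the second column with a newline followed by a copy of the first column and its tab; t1 jumps back to label 1 until no substitution is made.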
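
    As a quick check of the awk approach, the sketch below feeds it the three sample rows from the question on standard input. The function name expand_second_column is a hypothetical wrapper added for illustration; the awk program uses an explicit tab separator and a counted loop so the values come out in input order.

```shell
#!/bin/sh
# Hypothetical wrapper name; not part of the original answer.
# Reads the named files, or standard input if none are given.
expand_second_column() {
    awk -F'\t' '{
        n = split($2, a, ",")        # n is the number of comma-separated values
        for (i = 1; i <= n; i++)     # counted loop preserves input order
            print $1 "\t" a[i]       # first column, tab, one value
    }' "$@"
}

# Sample rows from the question, written with real tab separators.
printf 'Mary,Tom,David\tcat,dog\nKevin\tbird,rabbit\nJohn\tcat,bird\n' |
    expand_second_column
```

    Run on that sample, it should print the six lines listed in the question, two per input row.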