
Script to group all related accessions under their root accession


I have a large data set like the following:

KEL_1021159,K00001
KEL_1020176,K00001
KEL_1018609,K00001
KEL_1008140,K00006
KEL_1012058,K00006
KEL_1018645,K00006
KEL_1004034,K00006
KEL_1004235,K00006

and I am trying to convert it to this:

KEL_1021159,KEL_1020176,KEL_1018609 K00001
KEL_1008140,KEL_1012058,KEL_1018645,KEL_1004034,KEL_1004235 K00006

Is there a simple script for this?


Solution

  • Pretty straightforward with awk:

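    awk -F, '
        # Append each accession ($1) plus a comma to the list for its key ($2).
        {a[$2] = a[$2] $1 FS}
        END {
            # Note: "for (key in a)" visits keys in unspecified order.
            for (key in a) {
                sub(/,$/, "", a[key])   # drop the trailing comma
                print a[key], key
            }
        }
    ' file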
    

    A more opaque Perl one-liner (-l strips the newline from each key and adds it back on output):

    perl -F, -lane 'push @{$a{$F[1]}},$F[0]}{print join(",",@{$a{$_}})." ".$_ for keys %a' file
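
Both commands above walk a hash, so the groups can come out in any order. If the output should follow the order in which each key first appears in the input, as in the desired output in the question, a minimal sketch along these lines may help. The function name group_accessions is hypothetical, chosen only for illustration, and the sketch assumes exactly two comma-separated fields per line:

```shell
#!/usr/bin/env bash
# group_accessions: hypothetical helper name, not part of the original answer.
# Reads "accession,key" lines from the given files (or stdin) and prints
# "acc1,acc2,... key", one line per key, in first-seen key order.
group_accessions() {
    awk -F, '
        {
            if (!($2 in a)) {
                order[++n] = $2      # remember the key the first time it appears
                a[$2] = $1           # start the list for this key
            } else {
                a[$2] = a[$2] FS $1  # append with a comma separator
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print a[order[i]], order[i]
        }
    ' "$@"
}
```

It can be called as `group_accessions file`, or used at the end of a pipeline. Because the comma is added only between entries, no trailing comma needs to be stripped afterwards.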