
Deleting repeated columns in Unix


I would like to delete multiple repeated columns from a huge file (about 1 million). The columns I want to delete all have the same column name, A, while the others have unique names. For example:

    A B2 A B3
    1.1 AA 1.2 AA
    2.1 AB 4.3 CT
    2.2 AC 6.4 GT

So the column headers are A, B2, A, B3, ... How can I delete all the columns named A from the data?


Solution

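  • One option in awk: read the header line, remember which column positions are not named A, and print only those fields from every line: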

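    $ awk '
    NR==1 {
        split($0,a)               # a[i] holds the header of column i
        for(i in a)
            if(a[i]=="A")
                delete a[i]       # forget every column whose header is A
    }
    {
        # print only the fields whose index is still in a;
        # each kept field is followed by OFS, so lines end with a separator
        for(i=1;i<=NF;i++)
            printf "%s",(i in a?$i OFS:"")
        printf ORS
    }' file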
    B2 B3 
    AA AA 
    AB CT 
    AC GT
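
The output above ends each line with a trailing separator. If that matters, the same idea can be sketched with a list of kept column positions instead of deleting entries. This is only a sketch under assumptions: `drop_cols` is a hypothetical helper name, not something from the original answer, and it assumes whitespace-separated input whose first line is the header.

```shell
# Hypothetical helper (name is illustrative): drop every column whose
# header equals $1 from the whitespace-separated file $2.
drop_cols() {
    awk -v name="$1" '
    NR == 1 {
        # Record the positions of the columns to keep, in order.
        for (i = 1; i <= NF; i++)
            if ($i != name)
                keep[++n] = i
    }
    {
        # Print the kept fields joined by OFS; the last one is followed
        # by ORS instead, so no trailing separator is written.
        for (j = 1; j <= n; j++)
            printf "%s%s", $(keep[j]), (j < n ? OFS : ORS)
    }' "$2"
}
```

Called as `drop_cols A file` on the sample data, this should print the B2 and B3 columns only. Because the header is scanned once and only the kept positions are visited afterwards, the per-line work depends on the number of kept columns rather than on the total column count. If every column is named A, nothing is printed at all.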