
Extracting and matching columns in text file


I have a text file with the following structure. In the first column, I want to remove the part before the first comma and keep the rest. Each remaining item should then be paired with the value in the 2nd column and printed on its own row.

INPUT:

A,B,C       London
G,L,K,I     Berlin
Q,O,M,J     Madrid

I want output like this:

OUTPUT:

B  London
C  London
L  Berlin
K  Berlin
I  Berlin
O  Madrid
M  Madrid
J  Madrid

Solution

  • Here is one way to do it with awk:

    $ awk '{n=split($1, a, ","); for (i=2; i<=n; i++) print a[i], $NF}' file
    B London
    C London
    L Berlin
    K Berlin
    I Berlin
    O Madrid
    M Madrid
    J Madrid
    

    Explanation

    • n=split($1, a, ",") splits the first field on commas and stores the pieces in the array a. split returns the number of pieces, which we keep in n.
    • for (i=2; i<=n; i++) print a[i], $NF then loops over these pieces, starting from the 2nd one (which skips the part before the first comma), and prints each piece together with the last field (the city name).
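
    To try this end to end, here is a minimal sketch that recreates the sample input and runs the same logic. The temporary file and the variable names input and result are only for illustration, and setting OFS to two spaces is an assumption based on the spacing shown in the requested output; drop it to get the single-space output above.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' 'A,B,C       London' 'G,L,K,I     Berlin' 'Q,O,M,J     Madrid' > "$input"

# Same logic as the one-liner above. OFS is the separator that print puts
# between a[i] and $NF (two spaces here, an assumption taken from the
# spacing in the requested output).
result=$(awk -v OFS='  ' '{
    n = split($1, a, ",")        # n = number of comma-separated pieces in field 1
    for (i = 2; i <= n; i++)     # start at 2 to skip the piece before the first comma
        print a[i], $NF          # pair each remaining piece with the last field
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

    Note that $NF is only the last whitespace-separated field, so a multi-word city such as "New York" would be printed as "York". The sample data has single-word cities, so this does not matter here.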