
Problems With Working With Columns, Changing Their Format


I have a file that contains a varying number of lines like this one:

05ALBUZZI             CLAUDIA MARIA       LBZCDM64M53F205R       236.41       197.01         6.70

My objective is, using mostly awk and sed, to:

  • Remove the first two characters from the first column (in this case, 05) and move them into a column of their own, before column one

  • Delimit all columns with ;, while keeping fields that contain spaces (for this example the name field on the second column has the name Claudia Maria) intact, and not dividing them into two columns

  • And finally, get all lines with agents that get a profit of more than 1500 (profits being displayed in the last three columns, profit is the sum of the three columns)

I first tried to extract those first two characters with commands such as these:

awk 'FS="\t" { $0 = substr($1, 3) } 1' Agenti.txt
awk 'NR>1 {print $2}' Agenti.txt

But I've been getting mixed results, and since this step is vital for the rest of the exercise, I've found it basically impossible to proceed from here.


Solution

  • I see that one of your attempts:

    awk 'FS="\t" { $0 = substr($1, 3) } 1' Agenti.txt

    is trying to set the FS to a tab so I'm going to assume your input is tab-separated.
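
    As an aside, here is a minimal sketch of why that attempt gives mixed results, assuming a POSIX-style awk such as gawk or mawk. Written in the pattern position, FS="\t" is an assignment that is evaluated for every record, so the new separator only applies from the next record on. The action then overwrites the whole record with what is left of the first field. Setting the separator in a BEGIN block and editing only $1 avoids both problems. The sample line is the one from the question, with tabs between fields as an assumption.

    ```shell
    # One-line sample from the question; the tab separators are an assumption.
    line=$(printf '05ALBUZZI\tCLAUDIA MARIA\tLBZCDM64M53F205R\t236.41\t197.01\t6.70')

    # The original attempt: the whole record is replaced by the rest of field 1.
    attempt=$(printf '%s\n' "$line" | awk 'FS="\t" { $0 = substr($1, 3) } 1')

    # Setting FS before any input is read and editing only $1 keeps the other fields.
    fixed=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "\t" } { $1 = substr($1, 3) } 1')

    printf '%s\n%s\n' "$attempt" "$fixed"
    ```

    The first command prints only the trimmed surname, while the second prints the full tab-separated line with the two-character code stripped from the first field.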

    Just walking through your objectives...

    Remove the first two characters from the first column (in this case, 05) and move them into a column of their own, before column one

    $ cat tst.awk
    BEGIN { FS=OFS="\t" }
    {
        sub(/../,"&"FS)    # insert a tab after the first two characters
        print
    }
    
    $ awk -f tst.awk Agenti.txt
    05      ALBUZZI CLAUDIA MARIA   LBZCDM64M53F205R        236.41  197.01  6.70
    

    Delimit all columns with ;, while keeping fields that contain spaces (for this example the name field on the second column has the name Claudia Maria) intact, and not dividing them into two columns

    $ cat tst.awk
    BEGIN { FS="\t"; OFS=";" }
    {
        sub(/../,"&"FS)    # insert a tab after the first two characters
        $1 = $1            # rebuild the record so every FS becomes OFS
        print
    }
    
    $ awk -f tst.awk Agenti.txt
    05;ALBUZZI;CLAUDIA MARIA;LBZCDM64M53F205R;236.41;197.01;6.70
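
    If the file turns out to be padded with runs of spaces rather than tabs (the sample in the question is rendered that way), one possible variation is to treat two or more spaces as the separator, so that the single space inside CLAUDIA MARIA stays within its field. This is only a sketch: it assumes that the padding between columns is always at least two spaces wide and that no field contains two consecutive spaces.

    ```shell
    # Sample line from the question, assuming space padding between columns.
    line='05ALBUZZI             CLAUDIA MARIA       LBZCDM64M53F205R       236.41       197.01         6.70'

    out=$(printf '%s\n' "$line" | awk '
    BEGIN { FS = "  +"; OFS = ";" }    # separator: two or more spaces
    {
        sub(/../, "&" OFS)             # put a ; after the 2-character code
        $1 = $1                        # rebuild the record using OFS
        print
    }')

    printf '%s\n' "$out"
    ```

    Under those assumptions this prints the same semicolon-delimited line as the tab-based script above.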
    

    And finally, get all lines with agents that get a profit of more than 1500 (profits being displayed in the last three columns, profit is the sum of the three columns)

    $ cat tst.awk
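    BEGIN { FS="\t"; OFS=";" }
    {
        sub(/../,"&"FS)                     # insert a tab after the first two characters
        $1 = $1                             # rebuild the record so every FS becomes OFS
        profit = $(NF-2) + $(NF-1) + $NF    # sum of the last three columns
    }
    profit > 1500                           # print only records above the threshold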
    
    $ awk -f tst.awk Agenti.txt
    $
    

    There's no output because you only provided 1 line of sample input and it's not a line that satisfies your criteria for printing.
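
    To see the filter print something, here is a small sketch with a second, invented input line whose three amounts add up to 1600. The name ROSSI MARIO, the code FAKECODE00000000 and the amounts are made up for illustration, and tab-separated input is still an assumption.

    ```shell
    # Two tab-separated lines: the first is from the question,
    # the second is invented so that its amounts sum to more than 1500.
    input=$(printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
        05ALBUZZI 'CLAUDIA MARIA' LBZCDM64M53F205R 236.41 197.01 6.70 \
        07ROSSI 'MARIO' FAKECODE00000000 900.00 500.00 200.00)

    out=$(printf '%s\n' "$input" | awk '
    BEGIN { FS = "\t"; OFS = ";" }
    {
        sub(/../, "&" FS)                   # split off the 2-character code
        $1 = $1                             # rebuild the record using OFS
        profit = $(NF-2) + $(NF-1) + $NF    # sum of the last three columns
    }
    profit > 1500')

    printf '%s\n' "$out"
    ```

    Only the invented line is printed, since the line from the question sums to 440.12.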