How to exclude original $0 in this awk script?


Using Cygwin64 here.

Here's an extract of my file. Notice the product_id is not unique.

    <tr>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>

I want to make the product_id unique by concatenating the row number after QW.

The following awk script does what I need, but it also prints the original row below the new row. If I exclude {print $0}, then I only get the product_id rows.

    awk '/LRZ/ {x=NR; print substr($0,1,33) x substr($0,34,12) x substr($0,46);} {print $0}' my_file.html

CURRENT RESULTS

    <tr>
    <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>

DESIRED RESULTS

    <tr>
    <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
    <td>Crate</td>
    </tr>

Solution

  • The next statement makes awk stop processing the current line and move straight to the next line of input, so the later {print $0} action never runs for a line that has already been rewritten and printed:

     $ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next} {print $0}' file
       <tr>
       <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
       <td>Crate</td>
       </tr>
       <tr>
       <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
       <td>Crate</td>
       </tr>
    

    Or if you prefer, you can simply negate the pattern for when you want to print the original line as is:

    $ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46)}
          $0 !~ /LRZ/ {print $0}' file
       <tr>
       <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
       <td>Crate</td>
       </tr>
       <tr>
       <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
       <td>Crate</td>
       </tr>
    

    Often this would be written more idiomatically as:

    $ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next}1' file
    

    This uses the next statement together with the always-true pattern 1. A pattern with no action defaults to {print $0}, so every line that was not rewritten is printed unchanged.
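
The fixed substr offsets in the commands above depend on the exact column layout of each line. As a rough sketch of one alternative, assuming the ID is always the literal string LRZCQPLRQW, gsub can append NR after every occurrence on a matching line. The sample file below is hypothetical, built from the extract in the question without its leading indentation.

```shell
# Hypothetical sample file reproducing one row of the extract from the question.
tmp=$(mktemp)
printf '%s\n' \
  '<tr>' \
  '<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>' \
  '<td>Crate</td>' \
  '</tr>' > "$tmp"

# On lines containing LRZ, append the record number (NR) after each
# occurrence of the ID; "&" in the replacement stands for the matched text.
# The trailing 1 is the always-true pattern, so every line is printed once.
result=$(awk '/LRZ/ { gsub(/LRZCQPLRQW/, "&" NR) } 1' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because the matching line is modified in place rather than printed separately, no next statement is needed here: each line reaches the final 1 pattern exactly once.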