Using Cygwin64 here.
Here's an extract of my file. Notice the product_id is not unique.
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
I want to make the product_id unique by concatenating the row number after QW.
The following awk script does what I need, but it also prints the original row below the new row. If I exclude {print $0}, then I only get the product_id rows.
awk '/LRZ/ {x=NR; print substr($0,1,33) x substr($0,34,12) x substr($0,46);} {print $0}' my_file.html
CURRENT RESULTS
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
DESIRED RESULTS
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
The next statement makes awk stop executing actions for the current line and move straight on to the next line of input, so the trailing {print $0} is skipped for lines that were already rewritten:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next} {print $0}' file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
Or if you prefer, you can simply negate the pattern for when you want to print the original line as is:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46)}
$0 !~ /LRZ/ {print $0}' file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
Often this would be written more idiomatically as:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next}1' file
using the next statement and the always-true pattern 1, whose default action is to print the original line.
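The substr offsets above depend on the exact column at which the ID starts in each line. If that is not guaranteed, a variant along the following lines might be easier to maintain. It is only a sketch: it assumes the literal ID LRZCQPLRQW is the text to extend, and the temporary sample file is a hypothetical stand-in for my_file.html. It uses gsub with & (the matched text) in the replacement, followed by the same always-true pattern 1.

```shell
# Hypothetical sample file reproducing the extract from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
EOF

# On lines containing the ID, append the line number (NR) to every
# occurrence of it; "&" in the replacement stands for the matched text.
# The trailing 1 prints every line, modified or not, exactly once.
result=$(awk '/LRZCQPLRQW/ { gsub(/LRZCQPLRQW/, "&" NR) } 1' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Because gsub modifies $0 in place, no next is needed here: each line is printed once by the final 1, whether or not it was changed.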