html powershell powershell-4.0

Regex - Extract third number from string using PowerShell


This is a single-line string from an HTTP response. Its structure is fixed, but the numbers in it are dynamic:

<tr><td>Sum</td><td>10</td><td>0</td><td>132</td><td> </td><td>35</td><td>465</td><td>0</td><td>56</td><td>42</td></tr>

I need to extract the third number (132 in this case) using a regex.
Can someone explain how to do it?


Solution

  • The conceptually simplest approach is to use the [regex]::Matches() .NET method with a simple \d+ regex to find all (non-negative decimal) numbers in the input string, and use indexing to get the match of interest ([2] returns the 3rd match):

    $str = '<tr><td>Sum</td><td>10</td><td>0</td><td>132</td><td> </td><td>35</td><td>465</td><td>0</td><td>56</td><td>42</td></tr>'
    
    # -> '132'
    [regex]::Matches(
      $str,
      '\d+'
    ).Value[2]
    

    If you're willing to use a more complex regex, you can use the -replace operator:

    # -> '132'
    $str -replace '^.+?\d+.+?\d+.+?(\d+).*$', '$1'
    

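    In short: each .+?\d+ pair lazily skips ahead to the next run of digits and consumes it, the third run is captured by (\d+), and .*$ consumes the rest of the line, so the whole string is replaced with the captured number ($1). Note that if the regex doesn't match, -replace returns the input string unchanged. For an explanation of the regex and the ability to experiment with it, see this regex101.com page.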
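If the response might contain fewer numbers than expected, the [regex]::Matches() approach can be wrapped in a small function that checks the match count first. The following is only a minimal sketch: Get-NthNumber and its parameter names are hypothetical and not part of the original answer, and it assumes that any run of decimal digits in the input counts as a number.

```powershell
# Hypothetical helper (an illustration, not a built-in cmdlet): returns the
# N-th (1-based) run of decimal digits in a string, or $null if the string
# contains fewer than N such runs.
function Get-NthNumber {
  param(
    [Parameter(Mandatory)] [string] $InputString,
    [Parameter(Mandatory)] [int] $Position
  )
  # Find all runs of digits; avoid the name $Matches, which is an automatic variable.
  $numberMatches = [regex]::Matches($InputString, '\d+')
  if ($Position -lt 1 -or $Position -gt $numberMatches.Count) { return $null }
  $numberMatches[$Position - 1].Value
}

$str = '<tr><td>Sum</td><td>10</td><td>0</td><td>132</td><td> </td><td>35</td><td>465</td><td>0</td><td>56</td><td>42</td></tr>'

# -> '132'
Get-NthNumber -InputString $str -Position 3

# -> $null (the string contains only 8 numbers)
Get-NthNumber -InputString $str -Position 20
```

The explicit count check makes a missing number yield $null. By contrast, the -replace approach returns the whole input string when its regex doesn't match, which is easy to mistake for a result.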