php regex preg-match preg-match-all

Why isn't this preg_match_all pattern working?


I have this string and I am trying to match the cell in each row that holds the port number. That cell always contains either a plain integer or a link whose URL contains port-. Any advice on why my regex didn't work would be appreciated.

The PHP expression I am using is:

preg_match_all("/<td align=\"left\">\s*(<a href=\"http:\/\/www.proxynova.com\/proxy-server-list\/port-\d*|\d*)?\s*<\/td>/s", $input_lines, $output_array);

The live regex site where I have my work: http://www.phpliveregex.com/p/eX8

The string I am searching is as follows:

<table width="950" class="table" id="tbl_proxy_list">
<thead>
    <tr>
      <th>Proxy IP</th>
      <th>Proxy Port</th>
      <th>Last Check</th>
      <th nowrap="nowrap"><span title="Proxy Speed in bytes per second">Proxy Speed</span></th>
      <th>Uptime</th>
      <th><span title="The location of that particular proxy.">Proxy Country</span></th>
      <th>Anonymity </th>
    </tr>
</thead>
<tbody>
    <tr>
        <td align="left">         
            <span class="row_proxy_ip">220.225.87.129</span>
        </td>

        <td align="left">
            <a href="http://www.proxynova.com/proxy-server-list/port-8080">8080</a>
        </td>
        <td align="left">
            <time class="icon icon-check timeago" datetime="2016-03-14 12:58:53Z"></time>
        </td>         
        <td align="left">
            <div class="progress-bar" data-value="7.7100404" title="5855.0202"></div>
        </td>
        <td style="text-align:center !important;">
            <span style="color:#009900;">59%</span>
        </td>
        <td align="left">
            <img src="//www.proxynova.com/assets/images/blank.gif" class="flag flag-in" width="15" height="11" alt="IN" />
            <a href="/proxy-server-list/country-in/">India        
                <span class="proxy-city"> - Chandannagar </span> 
            </a>
        </td>
        <td align="left">
            <span class="proxy_transparent" style="font-weight:bold; font-size:10px;">Transparent</span>
        </td>
    </tr>
    <tr>
        <td align="left">         
            <span class="row_proxy_ip">220.225.87.129</span>
        </td>

        <td align="left">
            <a href="http://www.proxynova.com/proxy-server-list/port-8080">8080</a>
        </td>
        <td align="left">
            <time class="icon icon-check timeago" datetime="2016-03-14 12:58:53Z"></time>
        </td>         
        <td align="left">
            <div class="progress-bar" data-value="7.7100404" title="5855.0202"></div>
        </td>
        <td style="text-align:center !important;">
            <span style="color:#009900;">59%</span>
        </td>
        <td align="left">
            <img src="//www.proxynova.com/assets/images/blank.gif" class="flag flag-in" width="15" height="11" alt="IN" />
            <a href="/proxy-server-list/country-in/">India        
                <span class="proxy-city"> - Chandannagar </span> 
            </a>
        </td>
        <td align="left">
            <span class="proxy_transparent" style="font-weight:bold; font-size:10px;">Transparent</span>
        </td>
    </tr>
    <tr>
        <td align="left">         
            <span class="row_proxy_ip">220.225.87.129</span>
        </td>

        <td align="left">
            80
        </td>
        <td align="left">
            <time class="icon icon-check timeago" datetime="2016-03-14 12:58:53Z"></time>
        </td>         
        <td align="left">
            <div class="progress-bar" data-value="7.7100404" title="5855.0202"></div>
        </td>
        <td style="text-align:center !important;">
            <span style="color:#009900;">59%</span>
        </td>
        <td align="left">
            <img src="//www.proxynova.com/assets/images/blank.gif" class="flag flag-in" width="15" height="11" alt="IN" />
            <a href="/proxy-server-list/country-in/">India        
                <span class="proxy-city"> - Chandannagar </span> 
            </a>
        </td>
        <td align="left">
            <span class="proxy_transparent" style="font-weight:bold; font-size:10px;">Transparent</span>
        </td>
    </tr>
</tbody>
</table>

Solution

  • Try this -

    <td align=\"left\">\s*(<a href=\"http:\/\/www\.proxynova\.com\/proxy-server-list\/port-\d+|\d+).*?\s*<\/td>
    



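    A couple of things you missed:
    * The dots in the host name should be escaped as \. so that they match only a literal dot.
    * Nothing consumed the rest of the cell after the capturing group. In the link rows, ">8080</a> still sits between the captured text and </td>, so the pattern needs .*? to skip it. Using \d+ instead of \d* (and dropping the trailing ?) also stops the group from matching an empty string.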
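To make the second point concrete, here is a minimal sketch that runs the pattern from the question and the pattern suggested above against two cells modelled on the question's table. The variable names ($linkCell, $plainCell, $original, $fixed) are made up for this illustration.

```php
<?php
// Two cells modelled on the table in the question (hypothetical test data).
$linkCell  = "<td align=\"left\">\n    <a href=\"http://www.proxynova.com/proxy-server-list/port-8080\">8080</a>\n</td>";
$plainCell = "<td align=\"left\">\n    80\n</td>";

// The pattern from the question, unchanged.
$original = '/<td align="left">\s*(<a href="http:\/\/www.proxynova.com\/proxy-server-list\/port-\d*|\d*)?\s*<\/td>/s';

// The pattern suggested in this answer.
$fixed = '/<td align="left">\s*(<a href="http:\/\/www\.proxynova\.com\/proxy-server-list\/port-\d+|\d+).*?\s*<\/td>/';

var_dump(preg_match($original, $linkCell));  // int(0): ">8080</a> is left unmatched before </td>
var_dump(preg_match($original, $plainCell)); // int(1): a bare integer already matched
var_dump(preg_match($fixed, $linkCell));     // int(1)
var_dump(preg_match($fixed, $plainCell));    // int(1)
```

So the original pattern only failed on the rows where the port is wrapped in a link.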

    Code sample -

    preg_match_all(
        "/<td align=\"left\">\s*(<a href=\"http:\/\/www\.proxynova\.com\/proxy-server-list\/port-\d+|\d+).*?\s*<\/td>/",
        $str,
        $output_array
    );
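Note that with the pattern above, the first capture group holds the whole `<a href="...port-8080` prefix for link rows rather than just the number. If only the digits are wanted, one possible variation (a sketch, not the only way to do it) is to make the link prefix an optional non-capturing group and capture `\d+` on its own. The `~` delimiter, the `$html` sample and the `$ports` variable are assumptions made for this example.

```php
<?php
// Trimmed-down sample modelled on the question's table: an IP cell,
// a port cell with a link, and a port cell with a bare integer.
$html = implode("\n", [
    '<td align="left">',
    '    <span class="row_proxy_ip">220.225.87.129</span>',
    '</td>',
    '<td align="left">',
    '    <a href="http://www.proxynova.com/proxy-server-list/port-8080">8080</a>',
    '</td>',
    '<td align="left">',
    '    80',
    '</td>',
]);

// Using ~ as the delimiter means the slashes in the URL need no escaping.
// (?:...)? makes the link prefix optional without capturing it,
// so group 1 is always just the port digits.
$pattern = '~<td align="left">\s*(?:<a href="http://www\.proxynova\.com/proxy-server-list/port-)?(\d+).*?\s*</td>~';

preg_match_all($pattern, $html, $matches);
$ports = $matches[1];

echo implode(', ', $ports), "\n"; // 8080, 80
```

The IP cell is skipped because its content starts with `<span`, which is neither the link prefix nor a digit. For anything beyond a quick scrape, an HTML parser such as DOMDocument is usually more robust than a regex.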