
Remove row from HTML table markup in a string


A client has ~7,000 products with a "Your Price: $..." entry in the description. The price is typed in (there is no existing wildcard).

Here is an example of a description:

<table cellpadding="5" border="0" width="100%"><tbody><tr><td><strong>Part #: </strong></td><td>FIV000-2100</td></tr><tr><td><strong>Retail Price: </strong></td><td>$26.39</td></tr><tr><td class="price"><strong>Your Price: </strong></td><td class="price">$23.75</td></tr><tr><td align="center" colspan="2"/></tr></tbody></table>

Is there a regular expression that removes just the "Your Price" row? What if we wanted to remove the "Retail Price" row as well?


Solution

  • You can do something like this (assuming $str holds the HTML string):

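    // Match the whole "Your Price" row. The '~' delimiter avoids escaping every '/',
    // and the character class also allows thousands separators such as $1,234.56.
    $pattern = '~<tr><td class="price"><strong>Your Price: </strong></td><td class="price">\$[0-9.,]+</td></tr>~';
    $str = preg_replace($pattern, '', $str);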
    

    Replacing the match with an empty string removes the row from the markup.

    EDIT:

    Escaped the special characters to make it work. I also urge you to use an HTML parser. Let's call this the quick and dirty method.
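
For the parser route, and for the follow-up about also dropping the Retail Price row, here is a minimal sketch using PHP's bundled DOM extension. The function name `removeRowsByLabel`, the wrapper id `fragment-root`, and the rule "compare the trimmed text of the first cell to a label" are assumptions made for illustration, not part of the original answer. It also assumes the descriptions are UTF-8.

```php
<?php
// Hypothetical helper: removes every <tr> whose first <td> text equals one of $labels.
function removeRowsByLabel(string $html, array $labels): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    // The XML declaration hints UTF-8 to libxml; the wrapper <div> gives a known
    // root so the fragment can be serialized without the added <html>/<body>.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);

    // Copy the node list to an array before removing nodes from the tree.
    foreach (iterator_to_array($xpath->query('.//tr', $root)) as $row) {
        $firstCell = $xpath->query('./td[1]', $row)->item(0);
        if ($firstCell !== null && in_array(trim($firstCell->textContent), $labels, true)) {
            $row->parentNode->removeChild($row);
        }
    }

    // Serialize only the children of the wrapper, i.e. the original fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$str = '<table cellpadding="5" border="0" width="100%"><tbody>'
     . '<tr><td><strong>Part #: </strong></td><td>FIV000-2100</td></tr>'
     . '<tr><td><strong>Retail Price: </strong></td><td>$26.39</td></tr>'
     . '<tr><td class="price"><strong>Your Price: </strong></td><td class="price">$23.75</td></tr>'
     . '<tr><td align="center" colspan="2"/></tr></tbody></table>';

// Pass one label to drop only that row, or both to drop both rows.
$clean = removeRowsByLabel($str, ['Your Price:', 'Retail Price:']);
```

Because the match is on the label text rather than on the exact markup, this should tolerate differences in attributes or whitespace between descriptions. The serialized output may differ cosmetically from the input (for example, the self-closing `<td/>` is written out as an open/close pair), so compare a few results by eye before running it over all products.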