php regex wordpress hyperlink jpeg

What regex can I use to remove all links to .jpg URLs and keep only the <img> element?


I need to remove all the image links from my WordPress posts using regex.

I use the Search Regex plugin for WordPress. This plugin finds content in the database with a regex and can replace it.

Some examples of what I need to do:

<a rel="nofollow" href="https://www.exemple.com/test.jpg" class="link" title="test">
     <img src="https://www.exemple.com/test.jpg" alt="test">
</a>

to

<img src="https://www.exemple.com/test.jpg" alt="test">

And

<a href="https://www.exemple.com/test1.png" title="test1" class="link">
     <img src="https://www.exemple.com/test1.png" alt="test1">
</a>

to

<img src="https://www.exemple.com/test1.png" alt="test1">

I found some regex solutions, such as https://regex101.com/r/xX9pJ8/1 and https://stackoverflow.com/a/40292492/2831419, but I can't adapt them to my needs. If you have a solution, please let me know. Thanks.


Solution

  • As others have mentioned:

    My first point would be that regular expressions may well not be the path you want to take in this case.

    You would be better off setting something up to parse the HTML of your posts and find anchor tags that contain image tags. For each one, check whether the image tag's source attribute ends with "jpg" and, if so, replace the anchor tag with the image tag.
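
    The parsing approach described above can be sketched as follows. This is only an illustration in Python using the standard library's html.parser, and it assumes you run it over exported post content rather than inside the Search Regex plugin. The names ImageLinkUnwrapper, unwrap_image_links and IMAGE_EXTENSIONS are made up for this example.

```python
from html.parser import HTMLParser

# Assumption: these are the extensions to treat as images; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


class ImageLinkUnwrapper(HTMLParser):
    """Re-emits the HTML it is fed, dropping <a> wrappers around images."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; pass through as written.
        super().__init__(convert_charrefs=False)
        self.out = []           # finished output pieces
        self.buffer = None      # pieces inside the currently open <a>, else None
        self.open_tag = ""      # raw text of the currently open <a ...> tag
        self.has_image = False  # True once the open <a> contains a matching <img>

    def _emit(self, text):
        (self.out if self.buffer is None else self.buffer).append(text)

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        if tag == "a" and self.buffer is None:
            # Start buffering; the decision is made when </a> arrives.
            self.open_tag, self.buffer, self.has_image = raw, [], False
            return
        if tag == "img" and self.buffer is not None:
            src = dict(attrs).get("src") or ""
            if src.lower().endswith(IMAGE_EXTENSIONS):
                self.has_image = True
        self._emit(raw)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img ... /> need no separate end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "a" and self.buffer is not None:
            inner = "".join(self.buffer)
            self.buffer = None
            if self.has_image:
                self.out.append(inner.strip())  # keep the content, drop the <a>
            else:
                self.out.append(self.open_tag + inner + "</a>")
            return
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def result(self):
        if self.buffer is not None:  # unclosed <a>: put it back untouched
            self.out.append(self.open_tag + "".join(self.buffer))
            self.buffer = None
        return "".join(self.out)


def unwrap_image_links(html):
    parser = ImageLinkUnwrapper()
    parser.feed(html)
    parser.close()
    return parser.result()


post = (
    '<a rel="nofollow" href="https://www.exemple.com/test.jpg" class="link" title="test">\n'
    '     <img src="https://www.exemple.com/test.jpg" alt="test">\n'
    '</a>'
)
print(unwrap_image_links(post))
# <img src="https://www.exemple.com/test.jpg" alt="test">
```

    Because the whole <img> tag is re-emitted as written, attributes such as alt survive, and anchors that do not wrap an image are left alone. The sketch does not try to handle nested anchors or malformed markup.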


    That said, working inside WordPress makes the parsing approach somewhat harder, and the job can be done with a regex. Please note, as you can read in the link, that this is not what regex is for, and it will not handle every situation or every link.

    You'll want to first match the anchor tag and verify that it links to one of the file extensions you care about. Then match the image tag, verify that the image source ends with one of those extensions, and capture the full image URL. Finally, match the closing anchor tag as well so that the replacement is clean.

    This is the expression I came up with. It could almost certainly be nicer, but I also wanted it to be a little more verbose and obvious:

    /<a[^>]+href ?= ?["'][^"']+\.(?:jpe?g|png)["'].+\n?\r?[\s]{0,100}<img[^>]+src ?= ?["']([^"']+\.(?:jpe?g|png))["'].+\n?\r?[\s]{0,100}<\/a>/gim
    

    This works in PCRE or JS, on png, PNG, jpg, JPG, jpeg, and JPEG. It will not work if there are multiple line breaks between the anchor tag and the image tag, or with other image extensions unless you add them.

    Then simply replace the whole match with: <img src="$1">
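
    To see what the search and replace above does to the two examples from the question, here is a small sketch using Python's re module. It is an illustration only: the flags argument stands in for /gim, and \1 is Python's spelling of $1. The second pattern, KEEP_TAG, is not part of the answer's expression; it is a hypothetical variation that captures the whole <img> tag so that attributes such as alt are kept.

```python
import re

# The expression from above, unchanged apart from the delimiters and flags.
PATTERN = re.compile(
    r"""<a[^>]+href ?= ?["'][^"']+\.(?:jpe?g|png)["'].+\n?\r?[\s]{0,100}"""
    r"""<img[^>]+src ?= ?["']([^"']+\.(?:jpe?g|png))["'].+\n?\r?[\s]{0,100}<\/a>""",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical variation: capture the complete <img ...> tag instead of the URL.
KEEP_TAG = re.compile(
    r"""<a[^>]+href ?= ?["'][^"']+\.(?:jpe?g|png)["'][^>]*>\s*"""
    r"""(<img[^>]+src ?= ?["'][^"']+\.(?:jpe?g|png)["'][^>]*>)\s*</a>""",
    re.IGNORECASE,
)

sample1 = (
    '<a rel="nofollow" href="https://www.exemple.com/test.jpg" class="link" title="test">\n'
    '     <img src="https://www.exemple.com/test.jpg" alt="test">\n'
    '</a>'
)
sample2 = (
    '<a href="https://www.exemple.com/test1.png" title="test1" class="link">\n'
    '     <img src="https://www.exemple.com/test1.png" alt="test1">\n'
    '</a>'
)

# Original replacement: only the URL is captured, so the alt attribute is lost.
print(PATTERN.sub(r'<img src="\1">', sample1))
# <img src="https://www.exemple.com/test.jpg">

# Variation: the replacement is the captured tag itself.
print(KEEP_TAG.sub(r"\1", sample2))
# <img src="https://www.exemple.com/test1.png" alt="test1">
```

    If you need the alt text preserved, as in the question's expected output, the second form is closer to what you want. In a tool that uses $1 for backreferences, the replacement would be just $1.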