javascript regex ecmascript-6 regex-group regex-negation

Matching image paths inside JavaScript using a JavaScript regex


I need a regex to scan JS files for any image paths they contain.

These paths would generally be nested as follows:

$img1 = "foo/bar.png";
$img2 = 'foo/bar.jpg';
$img3 = "{'myimg':'foo/bar.png'}";

I need a regex that picks up the whole image path inside the quotes, even when it is nested inside a JSON string or otherwise encoded. Essentially, it should match a whole image path by detecting just the presence of the extension (jpg|png|gif).

I have found a regex that works well in PHP, but I need one that works in JavaScript.

$pattern = '~/?+(?>[^"\'/]++/)+[^"\'\s]+?\.(?>(?>pn|jpe?)g|gif)\b~';

What would this regex pattern look like in JavaScript?


Solution

  • Possessive quantifiers (++) and atomic groups (?>...) are not supported in JavaScript.

    The updated pattern could look like this:

    \/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b
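
As a rough illustration of the conversion, the sketch below checks that the PHP pattern is rejected by the JavaScript regex engine while the converted one compiles and matches. The `compiles` helper and the variable names are hypothetical, introduced only for this example.

```javascript
// The PHP pattern from the question, without its ~ delimiters.
const phpSource = String.raw`/?+(?>[^"'/]++/)+[^"'\s]+?\.(?>(?>pn|jpe?)g|gif)\b`;

// Hypothetical helper: true if the source compiles as a JavaScript regex.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Converted pattern: ?+ becomes ?, ++ becomes +, (?> becomes (?:
const converted = /\/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b/;

console.log(compiles(phpSource));        // false
console.log(compiles(converted.source)); // true
console.log("{'myimg':'foo/bar.png'}".match(converted)[0]); // foo/bar.png
```

Dropping the possessive and atomic constructs only changes how the engine backtracks, so for inputs like the ones in the question the converted pattern should find the same paths.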
    

    If // in the path is also acceptable, you can get those matches with a shorter pattern that only excludes the quotes, using the negated character class [^"']+

    Note that / is escaped as \/ because / is the regex literal delimiter in JavaScript, and that you don't have to escape ' and " in the character class.

    The shorter version could look like

    [^"']+\.(?:(?:pn|jpe?)g|gif)\b
    
    • [^"']+ Match any char except ' or " 1+ times
    • \. Match a dot
    • (?: Non capture group
      • (?:pn|jpe?)g Match either png jpg or jpeg
      • | Or
      • gif Match literally
    • )\b Close non capture group followed by a word boundary
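
One observable difference between the two patterns, sketched below with made-up inputs: the longer pattern requires at least one directory segment ending in /, while the shorter one also accepts a bare file name. The names `longPattern` and `shortPattern` are assumptions for this example only.

```javascript
const longPattern = /\/?(?:[^"'/]+\/)+[^"'\s]+?\.(?:(?:pn|jpe?)g|gif)\b/;
const shortPattern = /[^"']+\.(?:(?:pn|jpe?)g|gif)\b/;

// Made-up inputs: one path with a directory, one bare file name.
for (const input of ["foo/bar.png", "bar.png"]) {
  const longMatch = input.match(longPattern);
  const shortMatch = input.match(shortPattern);
  console.log(
    input,
    "| long:", longMatch && longMatch[0],
    "| short:", shortMatch && shortMatch[0]
  );
}
```

For "bar.png" the longer pattern returns null and the shorter one returns the file name, so which pattern fits depends on whether bare file names should count as image paths.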

    Regex demo

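    // No g flag: match() returns only the first image path in each string
    const regex = /[^"']+\.(?:(?:pn|jpe?)g|gif)\b/;
    [
      "foo/bar.png",
      "foo/bar.jpg",
      "{'myimg':'foo/bar.png'}"
    ].forEach(s => console.log(s.match(regex)[0])); // logs the path found in each string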
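
Since the question is about scanning whole JS files, here is a minimal sketch of collecting every match from a larger chunk of text by adding the g flag and using String.prototype.matchAll. The `findImagePaths` helper and the sample text are hypothetical. Reading the file from disk is left out.

```javascript
// Same short pattern, with the g flag so every match is returned.
const imagePathRegex = /[^"']+\.(?:(?:pn|jpe?)g|gif)\b/g;

// Hypothetical helper: collect every image path found in a chunk of source text.
function findImagePaths(sourceText) {
  return Array.from(sourceText.matchAll(imagePathRegex), m => m[0]);
}

// Made-up sample input mirroring the snippets in the question.
const sample = [
  `$img1 = "foo/bar.png";`,
  `$img2 = 'foo/bar.jpg';`,
  `$img3 = "{'myimg':'foo/bar.png'}";`
].join("\n");

console.log(findImagePaths(sample));
// [ 'foo/bar.png', 'foo/bar.jpg', 'foo/bar.png' ]
```

One caveat with the short pattern: [^"']+ matches anything that is not a quote, including spaces and newlines, so an unquoted mention such as `var a = 1; // see foo.png` is returned as one long match. If that matters, the longer pattern or a stricter character class may be a better fit.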