javascript, regex, regex-lookarounds, regex-group, regex-greedy

Regex for capturing image src attribute


I'm trying to extract all image links that appear within double quotes.

I'm able to get the text within quotes by using

/"([^"]*)"/

but I want to get only those values that match the following pattern:

"https://text/text/.../text.jpg?text=text&text=..."

(... represents similar values)

How can I achieve this?


Solution

  • If the URL has to start with http (with an optional s) and has to contain .jpg, you can make your pattern more specific:

    "(https?:\/\/[^"\s]+\/\S+?\.jpg[^"\s]*)"
    
    • "( Match the opening " and start the capturing group
      • https?:\/\/ Match http, an optional s, and ://
      • [^"\s]+ Match 1+ times any char other than " or a whitespace char
      • \/\S+?\.jpg Match a forward slash, then 1+ non-whitespace chars (non-greedy), then .jpg
      • [^"\s]* Match 0+ times any char other than " or a whitespace char, which covers whatever follows the file extension (such as a query string)
    • )" Close the capturing group and match the closing "
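
As a quick check of the parts listed above, here is a minimal sketch that runs the pattern against a few strings. The example.com URLs and the name jpgUrl are made up for illustration and are not from the question.

```javascript
// Same pattern as above; the sample strings are hypothetical.
const jpgUrl = /"(https?:\/\/[^"\s]+\/\S+?\.jpg[^"\s]*)"/;

const samples = [
  '"https://example.com/img/a.jpg?w=100"', // scheme, path, .jpg and a query string
  '"http://example.com/img/a.jpg"',        // the s and the query string are optional
  '"ftp://example.com/img/a.jpg"',         // wrong scheme
  '"https://example.com/img/a.png"',       // no .jpg
  '"https://example.com/img/a b.jpg"'      // whitespace inside the quotes
];

const results = samples.map(s => jpgUrl.test(s));
console.log(results); // [ true, true, false, false, false ]
```

Only the first two strings satisfy every part of the pattern; each of the other three fails on exactly one of the listed parts.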


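    // No g flag: match() returns the full match at [0] and the capture group at [1].
    const pattern = /"(https?:\/\/[^"\s]+\/\S+?\.jpg[^"\s]*)"/;
    [
      '"https://text/text/.../text.jpg?text=text&text=..."',
      '"https://text/text/.../text.jpg?t&ext=text&text=..."',
      '"https://text/text/.../text.jpg?text=text"'
    ].forEach(s => {
      const m = s.match(pattern); // null when the string does not match
      if (m) console.log(m[1]);   // the URL without the surrounding quotes
    });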
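
Since the question asks for all image links, it may help to see the pattern applied to one larger string. The sketch below is one possible approach rather than part of the original answer: it adds the g flag and uses String.prototype.matchAll. The input string and the names loose and tight are assumptions for illustration. One caveat it demonstrates: \S also matches a double quote, so when quoted values sit next to each other without whitespace, \S+? can run past a closing quote. Swapping it for [^"\s]+? is a suggested variation that avoids this.

```javascript
// Hypothetical input: three quoted URLs with no whitespace between them.
const text = '["https://example.com/logo.png","https://example.com/img/one.jpg?w=100","http://example.com/img/two.jpg"]';

// The pattern from the answer with the g flag added (matchAll requires it).
const loose = /"(https?:\/\/[^"\s]+\/\S+?\.jpg[^"\s]*)"/g;
// Variation: [^"\s]+? instead of \S+?, so the match cannot cross a closing quote.
const tight = /"(https?:\/\/[^"\s]+\/[^"\s]+?\.jpg[^"\s]*)"/g;

// Collect capture group 1 (the URL without the quotes) from every match.
const looseLinks = Array.from(text.matchAll(loose), m => m[1]);
const tightLinks = Array.from(text.matchAll(tight), m => m[1]);

console.log(looseLinks); // first entry runs from logo.png into the next URL
console.log(tightLinks); // only the two .jpg URLs
```

If the quoted values are always separated by whitespace, the original pattern with the g flag behaves the same as the tighter one.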