
Regex to replace different values based on multiple criteria


I am trying to find a regex (or several) that I can use in Notepad++ (Windows) to change different parts of URLs based on the specific symbols that divide them.

Basically, I have URLs of this type scattered around a big file:

domain.com/folder/random-text-1?REPLACE-THIS-1=REPLACE-THIS-2=REPLACE-THIS-3&REPLACE-THIS-4

REPLACE-THIS-1/-2/-3/-4 are all different from each other, and they also differ between URLs, like this:

domain.com/folder/random-text-2?A1=A2=A3&A4
domain.com/folder/random-text-2?B1=B2=B3&B4
etc

(Note: "folder" stays the same all the time)

This is what I am trying to do with the regex(es):

  • "REPLACE-THIS-1" with "REPLACED-1". Note: after the first "?", replace everything up to the first "=" with the fixed value "REPLACED-1".
  • "REPLACE-THIS-2" with "REPLACED-2". Note: this is the text between the first "=" and the second "=".
  • "REPLACE-THIS-3" with "REPLACED-3". Note: this is the text after the second "=", up to the "&".
  • "REPLACE-THIS-4" with "REPLACED-4". Note: "REPLACE-THIS-4" sometimes contains the "=" symbol as well, but I want to replace all of the text after "REPLACE-THIS-3&".

Note: REPLACED-1/REPLACED-2/REPLACED-3/REPLACED-4 are always the same; they don't change.

The biggest problem I am facing is that the REPLACE-THIS-1/-2/-3/-4 values differ from URL to URL, so it would be great if I could do the replacement with one or more regexes.

Thank you so much!


Solution

  • This regex should do what you want. It looks for a URL that contains /folder/ followed by some random text, a ? and then something of the form a=b=c&d, where a, b, c and d may contain any non-whitespace characters (except = in a and b, and & in c).

    (\/folder\/[^?]+\?)[^=\s]+=[^=\s]+=[^&\s]+&.*?(?=\s|$)
    

    This is replaced with

    \1REPLACED-1=REPLACED-2=REPLACED-3&REPLACED-4
    

    Demo on regex101
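
For anyone who wants to try the same substitution outside Notepad++, here is a minimal Python sketch of it. It is only an illustration under a few assumptions: the function name `rewrite_urls` and the sample text are made up, the tail of the pattern is written as a lazy `.*?` so that the match stops at the first whitespace, and the group reference is written as `\g<1>` (Python's unambiguous form of `\1`).

```python
import re

# Same idea as the Notepad++ pattern: capture everything up to and including
# the "?", then match a=b=c&d. The lazy ".*?" tail (an assumption of this
# sketch) stops at the first whitespace or at the end of the text.
URL_PATTERN = re.compile(r"(/folder/[^?]+\?)[^=\s]+=[^=\s]+=[^&\s]+&.*?(?=\s|$)")

# \g<1> re-inserts the captured prefix; the four fixed values follow it.
REPLACEMENT = r"\g<1>REPLACED-1=REPLACED-2=REPLACED-3&REPLACED-4"


def rewrite_urls(text: str) -> str:
    """Return text with every matching query string replaced (hypothetical helper)."""
    return URL_PATTERN.sub(REPLACEMENT, text)


if __name__ == "__main__":
    sample = (
        "see domain.com/folder/random-text-2?A1=A2=A3&A4=x for details\n"
        "domain.com/folder/random-text-3?B1=B2=B3&B4"
    )
    print(rewrite_urls(sample))
```

Text around each URL is left alone, and a fourth part that itself contains "=" is still replaced as a whole.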