Notepad++ regex to match and replace first comma between certain tags


I have data like so opened in Notepad++:

<title>Name 1, Address 1, NY</address>
<title>Name 2, Address 2, California</address>
<title>Name 3, Address 3, Texas</address>
<title>Name 4</title> <address>Address 4, Utah</address> <-- this line is 100% correct

...and I would like to target the first comma on the lines that still need proper tag enclosing, and replace it with: </title><address>

I tried the following. The find pattern does target the second part (the address), but I don't know what to put in the replacement to keep the address data intact; my attempt destroys the whole address:

  • Find What: , (.+address)
  • Replace with: </title><address>(.+address)

How can I simply replace the first comma with the new tags </title><address>?


Solution

  • You can use

    <title>[^<>,\v]*\K,\h*([^<>\v]*</address>)
    

    Replace with: </title> <address>\1

    See the regex demo.

    Details

    • <title> - a string <title>
    • [^<>,\v]* - zero or more chars other than <, >, comma and any vertical whitespace
    • \K - match reset operator that discards all text matched so far from the match value, so the <title>... prefix is left untouched
    • , - a comma
    • \h* - zero or more horizontal whitespaces
    • ([^<>\v]*</address>) - Group 1 (the $1 or \1 backreference refers to the group value):
      • [^<>\v]* - zero or more chars other than <, > and any vertical whitespace
      • </address> - a </address> string.
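
If you ever need the same fix outside Notepad++, here is a minimal Python sketch of the idea. It is an adaptation, not the exact pattern above: Python's re module has no \K or \h, and its \v means only a vertical tab, so the prefix is captured in a group instead, [ \t]* stands in for \h*, and \r\n approximates "vertical whitespace". The names PATTERN and fix_tags are made up for this illustration.

```python
import re

# Adapted pattern (assumption: lines end in \n or \r\n only):
#   group 1: <title> plus everything up to the first comma (replaces the \K trick)
#   ,[ \t]*: the first comma and any spaces/tabs after it (replaces \h*)
#   group 2: the rest of the line up to and including </address>
PATTERN = re.compile(r"(<title>[^<>,\r\n]*),[ \t]*([^<>\r\n]*</address>)")


def fix_tags(text: str) -> str:
    """Close <title> and open <address> at the first comma of each broken line."""
    return PATTERN.sub(r"\1</title> <address>\2", text)


if __name__ == "__main__":
    sample = (
        "<title>Name 1, Address 1, NY</address>\n"
        "<title>Name 4</title> <address>Address 4, Utah</address>"
    )
    print(fix_tags(sample))
```

Lines that are already correct are left alone, because [^<>,\r\n]* cannot cross the < of </title> to reach a comma.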