regex, notepad++, regexp-replace

Regexp regular/recursive find/replace in Notepad++


How can I split strings defined in the following format?

[length namevalue field]name=value[length namevalue field]name=value[length namevalue field]name=value[length namevalue field]name=value

Is it possible, with a Find/Replace regex in Notepad++, to isolate each name=value pair by replacing every [length namevalue field] with a space? The main problem is numeric values, where a simple \d{4} search doesn't work: in age=180006phone=, for example, it matches 1800 instead of the length field 0006.


Example:

INPUT:

0010name=mario0013surname=rossi0006age=180006phone=0014address=street
0013name=marianna0013surname=rossi0006age=210006phone=0015address=street1
0003name=pia0015surname=rossini0005age=30017phone=+39221122330020address=streetstreet

OUTPUT:

name=mario surname=rossi age=18 phone= address=street
name=marianna surname=rossi age=21 phone= address=street1
name=pia surname=rossini age=3 phone=+3922112233 address=streetstreet

Solution

  • You can use either of these patterns:

    \d{4}(?=[[:alpha:]]\w*=)
    \d{4}(?=[^\W\d]\w*=)
    


    The patterns match:

    • \d{4} - four digits
    • (?=[[:alpha:]]\w*=) - a positive lookahead requiring that the four digits are immediately followed by a letter, then zero or more word chars, then a = char. The lookahead consumes no text, so only the four digits are part of the match.
    • (?=[^\W\d]\w*=) - the same lookahead, except that the first char may be a letter or an underscore ([^\W\d] is any word char that is not a digit).
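The effect of the lookahead can be checked outside Notepad++ with a minimal Python sketch. This is only an illustration: Python's `re` module is a different engine from the one in Notepad++ and has no POSIX classes such as `[[:alpha:]]`, so the sketch assumes the second pattern, and the variable names are hypothetical.

```python
import re

# First sample line from the question.
sample = "0010name=mario0013surname=rossi0006age=180006phone=0014address=street"

# Naive pattern: any four digits. It grabs "1800" from "age=180006phone=".
naive = re.findall(r"\d{4}", sample)

# Pattern from the answer: four digits followed by a name and "=".
with_lookahead = re.findall(r"\d{4}(?=[^\W\d]\w*=)", sample)

print(naive)           # ['0010', '0013', '0006', '1800', '0014']
print(with_lookahead)  # ['0010', '0013', '0006', '0006', '0014']
```

The naive search consumes `1800` and then cannot find the real length field `0006` before `phone=`, while the lookahead version finds all five length fields.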

    In Notepad++, if you want to remove the match at the start of a line and replace it with a space anywhere else, you can use

    ^(\d{4}(?=[[:alpha:]]\w*=))|(?1)
    

    and replace with (?1: ). The pattern explained above, \d{4}(?=[[:alpha:]]\w*=), is matched and captured into Group 1 if it is at the start of a line (^), and just matched, without capturing, anywhere else: (?1) is a subroutine call that reuses the Group 1 pattern, so it does not have to be written out again. The (?1: ) replacement is a conditional: if Group 1 matched, the match is replaced with an empty string; otherwise, it is replaced with a space.
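The same conditional replacement can be sketched in Python to verify the expected output. This is an emulation under stated assumptions, not the Notepad++ mechanism itself: Python's `re` supports neither the `(?1)` subroutine call nor the `(?1: )` conditional replacement, so the sub-pattern is repeated by string concatenation and a callback decides the replacement. The names `FIELD`, `PATTERN` and `strip_length_fields` are hypothetical.

```python
import re

# Length field: four digits followed by a name and "=".
FIELD = r"\d{4}(?=[^\W\d]\w*=)"

# Group 1 participates only when the field sits at the start of a line.
# Python has no (?1), so the sub-pattern is written out twice.
PATTERN = re.compile("^(" + FIELD + ")|" + FIELD, re.MULTILINE)


def strip_length_fields(text: str) -> str:
    """Drop a length field at line start, turn any other into a space."""
    return PATTERN.sub(lambda m: "" if m.group(1) is not None else " ", text)


lines = [
    "0010name=mario0013surname=rossi0006age=180006phone=0014address=street",
    "0013name=marianna0013surname=rossi0006age=210006phone=0015address=street1",
    "0003name=pia0015surname=rossini0005age=30017phone=+39221122330020address=streetstreet",
]

result = strip_length_fields("\n".join(lines))
print(result)
```

Run on the three input lines from the question, this prints the three expected output lines, one per input line.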
