Tags: regex, powershell, notepad++, regex-lookarounds, lookbehind

regex lookbehind


I have a problem with regex lookbehind!

Here is my sample text:

 href="dermatitis>" "blah blah blah >" href="lichen-planus>" 

I want to match every >" only if there is an href= before it, with one additional rule:

The href= must come immediately before the preceding quotation mark. For example, the second > in the text has an href= somewhere before it, but that href= is not immediately before the preceding quotation mark, so I don't want it matched. My text contains three > characters; I want the first and third matched and the second left unmatched, based on the rule described above.

I hope the question is clear enough. I am working on offline text files and can use Notepad++, PowerShell, or any other suitable engine.

Any help will be appreciated.


Solution

  • Notepad++ doesn't support variable-length lookbehind, so use \K instead.

    • Ctrl+F
    • Find what: href="[^"]*\K>(?=")
    • check Wrap around
    • check Regular expression
    • Search in document

    Explanation:

    href="[^"]* : match href=" followed by zero or more characters other than "
    \K          : forget everything matched up to this position
    >           : a literal >
    (?=")       : lookahead, make sure a " follows
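
If you would rather script this outside Notepad++, here is a minimal Python sketch of the same idea. It assumes Python's standard re module, which has no \K, so a capture group stands in for it: either the > itself is captured to get its position, or the text before it is captured and written back. The variable names (text, positions, cleaned) and the choice to delete the matched > are only for illustration, since the question does not say what should happen to the matches.

```python
import re

# Sample text from the question.
text = ' href="dermatitis>" "blah blah blah >" href="lichen-planus>" '

# Python's re module has no \K, so capture the '>' in group 1 instead.
# [^"]* cannot cross a quotation mark, so href= must sit right before
# the quote that opens the value containing the '>'.
find_pattern = re.compile(r'href="[^"]*(>)(?=")')

# Offsets of every '>' that satisfies the rule.
positions = [m.start(1) for m in find_pattern.finditer(text)]

# Hypothetical replacement: remove the matched '>' and keep everything
# before it by writing group 1 back.
replace_pattern = re.compile(r'(href="[^"]*)>(?=")')
cleaned = replace_pattern.sub(r'\1', text)

print(positions)
print(cleaned)
```

Only the first and third > are reported; the second one is skipped because the quotation mark that opens its string is not preceded by href=.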