I have a file which has hundreds of links like this:
<h3>aspnet</h3>
<a href="http://example.com/1" icon="data:image/png;base64,iwl1zecylifzn3fz9fr3l4cdjqhigcmjo9m">Ex 1</a>
<a href="http://example.com/2" icon="data:image/png;base64,ivborw0kggoaaaansuheugaaaqcayaaaaf8">Ex 2</a>
<a href="http://example.com/3" icon="data:image/png;base64,jmiaw+f2pwdohka6t+hnyfanbkwoa1olmug">Ex 3</a>
So I want to remove all the elements
icon="data:image/png;base64,ivborw0kggoaaaansuheugaaabaaaaaqcayaaaaf8..."
from all the lines. I went through the official Notepad++ regex wiki and have come up with this after several trials:
icon=\"[^\.]+\"
The problem with this is, it is selecting past the second double quote and stopping at the next occurring double quote. To illustrate, this will select the following content:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt...">EX 1</a> <a href="
If I modify the above regex to,
icon=\"[^\.]+\">
Then it is almost perfect, but it is also selecting the >
:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt...">
The regex I am looking for would select like this:
icon="data:image/png;base64,...jbvebich4sec9zgth1sfue1cdt..."
I also tried the following, but it doesn't match anything at all
icon=\"[^\.]+\"$
Just match anything but a quote, followed by a quote:
icon="[^"]+"
Just tested with notepad++ 6.2.2 and confirmed that this matches correctly as written.
Broken down:
icon="
This is fairly obvious, match the literal text icon="
.
[^"]+
This means to match any character that is not a "
. Adding the +
after it means "one or more times."
Finally we match another literal "
.