I have HTML code that I would like to clean up with a search/replace regex. In many places there is more than one space between words, and I want to remove the extra spaces with a regex while leaving the HTML indentation at the beginning of each line alone. The expression \h{2,4} matches every run of 2 to 4 spaces, but how can I get it to ignore the indent at the beginning? Here is a sample of the HTML:
<tr>
<td><strong>Vamos a sentarnos.</strong></td>
<td><strong>Let's sit down.</strong></td>
</tr>
<tr>
<td>veamos (ver)</td>
<td>let's see (to see)</td>
</tr>
Thanks
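See if this works for you. It collapses every run of two or more horizontal whitespace characters to a single one, provided the match does not start at the very beginning of the line (so an indent of one or two characters is left untouched).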
(?<!^)\h\K\h+
Replace with nothing (leave the replacement field empty).
Explained:
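(?<!^)  # the match must not start at the beginning of a line
\h      # one horizontal whitespace character (kept)
\K      # reset the match start, so everything matched so far is kept
\h+     # one or more further horizontal whitespace characters (removed)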
Alternative approach, using a capture group instead of \K:
([^\n]\h)\h+
Replace by $1
or even, if \h is not supported:
([^\n][^\S\r\n])[^\S\r\n]+
Replace by $1
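To try the capture-group pattern outside an editor, here is a minimal Python sketch. Python's built-in re module has neither \h nor \K, so it uses the [^\S\r\n] form. The sample text, the collapse helper and the STRICT_PATTERN variant are assumptions made for illustration and are not part of the answer above. The sketch also shows one thing worth knowing: [^\n] can itself match a space, so an indent of three or more characters gets trimmed down to two. Requiring a non-whitespace character (\S) in front of the run is one possible way around that.

```python
import re

# Python's re module has no \h or \K, so [^\S\r\n] stands in for
# "horizontal whitespace" and a capture group stands in for \K.
H = r"[^\S\r\n]"

# The capture-group pattern from the answer: keep one character plus one
# whitespace character, drop the rest of the run.
ANSWER_PATTERN = re.compile(rf"([^\n]{H}){H}+")

# Hypothetical stricter variant (not from the answer): the kept character
# must be non-whitespace, so a run that begins a line never matches.
STRICT_PATTERN = re.compile(rf"(\S{H}){H}+")


def collapse(pattern: re.Pattern, text: str) -> str:
    """Replace each match with its first group, i.e. 'Replace by $1'."""
    return pattern.sub(r"\1", text)


# Made-up sample: a 2-space indent, a 6-space indent, and extra spaces
# between words.
sample = (
    "<tr>\n"
    "  <td>veamos   (ver)</td>\n"
    "      <td>let's  see   (to see)</td>\n"
    "</tr>"
)

answer_result = collapse(ANSWER_PATTERN, sample)
strict_result = collapse(STRICT_PATTERN, sample)

print(answer_result)  # 2-space indent kept, 6-space indent trimmed to 2
print(strict_result)  # both indents kept as they were
```

In an editor that supports \K, the stricter variant would presumably be written \S\h\K\h+ with an empty replacement. Treat that as a suggestion to test on a copy of your file first.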