I am trying to remove spaces from inside HTML IDs on headings and replace them with a - character. So far, I have been doing this as a multi-step process. I would like to condense it to one step using regex. I have been trying to write a regex pattern that matches every instance of a character inside a variable pattern, but I have not had much success.
The regex should replace 2 spaces here:
<h2 id="three word sentence">
The regex should replace 3 spaces here:
<h2 id="four words in sentence">
This is what I have so far, which finds the entire ID on each item. Then I turn on "find in selection" and replace spaces with -.
(?<=<h[234] id=").*(?=")
How can I find just the spaces in one step?
You can use
(?:\G(?!\A)|<h\d+\s+id=")[^"\s]*\K\s+(?=[^"]*")
See the regex demo. Details:
(?:\G(?!\A)|<h\d+\s+id=") - either the end of the previous successful match, or <h, one or more digits, one or more whitespaces and the id=" string
[^"\s]* - zero or more chars other than " and whitespace
\K - match reset operator that discards the text matched so far from the overall match memory buffer
\s+ - one or more whitespaces
(?=[^"]*") - a positive lookahead that requires zero or more chars other than " and then a " char immediately to the right of the current position.
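
If the tool at hand does not support \G and \K (Python's built-in re module, for instance, supports neither), the same result can be reached with a different technique: match the whole id value first, then replace the whitespace inside it in a callback. The sketch below is only an illustration of that alternative, not the pattern above; the function name hyphenate_heading_ids is made up for this example, and like the pattern above it assumes id is the first attribute of the heading and is double-quoted.

```python
import re

# Group 1: the opening '<hN id="', group 2: the id value, group 3: the closing quote.
# Assumption: id is the first attribute on the heading and uses double quotes.
HEADING_ID = re.compile(r'(<h\d+\s+id=")([^"]*)(")')


def hyphenate_heading_ids(html: str) -> str:
    """Replace each run of whitespace inside heading ids with a single '-'."""

    def fix(match: re.Match) -> str:
        # Only the id value (group 2) is rewritten; the rest is put back unchanged.
        value = re.sub(r"\s+", "-", match.group(2))
        return match.group(1) + value + match.group(3)

    return HEADING_ID.sub(fix, html)


if __name__ == "__main__":
    print(hyphenate_heading_ids('<h2 id="three word sentence">'))
    print(hyphenate_heading_ids('<h2 id="four words in sentence">'))
```

As with \s+ in the pattern above, a run of several whitespace characters becomes one - character, and tags other than headings are left untouched.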