I have a string that contains combinations of [, ], [[, ]] and ][. I need to replace each single [ and ] with < and >, but leave alone (not match) anything that sits between [[ and ]].
I thought I could do this with a regex, but I'm really struggling to get it to work because the complexity is just beyond me at the moment.
Example strings:
[a] [b] <- should replace every [ with < and every ] with > so <a> <b>
[a][b] <- should replace every [ with < and every ] with > so <a><b>
[[abc][a][b]] <- should not replace anything. will always start with [[ and end with ]]
So thinking about this logically, I can do it in a loop with PHP but I really want to try and use a preg_replace if possible.
The logic, as far as I can work out, is to replace [ with < and ] with > EXCEPT between a [[ and a ]], but I'm not sure whether that can even be done in a regex. I can make it work partially with lookahead/lookbehind, but that still matches a [ and ] that sit between [[ and ]] (e.g. [[ [a] ]]).
So far I've got
/(?<!(^|)\[)\[[^\]\[\[]*\]/gmi
This matches [a] but not [[a]], but it fails if I have [[a [b] c]]. At this point I'm not worried about the replacement; I just need the regex to match / not match correctly.
You can use
preg_replace('~(\[\[(?:(?!\[\[|]]).|(?1))*]])(*SKIP)(*F)|\[([^][]*)]~s', '<$2>', $text)
See the PHP demo and the regex demo.
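In case the demo links are unavailable, here is a minimal sketch that runs the pattern above against the strings from the question. The function name convertBrackets is only an illustrative wrapper, not part of the original answer.

```php
<?php
// Hypothetical helper: wraps the preg_replace call shown above.
function convertBrackets(string $text): string
{
    // Alternative 1 matches a whole [[ ... ]] block and discards it via (*SKIP)(*F).
    // Alternative 2 matches a single [ ... ] pair and captures its contents in Group 2.
    $pattern = '~(\[\[(?:(?!\[\[|]]).|(?1))*]])(*SKIP)(*F)|\[([^][]*)]~s';

    return preg_replace($pattern, '<$2>', $text);
}

$samples = [
    '[a] [b]',        // expected: <a> <b>
    '[a][b]',         // expected: <a><b>
    '[[abc][a][b]]',  // expected: unchanged
    '[[a [b] c]]',    // expected: unchanged
];

foreach ($samples as $text) {
    echo $text, ' => ', convertBrackets($text), PHP_EOL;
}
```

The last sample is the case the lookbehind attempt in the question could not handle: the inner [b] is left alone because the whole [[ ... ]] block is consumed and skipped first.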
Details:
- (\[\[(?:(?!\[\[|]]).|(?1))*]])(*SKIP)(*F) - Group 1: [[, then zero or more occurrences of either any char that is not the starting point of a [[ or ]] char sequence or the whole Group 1 pattern recursed, and then ]]. Once this match is found, it is skipped, and the next search starts at the failure location
- | - or
- \[([^][]*)] - a [, then zero or more chars other than [ and ] captured into Group 2, and then a ].
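To make the breakdown easier to follow, the same pattern can be written with the x (extended) flag, so that each part carries its own comment. This is only a sketch: the function name convertBracketsVerbose and the nested test string are made up for illustration. The pattern should behave the same as the one-line version, since in extended mode whitespace outside character classes and # comments are ignored.

```php
<?php
// Hypothetical helper: same regex as the one-liner, laid out with the x flag.
function convertBracketsVerbose(string $text): string
{
    $pattern = '~
        (                      # Group 1: a whole [[ ... ]] block
            \[\[               #   opening [[
            (?:
                (?!\[\[|]]) .  #   any char that does not start [[ or ]]
              |
                (?1)           #   or a nested [[ ... ]] block (recurse into Group 1)
            )*
            ]]                 #   closing ]]
        )
        (*SKIP)(*F)            # drop that block and resume searching after it
      |
        \[ ( [^][]* ) ]        # a single [ ... ] pair, contents in Group 2
    ~sx';

    return preg_replace($pattern, '<$2>', $text);
}

// A nested block is protected as a whole; only the trailing [d] is converted.
echo convertBracketsVerbose('[[a [[b]] c]] [d]'), PHP_EOL;
// prints: [[a [[b]] c]] <d>
```

The recursion via (?1) is what lets a [[ ... ]] block contain further [[ ... ]] blocks; without it, the first inner ]] would end the protected region early.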