I have a number of references I would like to have replaced with links to the anchors further down in the text. The links have a very regular form, so it should be quite doable - at least with a script:
A reference "[44]" should be replaced with the following HTML code: [<a href="ref44">44</a>].
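For reference, the single-number case can be sketched in Python as below. This is a minimal sketch, not a definitive implementation; the names `SINGLE_REF` and `link_single_refs` are made up for illustration.

```python
import re

# Matches a bracketed single number such as "[44]" and captures the digits.
SINGLE_REF = re.compile(r'\[(\d+)\]')

def link_single_refs(text):
    """Replace every "[N]" with "[<a href="refN">N</a>]"."""
    # \1 is the backreference to the captured number.
    return SINGLE_REF.sub(r'[<a href="ref\1">\1</a>]', text)

print(link_single_refs("See [44] for details."))
```

Bracketed lists such as [44,45,77,91] do not match this pattern and are left untouched.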
That one is easy enough: a simple replacement with a backreference. But is there a regex (Vim dialect, Python, or ... Perl, if I must. The horror!) that can convert the following into similar links: [44,45,77,91]?
That is, one link per number, where the group of links is surrounded by a pair of square brackets.
Since this involves (theoretically unbounded) memory, it does not map 1:1 to an FSM and should rather be handled by some kind of pushdown automaton, not a regex. But some dialects are a lot more powerful, so ...
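If a script is acceptable, the memory concern can be sidestepped by letting the regex find only the whole bracketed list and doing the per-number work in a replacement callback. The following is a minimal Python sketch under that assumption; `LIST_REF`, `_link_group`, and `link_ref_lists` are illustrative names, and the pattern assumes comma-separated numbers with no spaces.

```python
import re

# Matches "[44]" as well as "[44,45,77,91]"; group 1 holds the comma-separated numbers.
LIST_REF = re.compile(r'\[(\d+(?:,\d+)*)\]')

def _link_group(match):
    # Split the captured list and wrap each number in its own anchor link.
    numbers = match.group(1).split(',')
    links = ','.join(f'<a href="ref{n}">{n}</a>' for n in numbers)
    return f'[{links}]'

def link_ref_lists(text):
    """Convert every bracketed number list into a bracketed list of links."""
    return LIST_REF.sub(_link_group, text)

print(link_ref_lists("[22][44,45,77,91]"))
```

The regex stays a plain finite-state match here; the unbounded part (one link per number) is handled by ordinary code in the callback.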
You could re-run this regex replacement until no more replacements are made; each pass converts the next unlinked number in every bracketed list.
Regex: (\[(?:<a(?=\s|>)(?:[^>=|&)]|='(?:[^']|\\')*'|="(?:[^"]|\\")*"|=[^'"][^\s>]*)*>.*?<\/a>,)*)(\d+)([,\]])
Replace with: $1<a href="ref$2">$2</a>$3
The portion captured as group 1 will match even the most complex anchor tags.
Sample Text
[22][44,45,77,91]
After Replacement
First time:
[<a href="ref22">22</a>][<a href="ref44">44</a>,45,77,91]
Second time:
[<a href="ref22">22</a>][<a href="ref44">44</a>,<a href="ref45">45</a>,77,91]
Third time:
[<a href="ref22">22</a>][<a href="ref44">44</a>,<a href="ref45">45</a>,<a href="ref77">77</a>,91]
Fourth time:
[<a href="ref22">22</a>][<a href="ref44">44</a>,<a href="ref45">45</a>,<a href="ref77">77</a>,<a href="ref91">91</a>]
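The repeat-until-stable approach can be sketched in Python as follows. Two assumptions to note: the pattern is a simplified stand-in for the regex above (it only recognizes links of the exact form produced by the replacement, not arbitrary anchor tags), and Python writes group references as `\g<1>` rather than `$1`. The names `LINK_STEP`, `REPLACEMENT`, and `link_references` are illustrative.

```python
import re

# Group 1: the opening bracket plus any links already generated in this list.
# Group 2: the next unlinked number. Group 3: the "," or "]" that follows it.
LINK_STEP = re.compile(r'(\[(?:<a href="ref\d+">\d+</a>,)*)(\d+)([,\]])')
REPLACEMENT = r'\g<1><a href="ref\g<2>">\g<2></a>\g<3>'

def link_references(text):
    """Re-run the replacement until a pass makes no more substitutions."""
    while True:
        # subn returns the new string and the number of substitutions made.
        text, count = LINK_STEP.subn(REPLACEMENT, text)
        if count == 0:
            return text

print(link_references("[22][44,45,77,91]"))
```

For the sample text, four passes make changes and a fifth pass finds nothing left to replace, which ends the loop.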