I wrote a Bash script that uses regular expressions to process my Markdown text files so that they are ready to be compiled to .html files. One step of this processing is meant to change lazily written text that looks like this:
Verkle trees are shaping up to be an important part of Ethereum's upcoming scaling upgrades. They serve the same function as Merkle trees: [you can put a large amount of data into a Verkle tree3], and make a short proof ("witness") of any single piece, or set of pieces, of that data that can be verified by someone who only has the root of the tree.
into text that when compiled will display a link:
Verkle trees are shaping up to be an important part of Ethereum's upcoming scaling upgrades. They serve the same function as Merkle trees: [you can put a large amount of data into a Verkle tree][3], and make a short proof ("witness") of any single piece, or set of pieces, of that data that can be verified by someone who only has the root of the tree.
In other words, I want to replace [text that begins with a letter and consists of words, letters and punctuation. The text ends with a positive integer smaller than 100 int]
with [text that begins with a letter and consists of words, letters and punctuation. The text ends with a bracket][int]
The following is the code snippet that is meant to do the task described above:
sed -i -E 's/(\[[a-zA-Z]{2,2}[\s\S]{1,100})(\[0-9]{1,2}\])/\1\][\2/g' file.txt;
The code is meant to save me the effort of typing the additional '][' by inserting it automatically. The code does not work and I have no clue why.
The regex you wrote uses PCRE-style syntax (such as [\s\S]), but you need a POSIX one since sed
only supports POSIX BRE or ERE.
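As a quick illustration of the difference, here is a minimal sketch, assuming a sed that supports -E (such as GNU or BSD sed). The helper names check_any_char and check_digit_class are made up for this example. It shows how two pieces of the original pattern are read by sed:

```shell
# Inside a POSIX bracket expression the backslash is literal, so [\s\S]
# means "a backslash, s, or S", not "any character".
check_any_char() {
  printf '%s\n' "$1" | sed -E 's/a[\s\S]b/MATCH/'
}

# \[ is an escaped (literal) "[", so \[0-9] matches the literal text "[0-9]"
# instead of acting as a digit class.
check_digit_class() {
  printf '%s\n' "$1" | sed -E 's/x\[0-9]/MATCH/'
}

check_any_char 'a b'         # prints: a b    (a space is not \, s or S)
check_any_char 'asb'         # prints: MATCH  (the literal s matches)
check_digit_class 'x7'       # prints: x7     (no digit class here)
check_digit_class 'x[0-9]'   # prints: MATCH  (matches the literal text)
```

So the second group of the original pattern, \[0-9]{1,2}\], looks for a literal [0-9 sequence rather than one or two digits, which is why it has to be written as [0-9]{1,2}] instead.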
You can use
sed -i -E 's/(\[[[:alpha:]]{2}([^][]*[^0-9])?)([0-9]{1,2}])/\1][\3/g' file
See the online demo:
s='Verkle trees are shaping up to be an important part of Ethereum'"'"'s upcoming scaling upgrades. They serve the same function as Merkle trees: [you can put a large amount of data into a Verkle tree3], and make a short proof ("witness") of any single piece, or set of pieces, of that data that can be verified by someone who only has the root of the tree.'
sed -E 's/(\[[[:alpha:]]{2}([^][]*[^0-9])?)([0-9]{1,2}])/\1][\3/g' <<< "$s"
Output:
Verkle trees are shaping up to be an important part of Ethereum's upcoming scaling upgrades. They serve the same function as Merkle trees: [you can put a large amount of data into a Verkle tree][3], and make a short proof ("witness") of any single piece, or set of pieces, of that data that can be verified by someone who only has the root of the tree.
Details:
- (\[[[:alpha:]]{2}([^][]*[^0-9])?) - Group 1 (\1):
  - \[ - a [ char
  - [[:alpha:]]{2} - two letters
  - ([^][]*[^0-9])? - an optional sequence of zero or more chars other than [ and ] and then a non-digit char
- ([0-9]{1,2}]) - Group 3 (\3): one or two digits followed by a ] char.

The replacement is \1][\3, that is, the Group 1 + ][ + Group 3 values concatenated (Group 2 is not used as it is only meant to match an optional part inside Group 1, and the POSIX regex flavor used by sed does not support non-capturing groups).
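If you want to try the command on a few more inputs before running it with -i, the sketch below wraps it in a small shell function. The name link_refs and the sample strings are only illustrative assumptions, not part of your script:

```shell
# link_refs is a hypothetical wrapper: it reads text on stdin and writes
# the converted text to stdout, using the sed command shown above.
link_refs() {
  sed -E 's/(\[[[:alpha:]]{2}([^][]*[^0-9])?)([0-9]{1,2}])/\1][\3/g'
}

# One-digit reference number.
printf '%s\n' '[you can put data into a Verkle tree3]' | link_refs
# prints: [you can put data into a Verkle tree][3]

# Two-digit reference number.
printf '%s\n' '[see note12]' | link_refs
# prints: [see note][12]

# Text that is already in [text][int] form is left unchanged.
printf '%s\n' '[already linked][3]' | link_refs
# prints: [already linked][3]

# The pattern needs two letters right after "[", so this is left unchanged.
printf '%s\n' '[a tree3]' | link_refs
# prints: [a tree3]
```

The last case is worth keeping in mind: because of [[:alpha:]]{2}, link text that starts with a one-letter word is not converted. If that matters for your files, relaxing {2} to a single [[:alpha:]] may be an option, at the cost of also matching shorter bracketed text.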