I have been handed a legacy XML document that is not going to change. Formatted, it looks like this:
<Result>
<StepSequence>
<RealMeasure>
<Text value="Batman"/>
</RealMeasure>
</StepSequence>
<StepSequence>
<RealMeasure>
<Text value="Superman"/>
</RealMeasure>
</StepSequence>
</Result>
In reality it arrives on a single line, like this:
<Result><StepSequence><RealMeasure><Text value="Batman"/></RealMeasure></StepSequence><StepSequence><RealMeasure><Text value="Superman"/></RealMeasure></StepSequence></Result>
The regex I have come up with is:
<RealMeasure><((\w*)\s+value="(.*)".*?)></RealMeasure>
But it matches too much, selecting everything from the first element through the last:
<RealMeasure><Text value="Batman"/></RealMeasure></StepSequence><StepSequence><RealMeasure><Text value="Superman"/></RealMeasure>
I want to select:
<RealMeasure><Text value="Batman"/></RealMeasure>
and
<RealMeasure><Text value="Superman"/></RealMeasure>
I want to capture groups so that I can later convert each match to something like:
<RealMeasure type="Text" value="Superman"/>
using a replacement pattern like:
<RealMeasure type="$2" value="$3"/>
Any tips to improve my regex?
Try this:
const reg = /<RealMeasure><((\w+)\s+value="(.*?)".*?)><\/RealMeasure>/g;
const str = `<Result><StepSequence><RealMeasure><Text value="Batman"/></RealMeasure></StepSequence><StepSequence><RealMeasure><Text value="Superman"/></RealMeasure></StepSequence></Result>`;
// replace() returns a new string; str itself is left unchanged.
const result = str.replace(reg, `<RealMeasure type="$2" value="$3"/>`);
// result: <Result><StepSequence><RealMeasure type="Text" value="Batman"/></StepSequence><StepSequence><RealMeasure type="Text" value="Superman"/></StepSequence></Result>
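The group in value="(.*?)" has to be non-greedy as well. A greedy (.*) runs on to the last quote in the input that still lets the rest of the pattern match, which is why your regex swallowed both elements in one match.
I also changed (\w*) to (\w+) to ensure that the type is not empty.
Finally, in a JavaScript regex literal the / in </RealMeasure> has to be escaped, as in <\/RealMeasure>.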
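To see the effect of the quantifier on this input, here is a minimal sketch that runs the greedy regex from the question next to a stricter variant. The stricter variant is my own assumption, not something the input requires: it uses negated character classes ([^"]* and [^>]*) instead of lazy quantifiers, so the value cannot run past its closing quote and the tag cannot run past its closing >. It also drops the outer group, so the type and value become $1 and $2. The names xml, greedy, strict and converted are only illustrative.

```javascript
// Input exactly as given in the question (one line, no whitespace between tags).
const xml = '<Result><StepSequence><RealMeasure><Text value="Batman"/></RealMeasure></StepSequence><StepSequence><RealMeasure><Text value="Superman"/></RealMeasure></StepSequence></Result>';

// Greedy regex from the question: "(.*)" extends to the last quote that
// still allows the rest of the pattern to match.
const greedy = /<RealMeasure><((\w*)\s+value="(.*)".*?)><\/RealMeasure>/g;

// Assumed alternative: negated classes cannot cross a quote or a tag end.
// Group 1 is the element name, group 2 is the attribute value.
const strict = /<RealMeasure><(\w+)\s+value="([^"]*)"[^>]*><\/RealMeasure>/g;

// Collect the full text of every greedy match.
const greedyMatches = [...xml.matchAll(greedy)].map((m) => m[0]);

// Collect the captured groups of every strict match.
const strictMatches = [...xml.matchAll(strict)].map((m) => ({ type: m[1], value: m[2] }));

// Rewrite each matched element into the flattened form.
const converted = xml.replace(strict, '<RealMeasure type="$1" value="$2"/>');

console.log(greedyMatches.length); // 1 (both elements swallowed by one match)
console.log(strictMatches.length); // 2
console.log(converted);
```

Either the lazy (.*?) or the negated class works for the sample input. The negated class is just a bit more defensive if a value or tag ever contains unexpected text, at the cost of not allowing an escaped quote inside the value.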