javascript, html, regex, mathml

How to match the string between two words, and repeat the match for every pair of those words in the string, with regex?


So I want to extract MathML from HTML. For example, I have this string:

<p>Task:&nbsp;</p><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math><p>&nbsp;find&nbsp;</p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math><p>.</p>

I want to match
<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math> and <math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math>

How can I achieve this? I've tried the expression /(<math)(.*)(math>)/g, but it matches everything between the first <math and the last math>.


Solution

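  • By default, quantifiers are greedy: .* consumes as much as it can, so the match runs from the first <math to the last math>. Make the quantifier lazy by placing ? after the *, so that each match stops at the nearest math>.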

    const str = `<p>Task:&nbsp;</p><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>x</mi><mo>+</mo><mn>2</mn><mo>=</mo><mn>5</mn></mrow></math><p>&nbsp;find&nbsp;</p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup></math><p>.</p>`;
    
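    // .*? is lazy: each match ends at the nearest "math>" rather than the last one.
    const regex = /(<math)(.*?)(math>)/g;
    
    // With the g flag, match() returns an array of the full matches (or null if there are none).
    const result = str.match(regex);
    
    console.log(result.length); // 2
    console.log(result);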
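
The pattern above works for the single-line string in the question, but . does not match line breaks unless the s flag is set, and match() returns null when nothing matches. A minimal sketch of a slightly more defensive variant follows, assuming the MathML may span several lines; extractMathML and sample are hypothetical names used only for illustration, not part of the original answer.

```javascript
// Hypothetical helper (not from the original answer): returns every
// <math>...</math> block found in an HTML string, or [] if there is none.
function extractMathML(html) {
  // <math\b   : the opening tag name, followed by a word boundary
  // [\s\S]*?  : lazily match any characters, including line breaks
  // <\/math>  : stop at the nearest closing tag
  const pattern = /<math\b[\s\S]*?<\/math>/g;

  // matchAll yields one match object per occurrence; m[0] is the full match.
  return Array.from(html.matchAll(pattern), (m) => m[0]);
}

// Sample input with a line break inside the first <math> element.
const sample =
  '<p>Task:</p><math xmlns="http://www.w3.org/1998/Math/MathML">\n' +
  '<mi>x</mi></math><p>find</p><math><mn>2</mn></math><p>.</p>';

const blocks = extractMathML(sample);
console.log(blocks.length); // 2
console.log(blocks);
```

Like any regex applied to markup, this assumes that <math> elements are not nested inside one another; for arbitrary HTML, a real parser is the safer choice.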