
Javascript substring does not extract the correct data


I use JavaScript's indexOf and substr to split a piece of HTML into 3 parts.

Expected result

Part1:

<p>Some text</p>

BEFORE

Part2:

<!-- KEYWORDS: one -->

Part3:

<p>Some more text</p>

Real results

Run the code below with the console open and you will see that the parts are all over the place.

class Inside {
  getPos(selector, startsWith, endsWith) {
    const html = document.querySelector(selector).innerHTML;
    let data = {};
    data.pos = {};
    data.html = {};

    data.pos.start = html.indexOf(startsWith);
    data.pos.end = html.indexOf(endsWith, data.pos.start);
    data.pos.finish = html.length;

    data.html.before = html.substr(0, data.pos.start);
    data.html.match = html.substr(data.pos.start, data.pos.end);
    data.html.after = html.substr(data.pos.end, data.pos.finish);

    console.log(data.pos);
    console.log(data.html.before);
    console.log(data.html.match);
    console.log(data.html.after);
  }
}

document.addEventListener("DOMContentLoaded", () => {
  const InsideObj = new Inside();
  InsideObj.getPos('main', '<!-- KEYWORDS:', '-->');
});
<main>
   <p>Some text</p>

   BEFORE
   <!-- KEYWORDS: one -->
   AFTER

    <p>Some more text</p>
  </main>

Question

I can't figure out why it does not add up. Is it substr or indexOf? Is it some kind of multibyte or encoding problem that I need to be aware of?


Solution

  • substr() does not take two string positions. It takes one position and one length, like this: substr(startingPosition, subLength).

    Here is a simplified example:

    let str = 'Hello Test';
    
    let startPos = 3, endPos = 5; // We expect a string with 2 chars starting at position 3
    
    console.log(str.substr(startPos, endPos)); // WRONG! Prints "lo Te" (5 chars starting at position 3)
    console.log(str.substr(startPos, endPos - startPos)); // Correct: prints "lo"
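
    If you would rather keep working with two positions, substring() and slice() both take a start index and an end index (end exclusive), which is what the question's code assumed substr() did. A small comparison on the same sample string, as a sketch rather than a recommendation for your exact case:

    ```javascript
    const str = 'Hello Test';
    const startPos = 3, endPos = 5;

    const byLength = str.substr(startPos, endPos - startPos); // start + length
    const bySubstring = str.substring(startPos, endPos);      // start + end (exclusive)
    const bySlice = str.slice(startPos, endPos);              // start + end (exclusive)

    console.log(byLength, bySubstring, bySlice); // lo lo lo
    ```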

    Here is your fixed code. To get the length between the two positions, subtract the start position from the end position (as in the example above). You also need to account for the length of the end marker itself, because indexOf returns the position where the marker begins, not where it ends.

    class Inside {
      getPos(selector, startsWith, endsWith) {
        const html = document.querySelector(selector).innerHTML;
        let data = {};
        data.pos = {};
        data.html = {};
    
        data.pos.start = html.indexOf(startsWith);
        data.pos.end = html.indexOf(endsWith, data.pos.start);
        data.pos.finish = html.length;
    
        data.html.before = html.substr(0, data.pos.start);
        
        // Length of the match: end - start, plus the length of the end marker
        data.html.match = html.substr(data.pos.start, data.pos.end - data.pos.start + endsWith.length);
        
        // Everything after the end marker; with the length omitted, substr() runs to the end of the string
        data.html.after = html.substr(data.pos.end + endsWith.length);
    
        console.log(data.pos);
        console.log(data.html.before);
        console.log(data.html.match);
        console.log(data.html.after);
      }
    }
    
    document.addEventListener("DOMContentLoaded", () => {
      const InsideObj = new Inside();
      InsideObj.getPos('main', '<!-- KEYWORDS:', '-->');
    });
    <main>
       <p>Some text</p>
    
       BEFORE
       <!-- KEYWORDS: one -->
       AFTER
    
        <p>Some more text</p>
      </main>
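
As a variation on the fix above, the same split can be written with slice(), which takes start and end positions directly, so no length arithmetic is needed. This is only a sketch: splitAround is a hypothetical helper name, not part of the original code, and it works on a plain string so it can be tried outside the browser. It also returns null when either marker is missing, since indexOf returns -1 in that case.

```javascript
// Hypothetical helper (not part of the original code): splits a string into
// the text before a marked section, the section itself, and the text after it.
function splitAround(text, startsWith, endsWith) {
  const start = text.indexOf(startsWith);
  if (start === -1) return null; // opening marker not found

  // Look for the closing marker only after the opening one.
  const endMarker = text.indexOf(endsWith, start + startsWith.length);
  if (endMarker === -1) return null; // closing marker not found

  const end = endMarker + endsWith.length; // first index after the match

  return {
    before: text.slice(0, start),
    match: text.slice(start, end),
    after: text.slice(end),
  };
}

const sample = '<p>Some text</p> BEFORE <!-- KEYWORDS: one --> AFTER <p>Some more text</p>';
const parts = splitAround(sample, '<!-- KEYWORDS:', '-->');
console.log(parts.match); // <!-- KEYWORDS: one -->
```

In the browser you would pass document.querySelector('main').innerHTML as the first argument instead of the sample string.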