javascript, node.js, casperjs

Same output but different character length


I have this script:

    var last_build_no = this.getTitle();
    var plain_build_no = "#53 ";
    console.log(last_build_no.length);
    console.log(plain_build_no.length);

And this is the output:

    5
    4
    '#5​3 '
    '#53 '

What could be the reason for this difference, and how can I convert these strings to the same format?


Because of this difference my test case fails, even though the two strings look the same:

    test.assertEquals(last_build_no, plain_build_no, "Last Build page has expected title");

Solution

  • The string returned by getTitle() contains a "zero width space" (U+200B) between the 5 and the 3. You can see it if you log the character codes:

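    // ES5 function syntax, since PhantomJS (CasperJS's default engine) lacks arrow-function support
    last_build_no.split("").forEach(function (c) {
      console.log(c.charCodeAt(0));
    });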
    
    /* 
      Outputs:
      35
      53
      8203  <-- http://www.fileformat.info/info/unicode/char/200b/index.htm
      51
      32
    */
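
    Decimal codes are easy to misread, so it can help to print them in the usual U+XXXX notation instead. The following is a minimal sketch, not part of CasperJS; the helper name describeCodeUnits is made up for illustration, and the input string stands in for the value returned by getTitle():

```javascript
// Hypothetical helper (not a CasperJS API): lists each UTF-16 code unit as U+XXXX.
function describeCodeUnits(str) {
  var out = [];
  for (var i = 0; i < str.length; i++) {
    var hex = str.charCodeAt(i).toString(16).toUpperCase();
    // Left-pad to at least four hex digits, e.g. "23" -> "0023".
    while (hex.length < 4) {
      hex = "0" + hex;
    }
    out.push("U+" + hex);
  }
  return out;
}

// Assumed input: the title from the question, with the zero width space written as an escape.
console.log(describeCodeUnits("#5\u200B3 ").join(" "));
// U+0023 U+0035 U+200B U+0033 U+0020
```

    The U+200B entry makes the invisible character stand out immediately.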
    

    Unicode includes several zero-width characters, among them:

    • U+200B zero width space
    • U+200C zero width non-joiner
    • U+200D zero width joiner
    • U+FEFF zero width no-break space

    You can remove these characters with a simple regular expression:

    // The \u200B escape makes the otherwise invisible zero width space explicit
    var last_build_no = '#5\u200B3 '.replace(/[\u200B-\u200D\uFEFF]/g, '');
    console.log(last_build_no.length);  // Output: 4
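
    Applied to the failing test from the question, the same regular expression could be wrapped in a small function and used on the title before comparing. This is only a sketch: stripZeroWidth is a hypothetical name, and titleFromPage stands in for the value of this.getTitle().

```javascript
// Hypothetical helper: removes U+200B..U+200D and U+FEFF from a string.
function stripZeroWidth(str) {
  return str.replace(/[\u200B-\u200D\uFEFF]/g, "");
}

var titleFromPage = "#5\u200B3 "; // assumed stand-in for this.getTitle()
var expected = "#53 ";

console.log(titleFromPage === expected);                 // false
console.log(stripZeroWidth(titleFromPage) === expected); // true

// In the CasperJS test, the assertion would then read:
// test.assertEquals(stripZeroWidth(last_build_no), plain_build_no,
//                   "Last Build page has expected title");
```

    Stripping only the listed code points keeps ordinary whitespace, such as the trailing space in the title, untouched.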
    
