javascript, newline, line-by-line, character-codes

Is it possible to test for breaking spaces?


I'm trying to parse minimal mark-up text by lines. Currently I have a for loop that parses letter by letter. See the code below:

Text:

<element id="myE">
This is some text that
represents accurately the way I 
have written my html
file.
</element>

Code:

var list = document.getElementById("myE").innerHTML;
var tallie = 0;

// Walk the string one character at a time.
for (var i = 0; i < list.length; i++) {
  if (/*list[i] == " "*/ true) {
    tallie += 1; // count this character
    console.log(list[i]);
  }
}

console.log(tallie);

As expected, the text embedded in the element renders in the DOM as though it were a continuous, properly formatted string. But what I'm finding is that the console recognizes the difference between a non-breaking space and a new line, where " " and

"
"

represent the two respectively.

Since the console appears to know the difference, it seems there should be a way to test for it. If you uncomment the condition, the loop will start testing for non-breaking spaces. I think there is another way to do this using the character encoding string (not &nbsp, another one). It seems reasonable, then, to expect to be able to find a character code for a breaking space. Unfortunately, I cannot find one.

Long story short, how can I achieve true line-by-line parsing of an HTML file?


Solution

  • Newline characters are written as \n in JavaScript strings (the line feed, character code 10). Some files, typically those saved on Windows, use a carriage return followed by a line feed, \r\n (see Wikipedia on Newline). These should not be confused with a non-breaking space, &nbsp; or &#160;, which is used when you want the browser to display a space without wrapping the line at that point, or to keep multiple spaces from being collapsed together.
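
As a rough sketch of how that could look in practice: the snippet below splits a string into lines on \n or \r\n and counts characters by their character codes (10 for a line feed, 32 for an ordinary space, 160 for a non-breaking space). The `text` value is an assumed stand-in for what `innerHTML` might return for the element in the question, and the helper names `splitLines` and `countCharCode` are made up for illustration.

```javascript
// Assumed stand-in for document.getElementById("myE").innerHTML.
// One CRLF line ending is included on purpose to show that both styles are handled.
const text =
  "\nThis is some text that\r\nrepresents accurately the way I \nhave written my html\nfile.\n";

// Hypothetical helper: split on LF or CRLF so the "\r" does not linger at line ends.
function splitLines(str) {
  return str.split(/\r?\n/);
}

// Hypothetical helper: count how often a given character code occurs.
function countCharCode(str, code) {
  let count = 0;
  for (let i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) === code) {
      count += 1;
    }
  }
  return count;
}

const lines = splitLines(text);

// JSON.stringify makes leading and trailing whitespace visible in the console.
lines.forEach(function (line, n) {
  console.log(n, JSON.stringify(line));
});

console.log("line feeds:", countCharCode(text, 10));           // "\n"
console.log("ordinary spaces:", countCharCode(text, 32));      // " "
console.log("non-breaking spaces:", countCharCode(text, 160)); // "\u00A0"
```

In the browser you would replace the `text` literal with the element's `innerHTML` (or `textContent`). Note that the first and last entries of `lines` are empty strings here, because the sample text starts and ends with a newline.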