node.js undefined-behavior comparison-operators

Why does Node.js allow this seemingly invalid character sequence?


I was trying to find a way to distinguish between a line break in a file (an actual new line) and a typed newline (the characters \n in the file). While playing around in the REPL, I made a typo in a comparison and, to my surprise, Node.js didn't complain. It even gave what I believe is undefined behavior, unless I've completely missed something in my years of Node.js intimacy. I also discovered a couple of other things while playing around, which I'll ask about below.

Code is at the bottom of the post.

The main question is:

Why is Node.js not complaining about the syntax at the last two comparisons (==+ and ==-)? Is that somehow valid syntax somewhere? And why does it make the comparison true when without the trailing +/- it is false? (updates in post comments)

The main side question is:

Why do the 'Buffer separate self comparison' and 'Buffer comparison' results come out as false when all the other tests are true? And why does a buffer not compare equal to another buffer holding the same data?

Also:

How can I reliably distinguish between a return in a file and a typed newline as described above?

Here's the code:


const nl = '\n'
const newline = `
`

const NL = Buffer.from('\n')
const NEWLINE = Buffer.from(`
`)
const NEWLINE2 = Buffer.from(`
`)
console.log("Buffer separate self comparison: "+(NEWLINE2 == NEWLINE))
console.log("Buffer comparison: "+(NL == NEWLINE))
console.log("Non buffer comparison: "+(nl == newline))
console.log("Buffer self comparison 1: "+(NL == NL))
console.log("Buffer self comparison 2: "+(NEWLINE == NEWLINE))
console.log("Buffer/String comparison 1: "+(nl == NL))
console.log("Buffer/String comparison 2: "+(newline == NEWLINE))
console.log("Buffer/String cross comparison 1: "+(nl == NEWLINE))
console.log("Buffer/String cross comparison 2: "+(newline == NL))
console.log("Buffer toString comparison: "+(NL.toString() == NEWLINE.toString()))
console.log("Strange operator comparison 1: "+(NL ==+ NEWLINE))
console.log("Strange operator comparison 2: "+(NL ==- NEWLINE))

Solution

  • NEWLINE2 == NEWLINE (false)
    NL == NEWLINE (false)
    

    An expression comparing Objects is only true if the operands reference the same Object. src

    This is not the case: they're two separate objects, even if their initial values are alike, so the result is false.

    Edit: If you want to compare the contents rather than the identity of two Buffers, you can use Buffer.compare. Buffer.compare(NEWLINE2, NEWLINE) === 0 means both hold the same bytes. NEWLINE2.equals(NEWLINE) performs the same check and returns a boolean directly.
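A minimal sketch of identity versus content comparison, reusing the Buffer names from the question. Both Buffer.compare and buf.equals are part of Node's Buffer API; the result variable names are only illustrative.

```javascript
// Two separate Buffers holding the same single byte (0x0A).
const NEWLINE = Buffer.from('\n')
const NEWLINE2 = Buffer.from('\n')

// == on two objects checks identity, not contents.
const sameObject = (NEWLINE2 == NEWLINE)

// Both of these compare the underlying bytes instead.
const sameBytesViaCompare = (Buffer.compare(NEWLINE2, NEWLINE) === 0)
const sameBytesViaEquals = NEWLINE2.equals(NEWLINE)

console.log({ sameObject, sameBytesViaCompare, sameBytesViaEquals })
```

Only the last two checks look at the bytes, which is why they can succeed where == does not.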

    nl == newline (true)
    

    Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions. src

    The strings are equal, so true.

    NL == NL (true)
    NEWLINE == NEWLINE (true)
    

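    An expression comparing Objects is only true if the operands reference the same Object. src

    Here each Buffer is compared with itself, so both operands reference the same object and the result is true.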

    nl == NL (true)
    newline == NEWLINE (true)
    nl == NEWLINE (true)
    newline == NL (true)
    

    What's happening here is that you're comparing two different types. One is a string, the other an object.

    Each of these operators will coerce its operands to primitives before a comparison is made. If both end up as strings, they are compared using lexicographic order, otherwise they are cast to numbers to be compared. A comparison against NaN will always yield false. src

    Buffer has a toString method, so that is called in order to have the same primitive type on both sides of the ==. The result of this method is a string containing \n, and '\n' == '\n' is true.

    As an aside, if your comparison were NEWLINE == 0, numeric coercion would come into play:

    ' 1 ' == 1 equals true. When casting, whitespace is discarded so ' 1 ' will be cast into a number with value 1. The resulting comparison would be 1 == 1.

    A string consisting only of whitespace characters is coerced to 0. The Buffer is first converted to a string ('\n') and then to a number, so the comparison becomes 0 == 0 and the result would have been true.
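The two coercion steps described above can be made explicit. This is only a sketch of what == does implicitly; the intermediate variable names are illustrative.

```javascript
const NL = Buffer.from('\n')

// Step 1: == converts the Buffer to a primitive, which ends up calling toString().
const asString = NL.toString()      // the one-character string '\n'

// Step 2: when the other operand is a number, the string is converted as well.
const asNumber = Number(asString)   // whitespace-only strings convert to 0

const stringComparison = ('\n' == NL)   // string vs object
const numberComparison = (NL == 0)      // object vs number

console.log({ asString: JSON.stringify(asString), asNumber, stringComparison, numberComparison })
```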

    NL.toString() == NEWLINE.toString() (true)
    

    Two strings are strictly equal when they have the same sequence of characters, same length, and same characters in corresponding positions. src

    The strings are equal, so true.

    NL ==+ NEWLINE (true)
    NL ==- NEWLINE (true)
    

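    This is the same as writing NL == +NEWLINE: the parser reads ==+ as the == operator followed by a unary +, so the syntax is perfectly valid. The unary + or - explicitly converts NEWLINE to a Number, and '\n' converts to 0. Because the right-hand side is now a number rather than an object, NL is coerced as well instead of being compared by identity, which is why the result flips from false to true. After conversion you're doing these comparisons: 0 == +0 and 0 == -0. Negative and positive zero are considered equal.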
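To see how the expression is split up, the sketch below evaluates the unary part on its own first. The Buffers are the ones from the question; the other variable names are illustrative.

```javascript
const NL = Buffer.from('\n')
const NEWLINE = Buffer.from(`
`)

// `NL ==+ NEWLINE` is tokenized as `NL == (+NEWLINE)`.
const positive = +NEWLINE   // Number('\n'), which is 0
const negative = -NEWLINE   // the same conversion, then negated: -0

// == treats +0 and -0 as equal; Object.is can tell them apart.
const zerosLooselyEqual = (positive == negative)
const zerosIdentical = Object.is(positive, negative)

const withoutUnary = (NL == NEWLINE)     // two objects: identity check
const withUnaryPlus = (NL ==+ NEWLINE)   // object vs number: both sides become 0
const withUnaryMinus = (NL ==- NEWLINE)

console.log({ zerosLooselyEqual, zerosIdentical, withoutUnary, withUnaryPlus, withUnaryMinus })
```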

    None of the behavior here is 'undefined'.

    Apart from "huh, that's neat" there's really very little reason not to use the strict equality operator (===), which does not coerce its operands to a common primitive type.


    As for your question:

    A newline in a file (\n) is the same as a newline in a self-typed string ('\n'). They're both ASCII or Unicode character 0x0A, byte-wise.

    Some documents (typically those created on Windows) end each line with a carriage return followed by a newline character. A line break then consists of two characters: 0x0D 0x0A (or \r\n).
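If you want to check what a given piece of text actually contains, something along these lines might help. It is only a sketch: describeLineBreaks is a hypothetical helper, not a Node API, and the third counter assumes that "typed newline" could also mean the literal two characters backslash and n (bytes 0x5C 0x6E), which is not a line break at all.

```javascript
// Hypothetical helper: counts the kinds of line break found in a string.
function describeLineBreaks(text) {
  // Carriage return followed by line feed (0x0D 0x0A).
  const crlf = (text.match(/\r\n/g) || []).length
  // Lone line feed: a \n that is not preceded by \r.
  const lf = (text.match(/(?<!\r)\n/g) || []).length
  // A literal backslash followed by the letter n, i.e. two ordinary characters.
  const escaped = (text.match(/\\n/g) || []).length
  return { crlf, lf, escaped }
}

const sample = 'one\r\ntwo\nthree\\nfour'
console.log(describeLineBreaks(sample))

// The byte-level difference between a real newline and a typed backslash-n:
console.log(Buffer.from('\n'))    // one byte
console.log(Buffer.from('\\n'))   // two bytes
```

For a file on disk, the same helper could be applied to the string returned by fs.readFileSync(path, 'utf8').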