Tags: javascript, string, class, v8, string-length

Isn't string.length actually a method in JavaScript?


I would like to get a better understanding of what is actually going on when I find the length of a string. I tried looking on W3, ECMA, and at the V8 Ignition website, but without much luck.

I keep reading that 'JavaScript treats primitive values as objects when executing methods and properties.' But I can't seem to find out how exactly this happens. If I call a method/property on a primitive, which I assume gets interpreted as an object by Ignition, doesn't the String class need to call a function at some point to iterate the string? I feel like myString.length should be called a method and String.length could MAYBE be called a property, depending on at which point the "property" is found and how it's found.

Basically, I don't understand why it's touted as a property if it doesn't seem to be inherent and has to be fetched/determined. That seems like a method to me (let alone the fact that string.length isn't even a real thing and is interpreted).


Solution

  • (V8 developer here.)

    I can see several issues here that can be looked at separately:

    1. From a language specification perspective, is something a method or a property?

    Intuitively, the distinction is: if you write a function call like obj.method(), then it's a method; if you write obj.property (no ()), then it's a property.

    Of course in JavaScript, you could also say that everything is a property, and in case the current value of the property is a function, then that makes it a method. So obj.method gets you a reference to that function, and obj.method() gets and immediately calls it:

    var obj = {};
    obj.foo = function() { console.log("function called"); return 42; };
    var x = obj.foo();  // A method!
    var func = obj.foo;  // A property!
    x = func();  // A call!
    obj.foo = 42;
    obj.foo();  // A TypeError!
    

    2. When it looks like a property access, is it always a direct read/write from/to memory, or might some function get executed under the hood?

    The latter. JavaScript itself even provides this capability to objects you can create:

    var obj = {};
    Object.defineProperty(obj, "property", {
      get: function() { console.log("getter was called"); return 42; },
      set: function(x) { console.log("setter was called"); }
    });
    // *Looks* like a pair of property accesses, but will call getter and setter:
    obj.property = obj.property + 1;
    

    The key is that users of this obj don't have to care that getters/setters are involved; to them, .property looks like a property. This is of course very much intentional: implementation details of obj are abstracted away. You could change the code that sets up obj and its .property from a plain property to a getter/setter pair, or vice versa, without having to update the other parts of the code that read/write it.
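
    To make that last point concrete, here is a minimal sketch. The names makeCounterV1, makeCounterV2, and increment are made up for this illustration; the only point is that the calling code stays the same when a plain property is swapped for a getter/setter pair.

```javascript
// Version 1 (hypothetical): "count" is a plain data property.
function makeCounterV1() {
  return { count: 0 };
}

// Version 2 (hypothetical): same external interface, but "count" is now
// backed by a getter/setter pair and a hidden variable.
function makeCounterV2() {
  let value = 0;
  const counter = {};
  Object.defineProperty(counter, "count", {
    get() { return value; },
    set(v) { value = v; },
    enumerable: true,
  });
  return counter;
}

// The calling code is identical for both versions.
function increment(counter) {
  counter.count = counter.count + 1;
  return counter.count;
}

const resultV1 = increment(makeCounterV1());
const resultV2 = increment(makeCounterV2());
console.log(resultV1, resultV2);  // 1 1
```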

    Some built-in objects rely on this trick. The most common example is arrays' .length: while it's specified to be a property with certain "magic" behavior, the most straightforward way for engines to implement this is to use a getter/setter pair under the hood, where in particular the setter does the work of truncating any extra array elements if you set the length to a smaller value than before.
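
    A short illustration of that "magic" behavior as seen from JavaScript. This only shows the observable effects; how an engine implements them internally is not visible from here.

```javascript
const arr = [10, 20, 30, 40, 50];

arr.length = 2;                    // Looks like a plain write, but truncates the array.
console.log(arr.join(","));        // 10,20
const hasIndexTwo = 2 in arr;
console.log(hasIndexTwo);          // false: the element at index 2 is gone

arr[9] = 99;                       // Writing past the end updates length as well.
console.log(arr.length);           // 10

// At the language level, length is still reported as a data property
// (it has "value"/"writable"), not as an accessor with "get"/"set".
const lengthDescriptor = Object.getOwnPropertyDescriptor(arr, "length");
console.log("get" in lengthDescriptor);  // false
console.log(lengthDescriptor.value);     // 10
```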

    3. So what does "abc".length do in V8?

    It reads a property directly from memory. All strings in V8 always have a length field internally. As commenters have pointed out, JavaScript strings are immutable, so the internal length field is written only once (when the string is created), and then becomes a read-only property.
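
    The read-only nature is also observable from JavaScript itself. The sketch below only demonstrates language-level behavior, not V8's internal field; tryToOverwriteLength is a helper name invented for this example.

```javascript
const str = "abc";
console.log(str.length);  // 3

// On a String wrapper object, "length" is an own data property that is
// neither writable nor configurable.
const desc = Object.getOwnPropertyDescriptor(new String(str), "length");
console.log(desc.value, desc.writable, desc.configurable);  // 3 false false

// In strict mode, trying to assign to it throws a TypeError.
// (In sloppy mode the assignment is silently ignored instead.)
function tryToOverwriteLength(s) {
  "use strict";
  try {
    s.length = 10;
    return "no error";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other error";
  }
}

const overwriteResult = tryToOverwriteLength(str);
console.log(overwriteResult);  // TypeError
console.log(str.length);       // 3 (unchanged)
```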

    Of course this is an internal implementation detail. Hypothetically, an engine could use a "C-style" string format internally, and then it would have to use a strlen()-like function to determine a string's length when needed. However, on a managed heap, being able to quickly determine each object's size is generally important for performance, so I'd be surprised if an engine actually made this choice. "Pascal-style" strings, where the length is stored explicitly, are more suitable for JavaScript and similar garbage-collected languages.
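
    To illustrate the difference between the two layouts, here is a toy sketch in JavaScript. It is not how V8 or any other engine represents strings; the helper names are invented, and it assumes ASCII-only text without embedded NUL characters.

```javascript
// Hypothetical "C-style" layout: the bytes, followed by a 0 terminator.
function makeCString(text) {
  const bytes = new Uint8Array(text.length + 1);  // last byte stays 0
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
  return bytes;
}

// strlen()-like: has to scan for the terminator, so it takes time
// proportional to the length of the string.
function cStringLength(bytes) {
  let n = 0;
  while (bytes[n] !== 0) n++;
  return n;
}

// Hypothetical "Pascal-style" layout: the length is stored once, at
// creation time, next to the data.
function makePascalString(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
  return { length: text.length, bytes };
}

const cStr = makeCString("hello");
const pStr = makePascalString("hello");
console.log(cStringLength(cStr));  // 5, found by scanning
console.log(pStr.length);          // 5, a plain field read
```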

    So, in particular, I'd say it's fair to assume that reading myString.length in JavaScript is always a very fast operation regardless of the string's length, because it does not iterate the string.

    4. What about String.length?

    Well, this doesn't have anything to do with strings or their lengths :-)

    String is a function (e.g. you can call String(123) to get "123"), and all functions have a length property describing their number of formal parameters:

    function two_params(a, b) { }
    console.log(two_params.length);  // 2
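
    A few more cases, as a quick sketch (withDefault and withRest are names made up for this example). Note that parameters with default values, anything after them, and rest parameters are not counted:

```javascript
console.log(String.length);  // 1

function withDefault(a, b = 2, c) { }
function withRest(a, ...rest) { }

console.log(withDefault.length);  // 1: counting stops at the first default value
console.log(withRest.length);     // 1: the rest parameter is not counted

// Like a string's length, a function's length is a non-writable data property.
const fnLengthDesc = Object.getOwnPropertyDescriptor(String, "length");
console.log(fnLengthDesc.value, fnLengthDesc.writable);  // 1 false
```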
    

    As for whether that's a "simple property" or a getter under the hood: there's no reason to assume that it's not a simple property, but there's also no reason to assume that engines can't internally do whatever they want (so long as there's no observable functional difference) if they think it increases performance or saves memory or simplifies things or improves some other metric they care about :-)
    (And engines can and do make use of this freedom, for various forms of "lazy"/on-demand computation, caching, optimization -- there are plenty of internal function calls that you probably wouldn't expect, and on the flip side what you "clearly see" as a function call in the JS source might (or might not!) get inlined or otherwise optimized away. The details change over time, and across different engines.)