
Hacklang: why were container classes replaced with built-in types?


Here is a quote from the Hack documentation:

Legacy Vector, Map, and Set

These container types should be avoided in new code; use dict, keyset, and vec instead.

Early in Hack's life, the library provided mutable and immutable generic class types called: Vector, ImmVector, Map, ImmMap, Set, and ImmSet. However, these have been replaced by vec, dict, and keyset, whose use is recommended in all new code. Each generic type had a corresponding literal form. For example, a variable of type Vector might be initialized using Vector {22, 33, $v}, where $v is a variable of type int.

I wonder why this change was made. One of PHP's weaknesses is its poor OOP standard library. For example, str_replace and array_values are free functions rather than methods on the string/array type itself. The PHP standard library is also inconsistent: sometimes the array must be passed as the first parameter, other times as the second.

I was glad to see that Hack introduced true OOP encapsulation for collections.
Do you know why they stepped back and wrote utility namespaces such as C\, Dict\, Keyset\ and Vec\?
Are there plans to add methods to built-in types in the future (e.g. Str\starts_with => "toto"->startsWith("t"))?


Solution

  • Based on Dwayne Reeves' blog post introducing HSL, it seems that the main advantage is the fact that arrays are native values, not objects. This has two important consequences:

    1. For users, the semantics differ when values are passed as arguments. Objects are passed by handle, so mutations in the callee affect the caller's object. Values, on the other hand, are copied on write after being passed, so without references (which are slated to be removed from Hack entirely) the callee can't mutate the caller's value. The only exception is the much stricter inout parameter.

      The article cites the invariance of the mutable containers (Vector, Set, etc.) and, more generally, how shared mutable state couples functions more tightly together. The soundness issues discussed in the article are somewhat moot, because there were also immutable object containers (ImmVector, ImmSet, etc.). However, since these interfaces were written in userland, variance rules placed tight constraints on their method signatures. This has tangible consequences: ImmMap<Tk, +Tv> is invariant in Tk solely because of the (function(Tk): Tv) getter. Meanwhile, dict<+Tk, +Tv> is covariant in both type parameters, thanks to the inherent mutation protection of copy-on-write.

    2. For the compiler, static values can be allocated quickly and persist for the lifetime of the server. Objects, on the other hand, can in general have arbitrarily complicated construction routines, and it seems the collection objects weren't going to be special-cased.
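    To make the first point concrete, here is a minimal sketch of the difference between object and value semantics, plus the covariance of dict. It assumes a .hack file run on an HHVM version that ships the HSL. Animal, Dog and every function name below are hypothetical and exist only for illustration.

```hack
use namespace HH\Lib\C;

// Hypothetical classes, used only to illustrate variance.
class Animal {}
class Dog extends Animal {}

function push_to_vector(Vector<int> $v): void {
  $v[] = 4; // Vector is an object: this mutates the caller's instance.
}

function push_to_vec(vec<int> $v): void {
  $v[] = 4; // vec is a value: only the callee's local copy changes.
}

function push_inout(inout vec<int> $v): void {
  $v[] = 4; // inout: the new value is written back to the caller's variable.
}

function count_animals(dict<arraykey, Animal> $animals): int {
  return C\count($animals);
}

<<__EntryPoint>>
function main(): void {
  $vector = Vector {1, 2, 3};
  push_to_vector($vector);
  invariant($vector->count() === 4, 'the callee mutated the shared object');

  $vec = vec[1, 2, 3];
  push_to_vec($vec);
  invariant(C\count($vec) === 3, 'the caller still sees its own value');

  push_inout(inout $vec);
  invariant(C\count($vec) === 4, 'inout makes the mutation explicit');

  // dict<int, Dog> is accepted where dict<arraykey, Animal> is expected,
  // because dict is covariant in both type parameters.
  $dogs = dict[1 => new Dog()];
  invariant(count_animals($dogs) === 1, 'one entry expected');
}
```

    The same count_animals signature written with Map<arraykey, Animal> would reject a Map<int, Dog> argument at type-check time, since Map is invariant in both parameters.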

    I will also mention that for most use cases there is minimal difference even in code style: e.g. chains of -> method calls can be replaced directly with the |> pipe operator. There is also no longer a boundary between the privileged "standard functions" and custom user functions on collection types. Finally, the collection classes were of course final, so their being objects didn't offer the end user any actual hierarchical or polymorphic advantages anyway.
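    The translation from method chains to pipes can be sketched as follows, again assuming a .hack file on an HHVM version that ships the HSL. double_all is a hypothetical user function, included to show that custom functions slot into a pipeline exactly like the Vec\ ones.

```hack
use namespace HH\Lib\Vec;

// Hypothetical user-defined helper; it composes just like an HSL function.
function double_all(vec<int> $xs): vec<int> {
  return Vec\map($xs, $x ==> $x * 2);
}

<<__EntryPoint>>
function main(): void {
  // Legacy style: methods chained on a collection object.
  $source = Vector {1, 2, 3, 4};
  $legacy = $source
    ->filter($x ==> $x % 2 === 0)
    ->map($x ==> $x * 2);

  // Value style: each step receives the previous result as $$.
  $modern = vec[1, 2, 3, 4]
    |> Vec\filter($$, $x ==> $x % 2 === 0)
    |> double_all($$);

  // Both pipelines keep the even numbers and double them.
  invariant(vec($legacy) === $modern, 'both styles should agree');
}
```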