dictionary, clojure, language-implementation

Why does binding affect the type of my map?


I was playing around in the REPL and I got some weird behavior:

Clojure 1.4.0
user=> (type {:a 1})
clojure.lang.PersistentArrayMap
user=> (def x {:a 1})
#'user/x
user=> (type x)
clojure.lang.PersistentHashMap

I thought that all small literal maps were instances of PersistentArrayMap, but apparently that's not the case once the map has been bound with def. Why would using def cause Clojure to choose a different representation for my little map? I know it's probably just some strange implementation detail, but I'm curious.


Solution

  • This question made me dig into the Clojure source code. I just spent a few hours putting print statements in the source in order to figure this out.

    It turns out the two map expressions are evaluated through different code paths.

    (type {:a 1}) causes Java byte-code to be emitted and run. The emitted code uses clojure.lang.RT.map() to construct the map, which returns a PersistentArrayMap for small maps:

    static public IPersistentMap map(Object... init){
        if(init == null)
            return PersistentArrayMap.EMPTY;
        else if(init.length <= PersistentArrayMap.HASHTABLE_THRESHOLD)
            return PersistentArrayMap.createWithCheck(init);
        return PersistentHashMap.createWithCheck(init);
    }
    

    When evaluating (def x {:a 1}), at least from the REPL, no byte-code is emitted. The constant map is parsed as a PersistentHashMap in clojure.lang.Compiler$MapExpr.parse(), which returns it wrapped in a ConstantExpr:

    else if(constant)
    {
    IPersistentMap m = PersistentHashMap.EMPTY;
    for(int i=0;i<keyvals.length();i+= 2)
        {
        m = m.assoc(((LiteralExpr)keyvals.nth(i)).val(), ((LiteralExpr)keyvals.nth(i+1)).val());
        }
    //System.err.println("Constant: " + m);
    return new ConstantExpr(m);
    }
    

    When the def expression is evaluated, it binds the value of the ConstantExpr created above, which, as noted, is a PersistentHashMap.
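
    The two paths described above can be modelled with a small, self-contained Java sketch. It is a toy, not Clojure's code: MapPathSketch, runtimePath, constantPath and fill are hypothetical names, LinkedHashMap and HashMap merely stand in for PersistentArrayMap and PersistentHashMap, and the threshold of 16 array slots (8 key/value pairs) is assumed to match PersistentArrayMap.HASHTABLE_THRESHOLD.

    ```java
    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Toy model only: hypothetical names, not Clojure's actual classes.
    final class MapPathSketch {
        // Assumption: mirrors PersistentArrayMap.HASHTABLE_THRESHOLD (array slots, so 8 pairs).
        static final int HASHTABLE_THRESHOLD = 16;

        // Copies alternating key/value slots into the target map.
        private static Map<Object, Object> fill(Map<Object, Object> m, Object[] kvs) {
            if (kvs.length % 2 != 0)
                throw new IllegalArgumentException("odd number of key/value slots");
            for (int i = 0; i < kvs.length; i += 2)
                m.put(kvs[i], kvs[i + 1]);
            return m;
        }

        // Models the emitted byte-code path (RT.map): small inputs get the array-style stand-in.
        static Map<Object, Object> runtimePath(Object... init) {
            if (init == null)
                return new LinkedHashMap<>();
            if (init.length <= HASHTABLE_THRESHOLD)
                return fill(new LinkedHashMap<>(), init);
            return fill(new HashMap<>(), init);
        }

        // Models the constant path (MapExpr.parse): always starts from an empty hash map,
        // so the size of the literal never matters.
        static Map<Object, Object> constantPath(Object... keyvals) {
            return fill(new HashMap<>(), keyvals);
        }

        public static void main(String[] args) {
            Map<Object, Object> viaRuntime = runtimePath("a", 1);
            Map<Object, Object> viaConstant = constantPath("a", 1);
            System.out.println(viaRuntime.getClass().getSimpleName());
            System.out.println(viaConstant.getClass().getSimpleName());
            System.out.println(viaRuntime.equals(viaConstant));
        }
    }
    ```

    Running main prints LinkedHashMap, HashMap and true: the same one-entry literal ends up in a different concrete class depending on the path, yet the two maps compare equal. The same holds in Clojure, where an array map and a hash map with the same entries are equal.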

    So why is it implemented this way?

    I don't know. It could be a simple oversight, or the PersistentArrayMap optimization may not be worth the effort for constants.