
Can you eagerly load all data into memory with Datomic?


I've previously asked this question about how to improve performance with Datomic, but I have yet to find a good solution. One thing that struck me was that when I used the Datomic Console to execute the query, it seemed MUCH faster. But I also noticed a great increase in startup time and memory consumption when using the Datomic Console compared to when I start my application standalone. This suggests to me that the Datomic Console pulls all data into memory before I explore the contents.

  1. Am I right that this is the case?
  2. If so, is this something I could do myself programmatically from a peer?
  3. If (2) then how can this be done in Clojure?

Solution

  • As described here in the Datomic documentation, the Peer Library loads index segments into the (in-process) Object Cache when it fetches them for querying.

    1. Am I right that this is the case?

    I doubt that the Datomic Console explicitly chooses to pull all datoms into memory, but it is possible that it eagerly traverses a large chunk of your data in order to show its dashboard.

    2. If so, is this something I could do myself programmatically from a peer?

    Well, I guess you could always artificially scan through all the segments. One easy way to do this is via the Datoms API.

    3. If (2) then how can this be done in Clojure?
    (require '[datomic.api :as d])
    ;; Walk all four indexes, realizing every datom so that each index
    ;; segment gets pulled through the Peer's Object Cache.
    (defn scan-whole-db [db]
      (doseq [index [:eavt :aevt :avet :vaet]]
        (dorun (seq (d/datoms db index)))))
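
    For illustration, a usage sketch: connect as a Peer, take a current database value, and run the scan once at startup before issuing queries. The connection URI below is a placeholder, not something from your setup.

    (def conn (d/connect "datomic:dev://localhost:4334/my-db")) ; hypothetical URI
    (scan-whole-db (d/db conn))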
    

    That all being said, I'm not at all sure you should expect performance improvements from this strategy. Your Object Cache had better be large enough!
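
    If you do try it, it may be worth measuring whether the warm-up actually buys you anything. A minimal sketch, reusing the conn from above with a placeholder query:

    (def my-query '[:find (count ?e) :where [?e :db/ident]])
    (time (d/q my-query (d/db conn)))  ; cold run
    (scan-whole-db (d/db conn))        ; warm the Object Cache
    (time (d/q my-query (d/db conn)))  ; warm run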