Search code examples
common-lispansi-common-lisp

Given an alist of constant values, what is the most efficient lookup?


You have a list of key-value pairs known at compile time. We don't want to lookup at runtime, but would rather generate a lookup table. For example, we could do this (from a piece of code I wrote yesterday):

(defun mod-mask (char)
  (case char
    (#\A #.(ash 1 22))
    (#\s #.(ash 1 23))
    (#\H #.(ash 1 24))
    (#\S #.(ash 1 25))
    (#\C #.(ash 1 26))
    (#\M #.(ash 1 27))))

(defun mask-mod (mask)
  (case mask
    (#.(ash 1 22) #\A)
    (#.(ash 1 23) #\s)
    (#.(ash 1 24) #\H)
    (#.(ash 1 25) #\S)
    (#.(ash 1 26) #\C)
    (#.(ash 1 27) #\M)))

But you prefer not to repeat yourself, and that tidy, simple code that everyone understands what it does, is just really boring!

Joke aside, the above example is short, but one could have lots of options, a really large number. We don't want to type much, because it is error prone and tedious. Also, a list has an advantage of lookup elsewhere: for example we can see which elements are defined, which we can't get out of a compiled case form. List can also be runtime generated, as well as code, unlike manually typed case statements.

Instead of the above, we make a list:

(defvar *modifier-masks*
  `((#\A . ,(ash 1 22))
    (#\s . ,(ash 1 23))
    (#\H . ,(ash 1 24))
    (#\S . ,(ash 1 25))
    (#\C . ,(ash 1 26))
    (#\M . ,(ash 1 27))))

And create a macro to generate the above two cases:

(defmacro mod-lookup-gen (&optional bit)
  `(lambda (mask)
     (case mask
       ,@(mapcar (lambda (e)
                   (if bit
                       `(,(car e) ,(cdr e))
                       `(,(cdr e) ,(car e))))
          *modifier-masks*))))

Which we can use in simple wrappers:

(defun mask-mod (mask)
  (let ((get-char (mod-lookup-gen)))
    (funcall get-char mask)))

(defun mod-mask (modifier)
  (let ((get-char (mod-lookup-gen 'bit)))
    (funcall get-char modifier)))

The question: what would be a better option, and more efficient compile-time generated code, equivalent to the above two defuns? Is there a way to get rid of the lambda wrapping the case-statement to skip the extra function call? Other than perhaps inlining it?

In general, what is the most efficient way to lookup key-value pairs in Common Lisp? Are "switch" expressions the fastest way to lookup values, since the compiler can lay out the static code for lookup at compile time? I guess it depends on the implementation, but perhaps it is possible to give some general guidance?

Also note that difference between a hashmap and an alist is that an alist, and consequently a switch statement generated from it, care about the order of statements, i.e. they are matched in the order of declarations, while hashmap is an unordered lookup. That may or may not be important, depending on the use-case.

As a side note, this is "enum" idiom. I have seen some CL libraries for generating enums, but I haven't used any, so I don't know what kind of code they generate. Perhaps someone can recommend one that generates good compile-time lookups?


Solution

  • The answer to this question is: write code and measure it. It is (perhaps) clear that in the limiting case of very many keys something based on a hashtable will be faster than something based on case. But where that breakpoint is depends on fine details of the implementation. And of course a sufficiently aggressive compiler could always turn case into a hashtable if it wants to.

    For your example I wrote the code below, and I found, on one implementation that the case implementation was a few times faster than the hashtable one with the alist one being somewhere in between. On another implementation the case implementation was fastest, the hashtable one slower and the alist one slowest of all.

    You have to measure if you want to know things about performance.

    (eval-when (:compile-toplevel :load-toplevel :execute)
      (defvar *modifier-masks*
        `((#\A . ,(ash 1 22))
          (#\s . ,(ash 1 23))
          (#\H . ,(ash 1 24))
          (#\S . ,(ash 1 25))
          (#\C . ,(ash 1 26))
          (#\M . ,(ash 1 27)))))
    
    (declaim (inline modifier-case-map mask-case-map))
    
    (defun modifier-case-map (c)
      (declare (type character c))
      (macrolet ((cmap (v)
                   `(ecase ,v
                      ,@(mapcar (lambda (m)
                                  `(,(car m) ,(cdr m)))
                                *modifier-masks*))))
        (cmap c)))
    
    (defun mask-case-map (m)
      (declare (type fixnum m))
      (macrolet ((mmap (v)
                   `(ecase ,v
                      ,@(mapcar (lambda (m)
                                  `(,(cdr m) ,(car m)))
                                *modifier-masks*))))
        (mmap m)))
    
    (declaim (inline modifier-hash-map mask-hash-map))
    
    (defun modifier-hash-map (c)
      (declare (type character c))
      (or (gethash c (load-time-value (let ((h (make-hash-table
                                                :size (length *modifier-masks*))))
                                        (dolist (m *modifier-masks* h)
                                          (setf (gethash (car m) h) (cdr m))))))
          (error "not present")))
    
    (defun mask-hash-map (m)
      (declare (type fixnum m))             ;?
      (or (gethash m (load-time-value (let ((h (make-hash-table
                                                :size (length *modifier-masks*))))
                                        (dolist (m *modifier-masks* h)
                                          (setf (gethash (cdr m) h) (car m))))))
          (error "not present")))
    
    (declaim (inline modifier-alist-map mask-alist-map))
    
    (defun modifier-alist-map (c)
      (declare (type character c))
      (or (cdr (assoc c (load-time-value *modifier-masks* t)))
          (error "not present")))
    
    (defun mask-alist-map (m)
      (declare (type fixnum m))             ;?
      (or (gethash m (load-time-value *modifier-masks* t))
          (error "not present")))
    
    ;;; All the setfery is to try to make sure things do not get optimized
    ;;; away
    
    (defun tsc (n)
      (let ((r nil))
        (dotimes (i n r)
          (dolist (c (load-time-value (mapcar #'car *modifier-masks*)))
            (setf r (modifier-case-map c))))))
    
    (defun tsh (n)
      (let ((r nil))
        (dotimes (i n r)
          (dolist (c (load-time-value (mapcar #'car *modifier-masks*)))
            (setf r (modifier-hash-map c))))))
      
    (defun tsa (n)
      (let ((r nil))
        (dotimes (i n r)
          (dolist (c (load-time-value (mapcar #'car *modifier-masks*)))
            (setf r (modifier-alist-map c))))))