
Suspected SBCL Garbage Collection Bug for Hash Table


I'm using SBCL 1.4.5 on Ubuntu 18.04.

It seems that the garbage collector in SBCL is not properly freeing up memory bound to hash tables with symbol keys and values. Interestingly, when the keys and values are integers, garbage collection works fine.

For example, the following program works correctly:

(defun copy-hash-table (hash)
  (let ((h (make-hash-table
            :test (hash-table-test hash)
            :rehash-size (hash-table-rehash-size hash)
            :rehash-threshold (hash-table-rehash-threshold hash)
            :size (hash-table-size hash))))
    (loop for key being the hash-keys of hash
            using (hash-value value)
          do
             (setf (gethash key h) value)
          finally (return h))))

(defun make-integer-input ()
  (loop
    with hash1 = (make-hash-table) and hash2 =  (make-hash-table)
    for i from 0 to 500
    do
       (setf (gethash (random 100) hash1) (random 100))
       (setf (gethash (random 100) hash2) (random 100))
    finally
       (return (list hash1 hash2))))

(defun do-integer-work (hash1 hash2)
  (loop
    for i being the hash-keys of hash1
    for j being the hash-keys of hash2
    do
       (remhash i hash1)
       (setf (gethash i hash1) (random 100))
       (remhash j hash2)
       (setf (gethash j hash2) (random 100)))
  (values hash1 hash2))

(defun hash-worker (list-obj)
  (loop
    with next
    for i from 1 to 50000
    do
      (multiple-value-bind (new-hash1 new-hash2)
          (do-integer-work (copy-hash-table (first list-obj)) (copy-hash-table (second list-obj)))
        (setq next (list new-hash1 new-hash2))
        (if (> (random 100) 50)
            (setq list-obj next)))))

I ran this program by calling (hash-worker (make-integer-input)). The top-level function, hash-worker, takes a list of two hash tables and does work over a copy of them in do-integer-work. The helper function returns two modified hash tables, which are saved in new-hash1 and new-hash2. The system then randomly decides whether to keep the modified hash tables.

The do-integer-work helper function sequentially removes each key from the hash tables and reinserts it with a new random value.

When this program ran, I observed that memory consumption stayed essentially constant for its duration. That was not the case when I ran a sister program over hash tables with symbol keys and values:

(defun copy-hash-table (hash)
  (let ((h (make-hash-table
            :test (hash-table-test hash)
            :rehash-size (hash-table-rehash-size hash)
            :rehash-threshold (hash-table-rehash-threshold hash)
            :size (hash-table-size hash))))
    (loop for key being the hash-keys of hash
            using (hash-value value)
          do
             (setf (gethash key h) value)
          finally (return h))))

(defun make-symbol-input ()
  (loop
    with hash1 = (make-hash-table) and hash2 =  (make-hash-table)
    for i from 0 to 500
    do
       (setf (gethash (gentemp) hash1) (gentemp))
       (setf (gethash (gentemp) hash2) (gentemp))
    finally
    (return (list hash1 hash2))))

(defun do-symbol-work (hash1 hash2)
  (loop
    for i being the hash-keys of hash1
    for j being the hash-keys of hash2
    do
       (remhash i hash1)
       (setf (gethash i hash1) (gentemp))
       (remhash j hash2)
       (setf (gethash j hash2) (gentemp)))
  (values hash1 hash2))

(defun hash-worker (list-obj)
  (loop
    with next
    for i from 1 to 50000
    do
      (multiple-value-bind (new-hash1 new-hash2)
          (do-symbol-work (copy-hash-table (first list-obj)) (copy-hash-table (second list-obj)))
        (setq next (list new-hash1 new-hash2))
        (if (> (random 100) 50)
            (setq list-obj next)))))

I ran this program by calling (hash-worker (make-symbol-input)). The only difference is that the top-level function calls do-symbol-work. As this program ran, I observed the memory usage of the system steadily increase until my machine ran out of memory.

Is this a known bug in SBCL, and if so, is there a workaround for this?


Solution

  • You're using gentemp, which interns the symbols it creates. Interned symbols can't be garbage-collected because they are referenced by their package, so you are generating a huge number of interned symbols and exhausting memory. Use gensym instead, which creates uninterned symbols. There may still be GC bugs in code like this, but this isn't one.
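
    You can see the difference directly at the REPL: a gentemp symbol belongs to a package (and so stays reachable forever), while a gensym symbol has no home package and can be collected once nothing else refers to it. A minimal sketch of the fix is to replace each gentemp call with gensym; this assumes nothing else needs to find these symbols by name later.

    ```lisp
    ;; gentemp interns its symbol in the current package, so the package
    ;; keeps it alive; gensym returns an uninterned symbol with no package.
    (symbol-package (gentemp))  ; => some package, e.g. COMMON-LISP-USER
    (symbol-package (gensym))   ; => NIL

    ;; The same do-symbol-work, with gensym swapped in for gentemp.
    ;; The fresh symbols are now reachable only through the hash tables,
    ;; so copies discarded by hash-worker can be reclaimed by the GC.
    (defun do-symbol-work (hash1 hash2)
      (loop
        for i being the hash-keys of hash1
        for j being the hash-keys of hash2
        do
           (remhash i hash1)
           (setf (gethash i hash1) (gensym))   ; was (gentemp)
           (remhash j hash2)
           (setf (gethash j hash2) (gensym)))  ; was (gentemp)
      (values hash1 hash2))
    ```

    The same substitution applies in make-symbol-input. Note also that gentemp is deprecated in ANSI Common Lisp, so gensym is the idiomatic choice whenever you just need a fresh, unique symbol.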