Search code examples
expressionelispconstant-expression

How can I tell if an elisp expression is pure and constant?


At some point, Emacs added the pure symbol property, indicating which functions are known to be pure (see here). Is there a way to use this property to determine if an entire expression is constant and side-effect-free? For example, it's pretty easy to determine that (car '(1 2 3)) is a constant expression, since car is a pure function (i.e. (get 'car 'pure) returns t) and '(1 2 3) is a quoted form. However, there are more complex expressions, e.g. involving special forms, which are still constant:

(let ((a (+ 1 2))
      (b (- 5 5)))
  (if (> b 0)
      a
    (- a)))

Notably, neither let nor if is marked as pure, but this is still a constant expression evaluating to -3.

Is there any way I can take an arbitrary expression and determine whether it is constant? I know there is unsafep, which appears to implement the necessary expression tree walking, but it is evaluating a different criterion than "constant", so I can't use it directly.

  • It's fine to assume that none of the definitions of standard Elisp functions, macros, or special forms (car, let, if, etc.) have been modified.
  • For my purposes, "constant" is defined using equal. So a function that returns a new list on each call with the same contents every time would be considered constant.
  • I know there are some constant expressions that involve impure functions, e.g. (nreverse '(1 2 3)). I don't mind if the algorithm misses such expressions.

In case it matters, the reason I want to do this is that I'm implementing an elisp macro where constant expressions occurring in a certain context are syntactically valid, but their return value will be ignored, making them pointless, and likely the result of a programmer misunderstanding. So I want to issue a warning if the macro sees a constant expression in this context.


Solution

  • Is there any way I can take an arbitrary expression and determine whether it is constant?

    No.


    First, whenever lisp interpreter sees a symbol it's interned in obarray (object array) if it's never seen before. So in a very strict sense, most functions cannot be side-effect free.

    Also, by definition some operations has side effects. For example, IO has side effects as it reads from stdin or writes to stdout/stderr.

    So we need to limit the definition of side-effect-free to not changing external states. For example, the following must not be side-effect-free:

    ;; If a function modifies one of its input argument
    (defvar foo nil)
    (funcall (lambda () (setq foo t)))
    
    ;; If a function modifies a free variable. Here `free variable` means any non-constant ;; symbol from the enclosing environment
    (funcall (lambda (x) (setq x t)) foo)
    (funcall (lambda (x) (makunbound x)) 'foo)
    
    ;; If a function initialize a variable in the enclosing environment
    (funcall (lambda (x) (defvar bar nil)))
    

    Second, pure function must be side-effect-free AND returns the same output given same input.

    constant in this context doesn't mean a symbol has an invariable value. It means the value of sexp can be known at compile time. This is related to, but not equal, referential transparency. If a sexp is constant, the byte compiler can simply replace the sexp by its value.

    That is, (declare (pure t)) tells the compiler that, if all input arguments of this function are constant (they're known at compile time), the output can be computed at compile time.


    To determine whether a function is side-effect-free, you just need to check the symbol property side-effect-free or pure of the symbol. Or you can use:

    (defun side-effect-free-p (function)
      (and (symbolp function)
           (or (function-get function 'pure)
               (funciton-get function 'side-effect-free))))
    

    Similarly,

    (defun pure-p (function)
      (and (symbolp fucntion)
           (function-get function 'pure)))
    

    The vast majority of functions are neither side-effect-free nor pure, but they can be side-effect-free or pure in specific invocation. For example,

    ;; `let` - pure and side-effect free
    (let ((foo nil))
      nil)
    
    ;; `let` - neither pure nor side-effect free
    (defvar bar t)
    (let ((foo (setq bar nil))) ; modified outside state, bar
      nil)
    
    ;; `let` - not pure but side-effect free
    (let ((foo (current-time))) ; the first argument to let, not constant
      foo)
    (let ()           ; empty local binding specification
      (current-time)) ; the second argument to let, it's not constant