Search code examples
macroscommon-lispsbcl

How can I modify the #+ and #- readtable macros in Lisp?


Short version:

I want to change the #+ and #- reader macros to apply to all immediately subsequent tokens starting with #@, in addition to the following token. Therefore, the following code...

#+somefeature
#@someattribute1
#@someattribute2
(defun ...)

...would, in the absence of somefeature, result in no code.


Long version:

I have written my own readtable-macros which apply transformations to subsequent code. For example:

#@traced
(defun ...)

This yields a function that writes its arguments and return values to a file, for debugging.

This fails, however, when used in conjunction with the #+ reader macro:

#+somefeature
#@traced
(defun ...)

In the absence of somefeature, the function continues to be defined, albeit without the #@traced modification. This is obviously not the desired outcome.

One possible solution would be to use progn, as follows:

#+somefeature
(progn
  #@traced
  (defun ...))

But that's kind of ugly.

I would like to modify the #+ and #- reader macros, such that they may consume more than one token. Something like this:

(defun conditional-syntax-reader (stream subchar arg)
  ; If the conditional fails, consume subsequent tokens while they
  ; start with #@, then consume the next token.
)
(setf *readtable* (copy-readtable))
(set-dispatch-macro-character #\# #\+ #'conditional-syntax-reader)
(set-dispatch-macro-character #\# #\- #'conditional-syntax-reader)

The problem is, I don't know how to "delegate" to the original reader macros; and I don't understand enough about how they were implemented to re-implement them myself in their entirety.

A naive approach would be:

(defun consume-tokens-recursively (stream)
  (let ((token (read stream t nil t)))
    (when (string= "#@" (subseq (symbol-string token) 0 2))
      (consume-tokens-recursively stream))))    ; recurse
(defun conditional-syntax-reader (stream subchar arg)
  (unless (member (read stream t nil t) *features*)
    (consume-tokens-recursively stream)))

However, I'm given to believe that this wouldn't be sufficient:

The #+ syntax operates by first reading the feature specification and then skipping over the form if the feature is false. This skipping of a form is a bit tricky because of the possibility of user-defined macro characters and side effects caused by the #. and #, constructions. It is accomplished by binding the variable read-suppress to a non-nil value and then calling the read function.

This seems to imply that I can just let ((*read-suppress* t)) when using read to solve the issue. Is that right?


EDIT 1

Upon further analysis, it seems the problem is caused by not knowing how many tokens to consume. Consider the following attributes:

  • #@export expects one argument: the (defun ...) to export.

  • #@traced expects two arguments: the debug level and the (defun ...) to trace.

Example:

#+somefeature
#@export
#@traced 3
(defun ...)

It turns out that #+ and #- are capable of suppressing all these tokens; but there is a huge problem!

When under a suppressing #+ or #-, (read) returns NIL!

Example:

(defun annotation-syntax-reader (stream subchar arg)
 (case (read stream t nil t)
  ('export
   (let ((defun-form   (read stream t nil t)))))
    ; do something
  ('traced
   (let* ((debug-level (read stream t nil t))
          (defun-form  (read stream t nil t)))))))
    ; do something

(setf *readtable* (copy-readtable))
(set-dispatch-macro-character #\# #\@ #'annotation-syntax-reader)

#+(or) #@traced 3 (defun ...)

The #@traced token is being suppressed by the #+. In this situation, all the (read) calls in (annotation-syntax-reader) consume real tokens but return NIL!

Therefore, the traced token is consumed, but the case fails. No additional tokens are thus consumed; and control leaves the scope of the #+.

The (defun ...) clause is executed as normal, and the function comes into being. Clearly not the desired outcome.


Solution

  • The standard readtable

    Changing the macros for #+ and #- is a bit excessive solution I think, but in any case remember to not actually change the standard readtable (as you did, but its important to repeat in the answer)

    The consequences are undefined if an attempt is made to modify the standard readtable. To achieve the effect of altering or extending standard syntax, a copy of the standard readtable can be created; see the function copy-readtable.

    §2.1.1.2 The Standard Readtable

    Now, maybe I'm missing something (please give us a hint about how your reader macro is defined if so), but I think it is possible to avoid that and write your custom macros in a way that works for your use case.

    Reader macro

    Let's define a simple macro as follows:

    CL-USER> (defun my-reader (stream char)
               (declare (ignore char))
               (let ((name (read stream)
                     (form (read stream))
               (unless *read-suppress*
                 `(with-decoration ,name ,form)))
    MY-READER
    

    [NB: This was edited to take into account *read-suppress*: the code always read two forms, but returns nil in case it is being ignored. In the comments you say that you may need to read an indefinite number of forms based on the name of the decoration, but with *read-suppress* the recursive calls to read return nil for symbols, so you don't know which decoration is being applied. In that case it might be better to wrap some arguments in a literal list, or parse the stream manually (read-char, etc.). Also, since you are using a dispatching macro, maybe you can add a numerical argument if you want the decoration to be applied to more than one form (#2@inline), but that could be a bad idea when later the decorated code is being modified.]

    Here the reader does a minimal job, namely build a form that is intended to be macroexpanded later. I don't even need to define with-decoration for now, as I'm interested in the read step. The intent is to read the next token (presumably a symbol that indicates what decoration is being applied, and a form to decorate).

    I'm binding this macro to a unused character:

    CL-USER> (set-macro-character #\§ 'my-reader)
    T
    

    Here when I test the macro it wraps the following form:

    CL-USER> (read-from-string "§test (defun)")
    (WITH-DECORATION TEST (DEFUN))
    13 (4 bits, #xD, #o15, #b1101)
    

    And here it works with a preceding QUOTE too, the apostrophe reader grabs the next form, which recursively reads two forms:

    CL-USER> '§test (defun)
    (WITH-DECORATION TEST (DEFUN))
    

    Likewise, a conditional reader macro will ignore all the next lines:

    CL-USER> #+(or) t
    ; No values
    
    CL-USER> #+(or) §test (defun)
    ; No values
    
    CL-USER> #+(or) §one §two §three (defun)
    ; No values
    

    Decoration macro

    If you use this syntax, you'll have nested decorated forms:

    CL-USER> '§one §two (defun test ())
    (WITH-DECORATION ONE (WITH-DECORATION TWO (DEFUN TEST ())))
    

    With respect to defun in toplevel positions, you can arrange for your macros to unwrap the nesting (not completely tested, there might be bugs):

    (defun unwrap-decorations (form stack)
      (etypecase form
        (cons (destructuring-bind (head . tail) form
                (case head
                  (with-decoration (destructuring-bind (token form) tail
                    (unwrap-decorations form (cons token stack))))
                  (t `(with-decorations ,(reverse stack) ,form)))))))
    
    CL-USER> (unwrap-decorations ** nil)
    (WITH-DECORATIONS (ONE TWO) (DEFUN TEST ()))
    

    And in turn, with-decorations might know about DEFUN forms and how to annotate them as necessary.

    For the moment, our original macro is only the following (it needs more error checking):

    (defmacro with-decoration (&whole whole &rest args)
      (unwrap-decorations whole nil))
    

    For the sake of our example, let's define a generic annotation mechanism:

    CL-USER> (defgeneric expand-decoration (type name rest))
    #<STANDARD-GENERIC-FUNCTION COMMON-LISP-USER::EXPAND-DECORATION (0)>
    

    It is used in with-decorations to dispatch on an appropriate expander for each decoration. Keep in mind that all the efforts here are to keep defun in a top-level positions (under a progn), a recursive annotation would let evaluation happens (in the case of defun, it would result in the name of the function being defined), and the annotation could be done on the result.

    The main macro is then here, with a kind of fold (reduce) mechanism where the forms are decorated using the resulting expansion so far. This allows for expanders to place code before or after the main form (or do other fancy things):

    (defmacro with-decorations ((&rest decorations) form)
      (etypecase form
        (cons (destructuring-bind (head . tail) form
                (ecase head
                  (defun (destructuring-bind (name args . body) tail
                           `(progn
                              ,@(loop
                                  for b = `((defun ,name ,args ,@body)) then forms
                                  for d in decorations
                                  for forms = (expand-decoration d name b)
                                  finally (return forms))))))))))
    

    (nb. here above we only care about defun but the loop should probably be done outside of the dispatching thing, along with a way to indicate to expander methods that a function is being expanded; well, it could be better)

    Say, for example, you want to declare a function as inline, then the declaration must happen before (so that the compiler can know the source code must be kept):

    (defmethod expand-decoration ((_ (eql 'inline)) name rest)
      `((declaim (inline ,name)) ,@rest))
    

    Likewise, if you want to export the name of the function being defined, you can export it after the function is defined (order is not really important here):

    (defmethod expand-decoration ((_ (eql 'export)) name rest)
      `(,@rest (export ',name)))
    

    The resulting code allows you to have a single (progn ...) form with a defun in toplevel position:

    CL-USER> (macroexpand '§inline §export (defun my-test-fn () "hello"))
    (PROGN
     (DECLAIM (INLINE MY-TEST-FN))
     (DEFUN MY-TEST-FN () "hello")
     (EXPORT 'MY-TEST-FN))