Using SBCL's implementation of Common Lisp, is it somehow possible to present a REPL where usage of specific keywords is prohibited (effectively providing access to only a subset of Common Lisp's functionality)?
(defpackage :limited-repl
  (:use :cl :named-readtables)
  (:export #:within-sandbox
           #:limited-repl))
(in-package :limited-repl)
In this answer I define a within-sandbox macro and a limited-repl function, such that:
- The characters #\: and #\# are forbidden, to prevent the user from accessing symbols in other packages, and to avoid syntax that might create arrays, bit vectors, etc., as well as reader labels (#1= and #1#), which could be used to create circular structures that loop infinitely when evaluated.
- The empty list is not read as COMMON-LISP:NIL, in order to avoid introducing this symbol in the custom language; instead, it reads as zero.

I am using the named-readtables library, which is available in Quicklisp, in order to define a custom readtable. The same can be achieved with plain Common Lisp if necessary. Load the library first, since the defpackage form above uses the named-readtables package:
(ql:quickload :named-readtables)
Let's define a custom condition for forbidden characters, which is a bit cleaner than using (error "some string"):
(define-condition forbidden-character (error)
  ((character :initarg :character :reader forbidden-character.character)
   (stream :initarg :stream :reader forbidden-character.stream))
  (:report (lambda (condition stream)
             (format stream
                     "Forbidden character ~@c"
                     (forbidden-character.character condition)))))
Let's also define a reader function that signals an error for the character being read:
(defun forbidden-character (stream character)
  (error 'forbidden-character
         :character character
         :stream stream))
Here is the list reader that returns zero on an empty list:
(defun read-zero-list (stream character)
  (assert (char= character #\())
  (let ((list (read-delimited-list #\) stream t)))
    (etypecase list
      (null 0)
      (cons list))))
The limited-repl readtable is based on the standard readtable, except for two forbidden characters and a custom list reader:
(defreadtable limited-repl
  (:merge :standard)
  (:macro-char #\( 'read-zero-list)
  (:macro-char #\: 'forbidden-character nil)
  (:macro-char #\# 'forbidden-character nil))
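For readers who prefer not to depend on named-readtables, here is a rough sketch of the same readtable built with plain Common Lisp. The names plain-forbidden-character, plain-read-zero-list, *plain-limited-readtable* and read-limited-from-string are hypothetical, chosen so that they do not clash with the definitions in this answer, and the error is signalled with a plain string to keep the example self-contained.

```lisp
;; Sketch only: a plain-CL equivalent of the limited-repl readtable.
;; All names below are hypothetical and not part of the answer's code.

(defun plain-forbidden-character (stream character)
  (declare (ignore stream))
  (error "Forbidden character ~@c" character))

(defun plain-read-zero-list (stream character)
  (declare (ignore character))
  ;; Read forms up to the closing parenthesis; an empty list becomes 0.
  (let ((list (read-delimited-list #\) stream t)))
    (if (null list) 0 list)))

(defparameter *plain-limited-readtable*
  ;; (copy-readtable nil) returns a fresh copy of the standard readtable.
  (let ((readtable (copy-readtable nil)))
    (set-macro-character #\( 'plain-read-zero-list nil readtable)
    (set-macro-character #\: 'plain-forbidden-character nil readtable)
    (set-macro-character #\# 'plain-forbidden-character nil readtable)
    readtable))

(defun read-limited-from-string (string)
  "Read one form from STRING using the restricted readtable."
  (let ((*readtable* *plain-limited-readtable*)
        (*read-eval* nil))
    (read-from-string string)))

;; (read-limited-from-string "(1 () (2 ()))") ; first value: (1 0 (2 0))
```

Reading a string that contains #\# or #\: through read-limited-from-string signals an error instead of returning a form.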
The limited-language package defines the base language the user can access, here only basic arithmetic symbols and the quit symbol:
(defpackage limited-language
  (:use)
  (:export #:+
           #:-
           #:*
           #:/
           #:quit))
I also redefine the functions as wrappers around the CL functions, as follows:
(macrolet ((lift (s c) `(defun ,s (&rest args) (apply #',c args))))
  (lift limited-language:- cl:-)
  (lift limited-language:+ cl:+)
  (lift limited-language:* cl:*)
  (lift limited-language:/ cl:/))
This is necessary because just re-exporting the CL symbols could have unintended effects. For example, the user could access the variables * or /, which are bound in Lisp, and we do not want that here.
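To make the difference concrete, here is a small sketch using a hypothetical demo-language package (not part of the code in this answer). The wrapper symbol is distinct from cl:*, so it names a function but carries no variable binding.

```lisp
;; Sketch only: demo-language is a hypothetical package used for illustration.
(defpackage demo-language
  (:use)
  (:export #:*))

;; A wrapper function named by a fresh symbol, delegating to cl:*.
(defun demo-language:* (&rest args)
  (apply #'cl:* args))

;; The fresh symbol is not the CL symbol, so it has a function
;; definition but no value as a variable.
(list (eq 'demo-language:* 'cl:*)   ; NIL: two distinct symbols
      (fboundp 'demo-language:*)    ; true: the wrapper is defined
      (boundp 'demo-language:*))    ; NIL: no variable binding
```

Had the package re-exported cl:* instead, evaluating * in the sandbox would read the REPL variable maintained by the host Lisp.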
The following auxiliary function calls function in a context where *package* is bound to a fresh, temporary package that uses limited-language. I am using GENTEMP to generate a fresh symbol in a package-names package:
(defpackage package-names
  (:use))
(defun call-with-temporary-package (function)
  (let* ((symbol (gentemp "SANDBOX-" 'package-names))
         (package (make-package symbol :use '(limited-language))))
    (unwind-protect (let ((*package* package))
                      (funcall function))
      (delete-package package)
      (unintern symbol 'package-names))))
When no REPL is running, package-names is empty, as the generated symbols are uninterned on unwinding. The temporary package is also deleted. There might be a more robust way to define temporary names for packages, but this should be enough as a first draft.
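One possible alternative for naming, shown only as a sketch, is to probe for an unused package name with find-package rather than interning symbols in a helper package. The function make-fresh-package and its keyword parameters are hypothetical, and the check-then-create step is not safe if several threads create sandboxes at the same time.

```lisp
;; Sketch only: make-fresh-package is a hypothetical helper.
(defun make-fresh-package (&key (prefix "SANDBOX-") (use '()))
  "Create and return a package whose name is PREFIX followed by the
first counter value for which no package exists yet."
  (loop for counter from 0
        for name = (format nil "~a~d" prefix counter)
        unless (find-package name)
          return (make-package name :use use)))
```

The caller would still be responsible for calling delete-package on the result, for example in the cleanup forms of unwind-protect.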
The within-sandbox macro establishes the context where the readtable and the package are set to the ones we want. I also bind *read-eval* to nil, just to be sure no code is evaluated while read is performed:
(defmacro within-sandbox (&rest body)
  `(call-with-temporary-package
    (lambda ()
      (let ((*readtable* (find-readtable 'limited-repl))
            (*read-eval* nil))
        ,@body))))
Finally, the REPL is defined as follows, where quit is used to quit the REPL:
(defun limited-repl ()
  (within-sandbox
    (loop
      (format t "~&> ")
      (finish-output)
      (clear-input)
      (handler-case (let ((form (read)))
                      (when (eq form 'limited-language:quit)
                        (return))
                      (eval form))
        (cl:end-of-file ()
          (return))
        (:no-error (v)
          (print v))
        (error (e)
          (format *error-output* "~&Error: ~a~%" e))))))
CL-USER> (limited-repl:limited-repl)
> 5
5
> (+ () (* 4 3) (/ 7 9))
115/9
> #1=(list 0 . #1#)
Error: Forbidden character #\#
> *
Error: The variable * is unbound.
> (cl:in-package 'cl-user)
Error: Forbidden character #\:
> quit
NIL
CL-USER>
When limiting the symbols the user can access, you can rely on eval. However, this can still introduce CL-specific semantics into your language: for example, writing * alone reports an unbound variable * (maybe your language has no concept of variables). You would then need to write your own evaluate function, which can delegate to eval but can also do other things outside of Common Lisp.
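As a minimal sketch of what such a function could look like, the version below handles numbers and operator calls itself and rejects everything else, including bare symbols, so it does not delegate to eval at all. The *operators* table and this particular evaluate are hypothetical; in the REPL above, the keys would have to be the limited-language symbols rather than whatever + and friends name in the current package.

```lisp
;; Sketch only: *operators* and this EVALUATE are hypothetical.
(defparameter *operators*
  (list (cons '+ #'+) (cons '- #'-) (cons '* #'*) (cons '/ #'/))
  "Association list from operator symbols to the functions implementing them.")

(defun evaluate (form)
  "Evaluate FORM in a language made only of numbers and operator calls."
  (typecase form
    (number form)
    (cons
     (let ((function (cdr (assoc (first form) *operators*))))
       (unless function
         (error "Unknown operator: ~s" (first form)))
       ;; Arguments are evaluated recursively with the same rules.
       (apply function (mapcar #'evaluate (rest form)))))
    ;; Anything else, including a bare symbol such as *, is rejected.
    (t (error "Cannot evaluate: ~s" form))))

;; (evaluate '(+ 0 (* 4 3) (/ 7 9))) ; => 115/9
```

With this approach, the error for a lone * comes from the custom language ("Cannot evaluate") instead of leaking the notion of an unbound Lisp variable.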
There might be other things I failed to consider for the sandboxed environment (execution time, etc.), so be careful, but this should already be useful as-is.
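For execution time specifically, a rough SBCL-only sketch could wrap the call to eval in sb-ext:with-timeout, which signals sb-ext:timeout once the given number of seconds has elapsed. The function eval-with-time-limit and the :timed-out marker are hypothetical, and this relies on asynchronous interruption, which has its own caveats, so treat it as a starting point rather than a hardened solution.

```lisp
;; Sketch only, SBCL-specific: eval-with-time-limit is a hypothetical helper.
(defun eval-with-time-limit (form seconds)
  "Evaluate FORM, returning :TIMED-OUT if it runs longer than SECONDS."
  (handler-case (sb-ext:with-timeout seconds
                  (eval form))
    (sb-ext:timeout ()
      :timed-out)))
```

In the REPL above, this could replace the direct call to eval; a real implementation would probably report the timeout through a condition rather than a return value that a form could also produce.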