Search code examples
schemelispracketeval

Scheme Lisp - Using eval for arbitrary arithmetic expressions


I'm trying to evaluate arbitrary and nested arithmetic expressions in scheme with eval. The expressions are build up from numbers, operators, constant bindings and parentheses. For example, (eval (+ 1 1)) or (eval (+ (* 1 2) 3))

The problem is, the expressions could also have unbalanced parentheses, like (eval ((+ 1 1)), and the reader in scheme would just wait when the expression is entered till all parentheses are closed and somehow matched.

I looked into quotation in scheme and by using a try-catch-mechanism in scheme (How to implement a try-catch block in scheme?) - but that didn't work. I'm also thinking about implementing in scheme my own version of an evaluator for arithmetic expressions, maybe with brackets as delimiters.

I search for a way to evaluate arbitrary expressions, and if the expression is not valid, to get an error message like 'not valid expression' from eval, and not have the reader waiting for closing parentheses.

Edit: See below for an implementation of a solution.


Solution

  • If, as you say, the expressions you are trying to parse may have unbalanced parens, then those expressions clearly live in some unstructured data which is, almost certainly a sequence of characters of some kind. Since the syntax of the expressions you want to read is a subset of the syntax of Scheme, then you can simply use the tool that the language provides for you: read: you definitely do not want to write your own parser for this unless you like spending a lot of time getting corner cases right.

    Note that any parser which turns a sequence of characters into some object must at some point ask for the next character from the sequence. There are two things that can happen in response to that:

    • it can get a character, possibly after waiting for something on the other end of that sequence – a human perhaps – to generate the character;
    • it can get an indication that there are no more characters – an end-of-file indicator of some kind.

    No magic you can do can avoid that. So the problem you describe of the reader waiting for the close-paren is not real: any parser you write will have the same problem, when reading from an interactive stream: it simply has to wait for the human on the other end of that stream to either type something, or to indicate that the stream is complete, at which point it can decide whether what it saw was well-formed or not.

    So the answer to the turning-sequences-of-characters-into-object part of your problem is to use the read provided by the language. Here is one approach to doing that – since you have specified that you are using Racket, this uses Racket, rather than RnRS Scheme for any n. Almost all of this code would not be needed if you did not care about restricting the reader somewhat (and I am not sure I have restricted it enough). The rest of it deals with handling exceptions from the reader, and turning them into multiple values.

    (define (read-thing source)
      ;; Read some object from a source port.
      ;; Return two values: either the object read and #t,
      ;; or some exception object and #f if something went wrong.
      ;;
      ;; This tries to defang the reader but I am not sure it does enough
      (with-handlers ([exn? (λ (e) (values e #f))])
        (call-with-default-reading-parameterization
         (thunk
          (parameterize ([read-accept-lang #f]
                         [read-accept-reader #f])
            (values (read source) #t))))))
    
    (define (read-thing-from-file f)
      ;; read a thing from a file
      (call-with-input-file f read-thing))
    
    (define (read-thing-from-string s)
      ;; read a thing from a string
      (call-with-input-string s read-thing))
    

    So now we can try this:

    > (read-thing-from-string "foo")
    'foo
    #t
    > (read-thing-from-string "(foo")
    (exn:fail:read:eof
     "read: expected a `)` to close `(`"
     #<continuation-mark-set>
     (list (srcloc 'string #f #f 1 1)))
    #f
    

    As you can see the second value tells you whether read raised an exception or not.

    Now you can use this thing-reader to feed an evaluator. For instance you can use a function like this:

    (define (read-and-evaluate reader source
                               evaluator environment)
      ;; Read something with a read-thing compatible reader from source
      ;; and hand it to evaluator with environment as a second argument
      ;; If read-thing indicates an error then simply raise it again.
      (call-with-values
       (thunk (reader source))
       (λ (thing good)
         (when (not good)
           (raise thing))
         (evaluator thing environment))))
    

    So now, the problem is reduced to writing evaluator, which I'm not going to do.

    Here is an example of this function being used with a trivial evaluator which simply returns the form it has been given.

    > (read-and-evaluate
       read-thing-from-string "(foo)"
       (λ (form env) form) #f)
    '(foo)
    > (read-and-evaluate
       read-thing-from-string "(foo"
       (λ (form env) form) #f)
    ; string::1: read: expected a `)` to close `(` [,bt for context]
    

    Of course, the whole handling-the-exception-in-the-reader thing is not needed here, since all I end up doing it reraising it, but it does show you how you can handle such an error.


    A note on writing an evaluator. It is tempting to say that, well, (+ 1 2) is a valid Scheme expression, and Scheme has eval so just use that. This is like using a thermonuclear device to demolish a house: it does a good job of demolishing the house but it has undesirable consequences such as killing hundreds of thousands of people. Consider, for instance, that (system "rm -rf $HOME") is also a valid Racket expression, but one you may not want to run.

    To avoid this kind of code-injection attack you want to restrict the evaluator as much as you possibly can, so it will evaluate only the language you are interested in. The same applies to the reader: above the full Racket reader can, I am almost sure, evaluate arbitrary Racket code at read-time: I've tried (but probably failed) to defang it above.

    Racket has a significant set of tools which let you define a more restricted language, for which eval would be safe. However for a really simple language such as evaluating arithmetic expressions it is almost certainly simpler to simply write your own evaluator. It certainly is more educational to do so.