Search code examples
schemeracketgrammarbnf

How to use symbols and lists in scheme to process data?


I am a newbie in scheme, and I am in the process of writing a function that checks pairwise disjointess of rules (for the time being is incomplete), I used symbols and lists in order to represent the rues of the grammar. Uppercase symbol is a non-terminal in the grammar, and lowercase is a terminal. I am trying to check if a rule passes the pairwise disjointness test.

  1. I will basically check if a rule has only one unique terminal in it. if it is the case, that rule passes the pairwise disjointness test. In scheme, I am thinking to realize that by representing the terminal symbol in lower case. An example of that rule would be:

    '(A <= (a b c))
    
  2. I will then check the case of a rule that contains an or. like:

    '(A <= (a (OR (a b) (a c))))   
    
  3. Finally, I will check recursively for non terminals. A rule for that case would be

    '(A <= (B b c))
    

    However, What is keeping me stuck is how to use those symbols as data in order to be processed and recurse upon it. I thought about converting the symbols to strings, but that did not in case of having a list like that for example '(a b c) How can I do it? Here is what I reached so far:

    #lang racket
    (define grammar
       '(A <= (a A b))
        )
    
    (define (pairwise-disjoint lst)
      (print(symbol->string (car lst)))
      (print( cddr lst))
     )
    

Solution

  • Pairwise Disjoint

    As far as I know, the only way to check if a set is pairwise disjoint is to enumerate every possible pair and check for matches. Note that this does not follow the racket syntax, but the meaning should still be pretty clear.

    (define (contains-match? x lst)
      (cond ((null? x) #f)           ; Nothing to do
            ((null? lst) #f)         ; Finished walking full list
            ((eq? x (car lst)) #t)   ; Found a match, no need to go further
            (else
              (contains-match? x (cdr lst))))) ; recursive call to keep walking
    
    (define (pairwise-disjoint? lst)
      (if (null? lst) #f
        (let ((x (car lst))         ; let inner vars just for readability
              (tail (cdr lst)))
          (not
            ;; for each element, check against all later elements in the list
            (or (contains-match? x tail)
                (contains-match? (car tail) (cdr tail)))))))
    

    It's not clear to me what else you're trying to do, but this is the going to be the general method. Depending on your data, you may need to use a different (or even custom-made) check for equality, but this works as is for normal symbols:

    ]=> (pairwise-disjoint? '(a b c d e))
    ;Value: #t
    
    ]=> (pairwise-disjoint? '(a b c d e a))
    ;Value: #f
    

    Symbols & Data

    This section is based on what I perceive to be a pretty fundamental misunderstanding of scheme basics by OP, and some speculation about what their actual goal is. Please clarify the question if this next bit doesn't help you!

    However, What is keeping me stuck is how to use those symbols as data...

    In scheme, you can associate a symbol with whatever you want. In fact, the define keyword really just tells the interpreter "Whenever I say contains-match? (which is a symbol) I'm actually referring to this big set of instructions over there, so remember that." The interpreter remembers this by storing the symbol and the thing it refers to in a big table so that it can be found later.

    Whenever the interpreter runs into a symbol, it will look in its table to see if it knows what it actually means and substitute the real value, in this case a function.

    ]=> pairwise-disjoint?
    ;Value 2: #[compound-procedure 2 pairwise-disjoint?]
    

    We tell the interpreter to keep the symbol in place rather than substituting by using the quote operator, ' or (quote ...):

    ]=> 'pairwise-disjoint?
    ;Value: pairwise-disjoint?
    

    All that said, using define for your purposes is probably a really poor decision for all of the same reasons that global variables are generally bad.

    To hold the definitions of all your particular symbols important to the grammar, you're probably looking for something like a hash table where each symbol you know about is a key and its particulars are the associated value.

    And, if you want to pass around symbols, you really need to understand the quote and quasiquote.

    Once you have your definitions somewhere that you can find them, the only work that's left to you is writing something like I did above that is maybe a little more tailored to your particular situation.