Search code examples
javaparsingchomsky-normal-form

Creating a parse code for Chomsky Normal Form Expansion


I want to code this but I am stuck

So suppose we have a grammar

S→x|LR|LT
T→SR
L→(
R→)

Here is how the list would look after each loop:

0 steps: [ S ]
1 step: [ x, LR, LT ]
2 steps: [ (R, L), (T, LSR ] 
3 steps: [ (), (), (SR, (SR, LxR, LLRR, LLTR, LS) ]

and so on

Assume I want to check the string "(xx)" if it is in the grammar so I will do 2n-1 iteration which is 2x4-1=7 steps.

I am stuck as how to code to see the following:

Suppose I am on step 2. Now I want to expand LR. I loop over LR and I expand L to corrsponding RHS values which will be (R. This is done. Then I want to expand R in LR now I must use L and not ( so that I can achieve L). While looping how can I get L when my index moves to R?

Assume I am expanding S->LR the RHS rhs is a list of lists

for(int j=0;j<rhs.size();j++){//size of list
      //size of every inside list such as {LR} 
      for(int k=0;k<rhs.get(j).length();k++){

            //compare every variable with L and if matches right hand side RHS of L
            //then move to R 


          }

My question

When expanding nth term how to add remaining right hand terms to current expansion and how to add left hand terms of the current expansion.

Example: I am expaning LR. If L then (R so i must add R with ( . Then when I got (R i must again get back L and expand R so I get L) ..so My final expansion of L will be(L and R)

Thanks


Solution

  • As you traverse a string and expand each character by its language rules, you create a new string by expanding in place. You don't transform the original string.

    For instance, if you have LLTR, and you're expanding T, you can create a new string using substrings like [LL] + expand(T) + [R]

    One way to do this is by maintaining a prefix, an index character, and a postfix. As you expand at the index, just prepend the prefix and append the postfix. You'll likely want to create a new list of from the rhs list, rather than transforming rhs itself, otherwise you have to deal with maintaining the loop index over a list with a changing size.

    The following seems to work:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;
    
    
    
    class Main{
    
        public static List<String> expand(char c){
            switch(c){
                case 'S':
                    return Arrays.asList(new String[]{"x", "LR", "LT"});
                case 'T':
                    return Arrays.asList(new String[]{"SR"});
                case 'L':
                    return Arrays.asList(new String[]{"("});
                case 'R':
                    return Arrays.asList(new String[]{")"});
                default:
                    throw new RuntimeException("Value not in language");
            }
        }
    
        public static Boolean isTerminal(char c){
            return c == 'x' || c == '(' || c == ')';
        }
    
        // For all expansions of c, prepend pre and append post
        public static List<String> expandInPlace(String pre, char c, String post){
            return expand(c).stream().map(str -> pre + str + post).collect(Collectors.toList());
        }
    
        public static void main(String[] args) {
            // Value entered on command line is string to check
            String checkString = args[0];
            int iterations = 2 * checkString.length() - 1;
    
            // Start with root of language "S"
            List<String> rhs = Arrays.asList(new String[]{"S"});
            for(int i=0; i<iterations; i++){
                // nextRhs is new list created by expanding the current list along language rules
                List<String> nextRhs = new ArrayList<>();
                for(int j=0; j<rhs.size();  
                    String pre = "";
                    String post = rhs.get(j);
                    for(int k=0; k<rhs.get(j).length();k++){
                        // Treating pre and post like a double stack
                        // Popping the next character off the front of post
                        char expansionChar = post.charAt(0);
                        post = post.substring(1);
                        if(!isTerminal(expansionChar)){
                            nextRhs.addAll(expandInPlace(pre, expansionChar, post));
                        }
                        pre += expansionChar;
                    }
                }
                rhs = nextRhs;
                System.out.println(rhs);
                System.out.println();
            }
    
            if(rhs.contains(checkString)){
                System.out.println(checkString + " is in the language");
            } else {
                System.out.println(checkString + " is not in the language");
            }
    
        }
    }