Search code examples
phpabstract-syntax-treerpnshunting-yardpostfix-notation

Apply distributive law on AST (or RPN) => disjunctive normal form


I have expressions like the following:

{1000} AND ({1001} OR {1002} OR {1003})

Allowed operators are OR and AND, expressions can be nested using parenthesis. I already managed to tokenize this string and to convert it to an abstract syntax tree (AST) using the Shunting Yard algorithm, implemented in PHP 5.3. The above expression results in the following:

1000 1001 1002 | 1003 | &


    &
  /   \
1000   |
      / \
     |   1003
    / \
1001  1002

When traversing this tree I want to output the final combinations of numbers a user can choose from. In the given representation this is not possible. What I need is actually the form, after the distributive law was applied:

(1000 & 1001) | (1000 & 1002) | (1000 & 1003)

1000 1001 & 1000 1002 & | 1000 1003 & |

               _______________|_____________
              /                             \
      _______|____                           &
     /            \                         / \
    &              &                    1000   1003
  /   \           / \
1000   1001    1000  1002

I concluded, that the only nodes that are allowed to be &-operator nodes, are the last ones that carry the leafs. All others have to be |-operator nodes.

How to convert an arbitrary AST with the grammar explained above to one that represents all final permutations? Is it better to apply the distributive law on the tokens of the infix representation? Is it easier to work with the RPN representation instead of the tree?

Please also note, that there are more difficult examples possible like:

(1000 & 1008) & (1001 | 1002 | 1003)
1000 1008 & 1001 1002 | 1003 | &
       ______ & ___
      /            \
     &             |
    / \           / \
1000   1008      |   1003
                / \
            1001  1002

Which I'd like to result in:

(1000 & 1008 & 1001) | (1000 & 1008 & 1002) | (1000 & 1008 & 1003)
1000 1008 & 1001 & 1000 1008 & 1002 & | 1000 1008 & 1003 & |

                        __________________|_________
                       /                            \
         _____________|_________                     &
        /                       \                   / \
       &                        &                  &   1003
      /  \                     / \                / \
     &    1001                &   1002        1000   1008
    / \                      / \
1000   1008              1000   1008

For another (more complicated) example just switch left sub tree and right sub tree or add another &-node in place of 1003 => 1003 1009 &

What I already tried: Googling a lot, traversing the tree pre and post order, trying to find an algorithm with no success.

I am grateful for any hints and pointers into the right direction.


Solution

  • Thanks for mentioning the keyword which helped me the most: Disjunctive normal form. I was not aware of actually looking for this transformation.

    I could not find a detailed algorithm description on the internet so I tried to do it by myself. This is how I have done it in pseudo code. Please tell me, if it's not comprehensible.

    - Traverse the AST recursively post order wise
    - If an &-node is found, check if one of the children nodes is a |-node
    - Set orChild and andChild accordingly
    - Traverse the orChild-tree iterative pre order wise and for each OR-leaf push a new &-node with andChild and the OR-leaf value to the stack
    - If you meet another &-node push a new &-node with andChild and the whole &-node you found to the stack
    - After traversing is done, combine the nodes on the stack using an |-node
    - The new sub tree, which has an |-node as root, replaces the &-node you started to traverse from
    - As the outer traversal is post order, the newly created nodes are not traversed and have no effect on further changes
    - Repeat the whole process until the resulting tree does not change anymore