Search code examples
algorithmcompiler-constructionassociationspostfix-notationjvm-bytecode

How to ensure evaluation order and formal parameter order for named argument lists?


Working on a compiler, I am facing a problem related to named argument lists and evaluation order. Basically, the language ensures that for the following function and function call:

int call(int first, int second) = first - second

int sideEffect(int i) { println(i); return i }

println(call(second: sideEffect(0), first: sideEffect(1)))

The output will be

0
1
1

although the arguments are given in reverse order. I am working on the JVM, so they must be sorted on the bytecode level to match the signature of call. In stack-based pseudo-bytecode:

push 0
call sideEffect(int)
push 1
call sideEffect(int)
swap
call call(int, int)

As you can see, a swap bytecode instruction is required here to make sure they are both evaluated and passed in the right order. This can be drawn as a graph:

Actual:   second   first
                \ /
                swap
                / \
Formal:    first   second

There is only one swap instruction that can swap the top two elements on the stack, so for more elements the compiler needs to generate local variables.

Actual: third   second   first
          |       |  /¯¯¯¯
        local0   swap    local0
            ____/ |        |
Formal: first   second   third

In bytecode:

third...
store local0
second...
first...
swap
load local0
call ...

Of course this can be expanded to arbitrary amounts of actual and formal parameters.

What I am now looking for is some kind of algorithm that determines if and where I should insert these swap or local variable instructions. A reference to a compiler implementation would also be helpful.


Note that it is not part of my problem which actual argument belongs to which formal parameter. That is already solved by my compiler. To simplify, the problem could be looked at like this:

Given two arrays of the same sizes containing the same elements in different order, which elements can be swapped or shifted from the first array to get the order of the second one.


Solution

  • With registers only, there's basically only one solution. Iterate through the arguments in desired stack order. If the current argument is already computed, load it from a local. Otherwise, compute the uncomputed arguments before it in execution order, storing them in registers, and then compute the current argument.

    The presence of a swap operation means that we can use the top of the stack as a de facto local variable. Python code below.

    import heapq
    import itertools
    
    
    class Allocator:
    
      def __init__(self):
        self.free_heap = []
        self.next_free = 0
    
      def allocate(self):
        if self.free_heap:
          return heapq.heappop(self.free_heap)
        i = self.next_free
        self.next_free += 1
        return i
    
      def free(self, i):
        heapq.heappush(self.free_heap, i)
    
      def is_allocated(self, i):
        return i < self.next_free and i not in self.free_heap
    
    
    def loads_and_stores(perm):
      perm = list(perm)
      n = len(perm)
      assert set(perm) == set(range(n))
      location = {}
      allocator = Allocator()
      i = iter(perm)
      # "location 0" is the top of the stack
      for j in range(n):
        if j in location:
          if location[j] != 0:
            # not top of stack; need an actual load
            print 'load', location[j]
            if allocator.is_allocated(0):
              # need to maintain the top of the stack
              print 'swap'
          allocator.free(location[j])
        else:
          while True:
            k = next(i)
            print 'compute', k
            if k == j:
              if allocator.is_allocated(0):
                # need to maintain the top of the stack
                print 'swap'
              break
            location[k] = allocator.allocate()
            if location[k] != 0:
              # not top of stack; need an actual store
              print 'store', location[k]
      return perm
    
    
    def test(n):
      for perm in itertools.permutations(range(n)):
        print perm
        loads_and_stores(perm)
    
    
    test(4)
    

    There are some interesting optimizations to do here, because allocate can return any free local. An optimal algorithm would consider the execution cost of making each allocation.