Search code examples
pythoncartesian-productgray-codeconstant-time

Cartesian product in Gray code order : including affected set in this order?


Having an excellent solution to: Cartesian product in Gray code order with itertools?, is there a way to add something simple to this solution to also report the set (its index) that underwent the change in going from one element to the next of the Cartesian product in Gray code order? That is, a gray_code_product_with_change(['a','b','c'], [0,1], ['x','y']) which would produce something like:

(('a',0,'x'), -1)
(('a',0,'y'), 2)
(('a',1,'y'), 1)
(('a',1,'x'), 2)
(('b',1,'x'), 0)
(('b',1,'y'), 2)
(('b',0,'y'), 1)
(('b',0,'x'), 2)
(('c',0,'x'), 0)
(('c',0,'y'), 2)
(('c',1,'y'), 1)
(('c',1,'x'), 2)

I want to avoid taking the "difference" between consecutive tuples, but to have constant-time updates --- hence the Gray code order thing to begin with. One solution could be to write an index_changed iterator, i.e., index_changed(3,2,2) would return the sequence -1,2,1,2,0,2,1,2,0,2,1,2 that I want, but can something even simpler be added to the solution above to achieve the same result?


Solution

  • There are several things wrong with this question, but I'll keep it like this, rather than only making it worse by turning it into a "chameleon question"

    Indeed, why even ask for the elements of the Cartesian product in Gray code order, when you have this "index changed" sequence? So I suppose what I was really looking for was efficient computation of this sequence. So I ended up implementing the above-mentioned gray_code_product_with_change, which takes a base set of sets, e.g., ['a','b','c'], [0,1], ['x','y'], computing this "index changed" sequence, and updating this base set of sets as it moves through the sequence. Since the implementation ended up being more interesting than I thought, I figured I would share, should someone find it useful:

    (Disclaimer: probably not the most pythonic code, rather almost C-like)

    def gray_code_product_with_change(*args, repeat=1) :
    
        sets = args * repeat
        s = [len(x) - 1 for x in sets]
        n = len(s)
    
        # setup parity array and first combination
        p = n * [True] # True: move foward (False: move backward)
        c = n * [0] # inital combo: all 0's (first element of each set)
    
        # emit the first combination
        yield tuple(sets[i][x] for i, x in enumerate(c))
    
        # incrementally update combination in Gray code order
        has_next = True
        while has_next :
    
            # look for the smallest index to increment/decrement
            has_next = False
            for j in range(n-1,-1,-1) :
    
                if p[j] : # currently moving forward..
    
                    if c[j] < s[j] :
                        c[j] += 1
                        has_next = True
    
                        # emit affected set (forward direction)
                        yield j
    
                else : # ..moving backward
    
                    if c[j] > 0 :
                        c[j] -= 1
                        has_next = True
    
                        # emit affected set (reverse direction)
                        yield -j
    
                # we did manage to increment/decrement at position j..
                if has_next :
    
                    # emit the combination
                    yield tuple(sets[i][x] for i, x in enumerate(c))
    
                    for q in range(n-1,j,-1) : # cascade
                        p[q] = not p[q]
    
                    break
    

    Trying to tease out as much performance as I could in just computing this sequence --- since the number of elements in the Cartesian product of a set of sets grows exponentially with the number of sets (of size 2 or more) --- I implemented this in C. What it essentially does, is implement the above-mentioned index_changed (using a slightly different notation):

    (Disclaimer: there is much room for optimization here)

    void gray_code_sequence(int s[], int n) {
    
      // set up parity array
      int p[n];
      for(int i = 0; i < n; ++i) {
        p[i] = 1; // 1: move forward, (1: move backward)
      }
    
      // initialize / emit first combination
      int c[n];
      printf("(");
      for(int i = 0; i < n-1; ++i) {
        c[i] = 0; // initial combo: all 0s (first element of each set)
        printf("%d, ", c[i]); // emit the first combination    
      }
      c[n-1] = 0;
      printf("%d)\n", c[n-1]);
    
      int has_next = 1;
      while(has_next) {
    
        // look for the smallest index to increment/decrement
        has_next = 0;
        for(int j = n-1; j >= 0; --j) {
    
          if(p[j] > 0) { // currently moving forward..
    
            if(c[j] < s[j]) {
              c[j] += 1;
              has_next = 1;
    
              printf("%d\n", j);
            }
          }
    
          else { // ..moving backward
    
            if(c[j] > 0) {
              c[j] -= 1;
              has_next = 1;
    
              printf("%d\n", -j);
            }
          }
    
          if(has_next) {
    
            for(int q = n-1; q > j; --q) {
              p[q] = -1 * p[q]; // cascade
            }
    
            break;
          }
        }
      }
    }
    

    When compared to the above python (where the yielding of the elements of the Cartesian product is suppressed, and only the elements of the sequence are yielded, so that the output is essentially the same, for a fair comparison), this C implementation seems to be about 15 times as fast, asymptotically.

    Again this C code could be highly optimized (the irony that python code is so C-like being well-noted), for example, this parity array could stored in a single int type, performing bit shift >> operations, etc., so I bet that even a 30 or 40x speedup could be achieved.