3x3 Matrix determinant function - making it faster

I'm writing a bigger program, and getting the determinants of 3x3 matrices as fast as possible is pretty important for it to work well. I've read that I could use NumPy to do it, but I thought that writing my own code would be more educational, as I'm in my 3rd semester of CompSci.

So I wrote two functions and I'm using time.clock() (I'm on a win7 machine) to time how long it takes for each function to return a value.

This is the first function:

def dete(a):
    # Rule of Sarrus on a plain 3x3 matrix: the "wrapped" rows 3 and 4 are rows 0 and 1
    x = (a[0][0] * a[1][1] * a[2][2]) + (a[1][0] * a[2][1] * a[0][2]) + (a[2][0] * a[0][1] * a[1][2])
    y = (a[0][2] * a[1][1] * a[2][0]) + (a[1][2] * a[2][1] * a[0][0]) + (a[2][2] * a[0][1] * a[1][0])
    return x - y

And this is the second function:

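def det(a):
    # Extend a copy with the first two rows so the caller's matrix is not modified
    a = a + a[:2]

    # Sum of the products along the three "down-right" diagonals
    x = 0
    for i in range(0, len(a)-2):
        y = 1
        for j in range(0, len(a)-2):
            y *= a[i+j][j]
        x += y

    # Sum of the products along the three "down-left" diagonals
    p = 0
    for i in range(0, len(a)-2):
        y = 1
        z = 0
        for j in range(2, -1, -1):
            y *= a[i+z][j]
            z += 1
        p += y
    return x - p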

They both give the correct answers; however, the first one seems to be slightly faster. Since for-loops are more elegant to use and usually faster, that makes me think I'm doing something wrong: I made the loops too slow and fat. I tried trimming them down, but it seems that the *= and += operations are taking too much time; there are too many of them. I haven't checked how fast NumPy takes care of this problem yet, but I want to get better at writing efficient code. Any ideas on how to make those loops faster?

Solution

Loops are more elegant and more generic, but they are not "usually faster" than a couple of inline multiplications in a single expression.

For one, a for loop in Python has to build the object you will iterate over (the call to range), and then call a method on its iterator for every item in the loop.
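To see that overhead for yourself, here is a rough timing sketch. It uses the standard timeit module rather than time.clock() (which was removed in Python 3.8). The names det_inline and det_loop and the sample matrix m are made up for this illustration, and the numbers will differ between machines and Python versions.

```python
import timeit

def det_inline(a):
    # Rule of Sarrus written out as two plain expressions, no loops
    x = a[0][0]*a[1][1]*a[2][2] + a[1][0]*a[2][1]*a[0][2] + a[2][0]*a[0][1]*a[1][2]
    y = a[0][2]*a[1][1]*a[2][0] + a[1][2]*a[2][1]*a[0][0] + a[2][2]*a[0][1]*a[1][0]
    return x - y

def det_loop(a):
    # Same rule with loops: every range() call and every iteration step
    # goes through the interpreter's iterator machinery
    total = 0
    for i in range(3):
        plus = 1
        minus = 1
        for j in range(3):
            plus *= a[(i + j) % 3][j]
            minus *= a[(i + j) % 3][2 - j]
        total += plus - minus
    return total

m = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]  # sample matrix, determinant is 18

t_inline = timeit.timeit(lambda: det_inline(m), number=100000)
t_loop = timeit.timeit(lambda: det_loop(m), number=100000)
print("inline: %.4f s, loop: %.4f s" % (t_inline, t_loop))
```

Both functions return the same value; only the time per call should differ.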

So, depending on what you are doing, if the inline form is speedy enough for you, keep it. If it is still too slow (as is usually the case when doing numeric computation in pure Python), you should use a numeric library (for example NumPy) that can compute determinants in native code. For numeric manipulation code like this, you can get it running hundreds of times faster using native code.
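As a minimal sketch of the library route, assuming NumPy is installed: numpy.linalg.det accepts a single square matrix or a stack of them with shape (N, 3, 3). The variable names and sample values are made up for illustration. The gain typically shows up when many matrices are handled in one call; for one small matrix at a time the call overhead can outweigh it, so measure before committing.

```python
import numpy as np

# One 3x3 matrix: the result is a floating-point number
m = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 1.0, 4.0]])
d = np.linalg.det(m)

# A stack of matrices, shape (3, 3, 3): one call returns all determinants
stack = np.stack([m, np.eye(3), 2.0 * np.eye(3)])
dets = np.linalg.det(stack)

print(d)
print(dets)
```

Because the results are floats, compare them with a tolerance (for example np.isclose) rather than with ==.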

If you need some numeric calculation that an existing library can't perform, and you need speed (for example, for pixel manipulation in image processing), you may prefer to write an extension that runs in native code (using C, Cython, or something similar) to make it fast.

On the other hand, if speed is not crucial, and you even noted that the inlined expression is just "slightly faster", just use the full loop: you get more readable and maintainable code, which are the main reasons for using Python after all.

On the specific example you gave, you can get some speed increase in the loop code by hardcoding the "range" calls as tuples, for example changing for i in range(0, len(a)-2): to for i in (0, 1, 2):. Note that, as in the inline case, you lose the ability to work with matrices of different sizes.
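Applied to the second function from the question, that change could look roughly like this. The name det_fixed3 is made up, and building the extended copy with a + a[:2] (instead of appending to the argument) is my own addition so the caller's list is left alone.

```python
def det_fixed3(a):
    # Extended copy: rows 0 and 1 repeated after row 2 (3x3 input only)
    a = a + a[:2]

    # Products along the three "down-right" diagonals
    x = 0
    for i in (0, 1, 2):
        y = 1
        for j in (0, 1, 2):
            y *= a[i + j][j]
        x += y

    # Products along the three "down-left" diagonals
    p = 0
    for i in (0, 1, 2):
        y = 1
        for j in (0, 1, 2):
            y *= a[i + j][2 - j]
        p += y
    return x - p

matrix = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det_fixed3(matrix))
```

The tuples are constants, so no range object has to be created on each call, but the function now only makes sense for 3x3 input.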