NVIDIA CUDA: Difference between Tesla T10 and Tesla M2090 processors

I have CUDA code that performs a finite difference computation. The code works well on Tesla M2090 processors with no errors. The same code produces many errors on a Tesla T10 processor: I am getting lots of zeros in my results.

Does anyone know the difference between these two architectures, and how to solve the problem?

Solution

Tesla C1060 (based on Tesla T10) has Compute Capability 1.3. Tesla M2090 is a much newer architecture, based on Fermi (Compute Capability 2.0). There may be two issues:

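Do you recompile your source for each target architecture? A binary compiled for the 2.0 architecture will not run on a 1.3 device such as the T10, and a binary compiled only for 1.3 will not run on devices of Compute Capability >= 2.0.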

CUDA Programming Guide, section 3.1.2, Binary Compatibility:

Binary compatibility is guaranteed from one minor revision to the next one, but not from one minor revision to the previous one or across major revisions.
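A failed kernel launch does not stop the program by itself, so output buffers can simply keep whatever they held before, which may be where the zeros come from. The sketch below is one hedged way to check this, not the asker's actual code: `cubin_can_run`, `fill_ones` and `launch_works` are hypothetical names introduced here, the helper models only the binary compatibility rule quoted above (it ignores embedded PTX, which the driver can JIT-compile for newer devices), and device 0 is assumed to be the GPU in question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Binary compatibility rule from the quote: a binary built for X.y runs on a
// device of capability X.z only if the major revision matches and z >= y.
// Embedded PTX (JIT-compiled by the driver) is deliberately not modeled here.
bool cubin_can_run(int built_major, int built_minor, int dev_major, int dev_minor)
{
    return built_major == dev_major && dev_minor >= built_minor;
}

// Trivial kernel, used only to test whether a launch succeeds at all.
__global__ void fill_ones(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 1.0f;
}

// Returns true if a kernel launch works on device 0; prints the reason if not.
bool launch_works()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("Device query failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    const int n = 256;
    float *d_out = NULL;
    err = cudaMalloc((void **)&d_out, n * sizeof(float));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return false;
    }

    fill_ones<<<1, n>>>(d_out, n);
    err = cudaGetLastError();           // launch errors, e.g. no usable binary
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();  // errors raised while the kernel ran
    cudaFree(d_out);

    if (err != cudaSuccess) {
        printf("Kernel failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

Calling `launch_works()` from `main` on the T10 machine should show whether kernels are launching at all. If they are not, building for both targets, for example with `nvcc -gencode arch=compute_13,code=sm_13 -gencode arch=compute_20,code=sm_20`, is the usual remedy, assuming a toolkit version that still supports both architectures.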

Also, Fermi behaves slightly differently. Some unsafe code might appear to work correctly on older architectures, while on Fermi the bug surfaces. If you suspect that, you can check the "Fermi Compatibility Guide" (available with the CUDA toolkit) to learn about the major differences between the architectures from the programmer's point of view.