What exactly are NVIDIA Tesla and CUDA?

I'm doing some research on GPGPU and am currently struggling with the question of what Tesla and CUDA really are. The paper "NVIDIA Tesla: A Unified Graphics and Computing Architecture" says that the Tesla architecture was introduced with the GeForce 8800. After further reading I was convinced that it is the overall architecture of NVIDIA graphics cards. Unfortunately this is not really the case: on http://www.nvidia.com/object/why-choose-tesla.html they explicitly separate GeForce, Quadro and Tesla. And how is all that related to CUDA? Is it just the extension for general-purpose computation on GPUs, supported by the hardware and programmed with CUDA C? What are concepts such as SM, SIMT, thread synchronization, shared memory and warps related to? CUDA? Tesla? Furthermore, http://nvidia.custhelp.com/app/answers/detail/a_id/2133/~/what-is-the-difference-between-tesla-and-cuda%3F points out that Tesla is a family of products designed for high-performance computing and that CUDA is only software. Can someone please clarify this?

Solution

Nvidia unfortunately uses the "Tesla" name for two different things:

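It is the code name for the GPU microarchitecture that was introduced with the GeForce 8800 and used through the GeForce 200 series.
It is also the marketing name under which Nvidia groups its high-performance computing product line, as opposed to GeForce (consumer) and Quadro (professional graphics).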

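CUDA is the name of the software platform (compiler, runtime, libraries and the CUDA C/C++ language extensions) that lets you run general-purpose computations on Nvidia GPUs. It is not tied to the Tesla product line; it works on GeForce and Quadro cards as well.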

SM, SIMT, thread synchronization, shared memory, warps etc. really refer to aspects of the hardware architecture that Nvidia GPUs have used ever since the GeForce 8800. However, you will come across them a lot when programming with CUDA, because the programming model closely mirrors these hardware aspects to get maximum performance.
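
To make the link between the hardware terms and CUDA more concrete, here is a minimal sketch of a block-wise sum. It is only an illustration, not code from any Nvidia sample: the names block_sum_kernel, gpu_sum and kThreads are made up for this answer, and it assumes a CUDA-capable GPU and a block size that is a power of two. Each thread block is scheduled onto one SM, its threads execute in warps of 32 (SIMT), they exchange data through shared memory, and __syncthreads() is the thread synchronization barrier.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Threads per block (assumption for this sketch): a power of two and a
// multiple of the 32-thread warp size.
constexpr int kThreads = 256;

// Each thread block runs on one SM and reduces kThreads elements to one value.
__global__ void block_sum_kernel(const float* in, float* out, int n) {
    __shared__ float tile[kThreads];           // shared memory: one copy per block
    int tid = threadIdx.x;                     // index of this thread in its block
    int gid = blockIdx.x * blockDim.x + tid;   // index of this thread in the grid

    tile[tid] = (gid < n) ? in[gid] : 0.0f;    // pad the last block with zeros
    __syncthreads();                           // wait until the whole tile is loaded

    // Tree reduction in shared memory; halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();                       // every thread in the block reaches this barrier
    }

    if (tid == 0) {
        out[blockIdx.x] = tile[0];             // one partial sum per block
    }
}

// Hypothetical host helper: sums `host_in` on the GPU and stores it in *result.
// Returns false if any CUDA call fails.
bool gpu_sum(const std::vector<float>& host_in, float* result) {
    int n = static_cast<int>(host_in.size());
    *result = 0.0f;
    if (n == 0) return true;

    int blocks = (n + kThreads - 1) / kThreads;
    float* d_in = nullptr;
    float* d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    cudaError_t err = cudaMemcpy(d_in, host_in.data(), n * sizeof(float),
                                 cudaMemcpyHostToDevice);
    std::vector<float> partial(blocks);
    if (err == cudaSuccess) {
        block_sum_kernel<<<blocks, kThreads>>>(d_in, d_out, n);
        // This copy waits for the kernel to finish and reports its errors too.
        err = cudaMemcpy(partial.data(), d_out, blocks * sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (float p : partial) *result += p;      // add the per-block sums on the CPU
    return true;
}
```

Compiled with nvcc, calling gpu_sum on a vector of 1000 ones should store 1000 in *result. The same code runs unchanged on a GeForce, Quadro or Tesla card, which is the sense in which CUDA is "only software" while the SM/warp/shared-memory concepts belong to the hardware.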