I am presently learning CUDA and I keep coming across phrases like
"GPUs have dedicated memory which has 5–10X the bandwidth of CPU memory"
See here for reference on the second slide
Now what does bandwidth really mean here? Specifically, what does one mean by each of the three bandwidth figures quoted there?
My very limited understanding of bandwidth is that it is the highest possible number of gigabytes that can be transferred per second from the CPU to the GPU. But that does not explain why we need to define three types of bandwidth.
There are three different memory buses in a current CPU/GPU system with a discrete GPU:

1. the CPU's memory bus, connecting the CPU to system (host) memory,
2. the GPU's memory bus, connecting the GPU to its dedicated (device) memory, and
3. the PCI Express bus connecting the CPU and the GPU.
Each of these buses has a physical bus width (in bits), a clock speed (how many times per second the data signals on the bus can change), and a bandwidth (aka throughput) in bits per second, which can be converted to gigabytes per second. The peak bandwidth is the bus width multiplied by the clock rate of the bus: for example, a 256-bit GPU memory bus with an effective data rate of 7 Gbit/s per pin gives 256/8 x 7 ≈ 224 GB/s peak. Achievable bandwidth must also take into account any overhead (e.g. PCI-e packet overhead), so measured numbers come in below the peak.
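To get a concrete feel for the gap between peak and achievable bandwidth on the third bus (PCIe), you can time a cudaMemcpy yourself. Below is a minimal sketch (the buffer size, iteration count, and use of pinned memory are my own choices for illustration, not anything from the slides). On a PCIe 3.0 x16 link the theoretical peak is roughly 16 GB/s; the measured rate will typically land somewhat below that because of packet and protocol overhead.

```cuda
// Minimal sketch: measure achieved host-to-device bandwidth by timing
// repeated cudaMemcpy calls with CUDA events. Error checking omitted.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256 * 1024 * 1024;   // 256 MiB test buffer (arbitrary)
    const int    iters = 20;                  // repeat to average out noise

    float *h_buf = nullptr, *d_buf = nullptr;
    cudaMallocHost(&h_buf, bytes);            // pinned host memory: avoids an extra staging copy
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);  // warm-up transfer

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // elapsed time in milliseconds

    double gb = (double)bytes * iters / 1e9;
    printf("Host-to-device bandwidth: %.1f GB/s\n", gb / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}
```

Running the same kind of timed copy device-to-device (or timing a simple kernel that streams through device memory) would instead exercise the GPU's own memory bus, which is where the quoted 5-10x advantage over CPU memory shows up.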