How to implement DCT when the input image size is not a multiple of 8?

I learned that to compute the DCT of an image of size (H, W), one needs a transform matrix A of size (8, 8) and applies it to each (8, 8) region F of the image. That means if the image array is m, one processes m[:8, :8] first, then m[:8, 8:16], and so on, tile by tile.

How could I implement this DCT when the image size is not a multiple of 8? For example, an image of size (12, 12) cannot be split into whole (8, 8) windows, so how should the DCT be computed? I tried OpenCV and found that it can cope with this scenario, but I do not know how it is implemented.

Solution

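In the JPEG specification, the 8x8 blocks are grouped into a "Minimum Coded Unit" (MCU). An MCU is a single 8x8 block only when there is no chroma subsampling; with 4:2:0 subsampling, for example, it covers 16x16 pixels. Video codecs call the analogous unit a "macroblock".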

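Poorer implementations pad the partial blocks with zeroes, which puts an artificial sharp edge inside the block and can cause visible artifacts near the image border after compression.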

Better implementations pad by replicating the edge pixels: the last pixel of each row is repeated when padding to the right, and the last row is repeated when padding downwards.
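The two padding strategies can be compared with NumPy's `np.pad`. This is only a sketch: the helper name `pad_to_multiple` is made up for illustration, and it assumes a single-channel 2-D array.

```python
import numpy as np

def pad_to_multiple(img, block=8, mode="edge"):
    """Pad a 2-D array on the right and bottom up to a multiple of `block`.

    mode="edge" repeats the last column/row; mode="constant" pads with zeros.
    (Hypothetical helper, not taken from any particular codec.)
    """
    h, w = img.shape
    pad_h = (-h) % block  # rows to add at the bottom
    pad_w = (-w) % block  # columns to add on the right
    return np.pad(img, ((0, pad_h), (0, pad_w)), mode=mode)

img = np.arange(144).reshape(12, 12)
edge_padded = pad_to_multiple(img, mode="edge")      # shape (16, 16)
zero_padded = pad_to_multiple(img, mode="constant")  # shape (16, 16)
```

With `mode="edge"` the four added columns are copies of the last image column, so the block content continues smoothly; with `mode="constant"` they are zeros, which creates a step at the image border.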

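Note that only the right side and bottom of an image are padded, since the block grid starts at the top-left corner. The padded samples are cropped away again after decoding.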
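Putting it together, here is a minimal sketch of a blockwise 8x8 DCT with padding on the right and bottom, followed by the inverse transform and a crop back to the original size. The function names are hypothetical, the DCT matrix is the orthonormal DCT-II built directly with NumPy, and no quantization is done, so this is not a description of how OpenCV or any JPEG encoder is actually implemented.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix A, so that A @ F @ A.T is the 2-D DCT of F."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    a = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    a[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return a

def blockwise_dct(img, block=8):
    """Edge-pad on the right/bottom, then transform each block independently."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float64),
                    ((0, (-h) % block), (0, (-w) % block)), mode="edge")
    a = dct_matrix(block)
    coeffs = np.empty_like(padded)
    for r in range(0, padded.shape[0], block):
        for c in range(0, padded.shape[1], block):
            f = padded[r:r + block, c:c + block]
            coeffs[r:r + block, c:c + block] = a @ f @ a.T
    return coeffs

def blockwise_idct(coeffs, shape, block=8):
    """Invert each block, then crop back to the original (H, W)."""
    a = dct_matrix(block)
    out = np.empty_like(coeffs)
    for r in range(0, coeffs.shape[0], block):
        for c in range(0, coeffs.shape[1], block):
            g = coeffs[r:r + block, c:c + block]
            out[r:r + block, c:c + block] = a.T @ g @ a
    h, w = shape
    return out[:h, :w]

m = np.random.default_rng(0).integers(0, 256, size=(12, 12))
coeffs = blockwise_dct(m)                   # shape (16, 16): four 8x8 blocks
restored = blockwise_idct(coeffs, m.shape)  # shape (12, 12)
```

Because the matrix is orthonormal, the inverse is just `a.T @ g @ a`, and `restored` matches `m` up to floating-point error.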