c++ arm simd armv8 sve

ARMv8 with Scalable Vector Extension (SVE)


I came across the point that ARMv8 now supports variable-length vector registers, from 128 bits to 2048 bits, through the Scalable Vector Extension (SVE). A wider register is always good for data-level parallelism. But on what basis should we select a register size between 128 and 2048 bits to achieve maximum performance?

For example, I want to do Sobel filtering with a 3x3 mask on a 1920x1080 image. What register width do I need to select?


Solution

  • The Scalable Vector Extension (SVE) is an extension to the A64 instruction set for the AArch64 execution state. It is aimed at high-performance computing rather than media processing; for media, you have NEON.

    The register width is decided by the hardware designer/manufacturer, depending on what that implementation is trying to do. The possible vector lengths, in bits, are the multiples of 128 from 128 to 2048: 128 256 384 512 640 768 896 1024 1152 1280 1408 1536 1664 1792 1920 2048.

    From the programmer's point of view, the programming model is Vector Length Agnostic, meaning that the same application will work on implementations with different register widths (vector lengths).

    The specification is out; however, at the time of writing there is no hardware available that implements SVE. For the time being, you can use the ARM Instruction Emulator (armie) to run your programs.

    So, to answer your question: unless you are manufacturing hardware, you need not select any specific vector length, as it will vary from one implementation to another. If you are testing with armie, you can select whichever length you want.
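To make the Vector Length Agnostic idea concrete, here is a minimal sketch of a per-row pixel operation written with the SVE ACLE intrinsics from `<arm_sve.h>`. It is only an illustration, not a full Sobel filter: `abs_diff_row` and `iterations_per_row` are hypothetical helper names, and the scalar fallback exists only so that the same file also builds on a compiler without SVE support. With GCC or Clang, the SVE path is assumed to be enabled by a flag such as `-march=armv8-a+sve`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

// Hypothetical helper: out[i] = |a[i] - b[i]| for two rows of 8-bit pixels.
// The loop never names a vector width; it asks the hardware via svcntb().
std::vector<std::uint8_t> abs_diff_row(const std::vector<std::uint8_t>& a,
                                       const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::vector<std::uint8_t> out(a.size());
    const std::int64_t n = static_cast<std::int64_t>(a.size());
#if defined(__ARM_FEATURE_SVE)
    // svcntb() = number of 8-bit lanes in one vector on this implementation.
    for (std::int64_t i = 0; i < n; i += static_cast<std::int64_t>(svcntb())) {
        // Predicate is true for lanes with i + lane < n, so the tail of the
        // row is handled by the same loop body without a scalar remainder.
        svbool_t pg = svwhilelt_b8(i, n);
        svuint8_t va = svld1(pg, a.data() + i);
        svuint8_t vb = svld1(pg, b.data() + i);
        svst1(pg, out.data() + i, svabd_x(pg, va, vb));
    }
#else
    // Scalar fallback with the same result, used when SVE is not available.
    for (std::int64_t i = 0; i < n; ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] > b[i] ? a[i] - b[i]
                                                       : b[i] - a[i]);
    }
#endif
    return out;
}

// Hypothetical helper: loop iterations needed for one row of 8-bit pixels
// at a given vector length in bits (the last iteration may be partial).
constexpr std::size_t iterations_per_row(std::size_t pixels,
                                         std::size_t vector_bits) {
    return (pixels + vector_bits / 8 - 1) / (vector_bits / 8);
}
```

For a 1920-pixel row of 8-bit pixels, `iterations_per_row(1920, 128)` is 120 and `iterations_per_row(1920, 2048)` is 8 (seven full vectors plus one half-filled, predicated one). The source code is identical in both cases; only the implementation's vector length changes how much work each iteration does.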