
Software optimization guide for AArch64 Neon and SVE


There is an Arm software optimization guide (e.g., https://developer.arm.com/documentation/swog309707/latest for the Neoverse N1).

This guide doesn't seem to contain latency and throughput figures for Neon or SVE instructions. Is there a separate guide for Neon or SVE (e.g., one giving the instruction latency and throughput for the INSR (SIMD&FP scalar) instruction)?

A pointer would be very helpful!


Solution

  • The timings for Neon instructions are in that document, listed under ASIMD (which is Arm's more formal name for that instruction set). See Section 3.15 onward.

    There are no timings for SVE instructions because, as I understand it, the N1 simply doesn't support that extension. But if you look at the guide for some core that does support SVE, you'll see the timings included. For the Neoverse N2 they are from Section 3.26 onward.
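
    If it is unclear whether a given core supports SVE at all, one way to check on a Linux system is to look at the "Features" line of /proc/cpuinfo, where the kernel reports Neon as "asimd" and SVE as "sve". The following is only a minimal sketch under that assumption (Linux on AArch64); the function names cpu_features and simd_support are hypothetical helpers, not part of any Arm tool.

```python
from pathlib import Path


def cpu_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line (empty set if none)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def simd_support(cpuinfo_text):
    """Report whether the kernel advertises Neon (ASIMD) and SVE."""
    feats = cpu_features(cpuinfo_text)
    # Assumption: flag names follow the Linux arm64 hwcap naming.
    return {"neon": "asimd" in feats, "sve": "sve" in feats}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On non-Linux systems the file is absent; treat that as "no flags".
    text = path.read_text() if path.exists() else ""
    print(simd_support(text))
```

    On a core without SVE, such as the N1 as described above, the "sve" entry would be expected to come back False, which would be consistent with its guide listing only ASIMD timings.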