arm neon instruction-set

Why is a vdiv instruction generated with NEON flags?


I disassembled an ARM binary that was compiled with NEON flags:

-mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -ftree-vectorize

The dump shows a vdiv.f64 instruction generated by the compiler. According to the ARM manual for ARMv7 (Cortex-A9), the NEON SIMD ISA does not support the vdiv instruction, but the floating-point (VFP) engine does. Why is this instruction generated? Is it a floating-point instruction that will be executed by the VFP? Both NEON and VFP support floating-point addition and multiplication, so how can I differentiate them from each other?


Solution

  • In the case of Cortex-A9, the NEON FPU option also implements VFP; it is a superset of the cut-down 16-register VFP-only FPU option.

    More generally, the architecture does not allow implementing floating-point Advanced SIMD without also implementing at least single-precision VFP, therefore GCC's -mfpu=neon implies VFPv3 as well. It is permissible to implement integer-only Advanced SIMD without any floating-point capability at all, but I'm not sure GCC can support that (or that anyone's ever built such a thing).

    The actual VFP and Advanced SIMD variants of instructions are unambiguous from the syntax: anything operating on double-precision data (i.e. <op>.F64) is obviously VFP, as ARMv7 Advanced SIMD doesn't support double-precision. Single-precision operations (i.e. <op>.F32) operating on 32-bit s registers are scalar, thus VFP; if they're operating on larger 64-bit d or 128-bit q registers, then they are handling multiple 32-bit values at once, thus are vectorised Advanced SIMD instructions.
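
The rule in the last paragraph can be sketched as a small helper for scanning a disassembly dump. This is only an illustration of that rule, not a real disassembler: the function name `classify` is hypothetical, it looks only at `.f32`/`.f64` type suffixes and at s/d/q register names, and it returns "unknown" for anything else (loads, stores, integer operations and so on).

```python
import re

def classify(insn: str) -> str:
    """Classify one ARMv7 disassembly line as 'VFP', 'Advanced SIMD' or 'unknown'.

    Hypothetical helper: it applies only the syntax rule described above.
    """
    parts = insn.strip().lower().split(None, 1)
    if not parts:
        return "unknown"
    mnemonic = parts[0]
    operands = parts[1] if len(parts) > 1 else ""

    # Type suffixes follow the mnemonic, e.g. "vcvt.f64.f32" -> ["f64", "f32"].
    suffixes = mnemonic.split(".")[1:]

    # Any double-precision operand type means VFP on ARMv7.
    if "f64" in suffixes:
        return "VFP"

    if "f32" in suffixes:
        # Collect the register class letters (s, d or q) used as operands.
        regs = re.findall(r"\b([sdq])\d+\b", operands)
        if not regs:
            return "unknown"
        # Only 32-bit s registers: scalar, so VFP.
        if all(r == "s" for r in regs):
            return "VFP"
        # 64-bit d or 128-bit q registers hold several floats: Advanced SIMD.
        return "Advanced SIMD"

    return "unknown"


if __name__ == "__main__":
    for line in ("vdiv.f64 d16, d17, d18",
                 "vadd.f32 s0, s1, s2",
                 "vadd.f32 q8, q8, q9"):
        print(f"{line:28} -> {classify(line)}")
```

Applied to the question, `vdiv.f64` falls into the first branch, so it is a VFP instruction.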