Arm has the FCVTNS (scalar) instruction, which is (emphasis added):
Floating-point Convert to Signed integer, rounding to nearest with ties to even (scalar).
A simple question: does x86 have an equivalent of Arm's FCVTNS (scalar)?
I've already quickly gone through the list of x86 instructions, but couldn't find what I'm looking for. There is the usual CVTTSS2SI, which rounds toward zero (when a conversion is inexact), which is not what I'm looking for.
The non-truncating cvtss2si uses the current rounding mode, which is usually nearest-even but can be changed (in the MXCSR, which is what fenv.h affects on x86-64). Same for packed conversions like cvtps2dq xmm,xmm.
The truncating versions exist because C specifies that (int)my_float uses truncation. With legacy x87 (before SSE3 fisttp), compilers had to change the x87 rounding mode to truncation and back around every conversion, which sucked a lot.
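To make the difference between the two conversion families concrete, here is a minimal sketch using the SSE intrinsics from xmmintrin.h. The function names convert_current_mode and convert_truncating are made up for illustration, and the comments about which instruction gets emitted describe what compilers normally do, not a guarantee.

```c
#include <xmmintrin.h>  /* SSE: _mm_set_ss, _mm_cvtss_si32, _mm_cvttss_si32 */

/* Hypothetical helper: converts using whatever rounding mode is
   currently set in MXCSR (normally compiles to cvtss2si). */
int convert_current_mode(float x) {
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* Hypothetical helper: always truncates toward zero, like a C cast
   (normally compiles to cvttss2si). */
int convert_truncating(float x) {
    return _mm_cvttss_si32(_mm_set_ss(x));
}
```

With the default MXCSR setting (round to nearest, ties to even), convert_current_mode(2.5f) gives 2 and convert_current_mode(3.5f) gives 4, which should match what FCVTNS produces for those inputs, while convert_truncating(2.9f) gives 2, the same as (int)2.9f.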
If you need round-to-nearest-even in code that will run with a different rounding mode in MXCSR, you could use AVX-512 vcvtss2si eax, xmm0, {rn-sae} (NASM syntax) to override the rounding mode for that instruction.
Without AVX-512, you could have different rounding modes set in MXCSR and the x87 control word, if you need different rounding in the same loop. A movss store / fld dword reload / fistp conversion to integer with the current x87 rounding mode is probably more efficient than ldmxcsr twice per iteration (from two saved values generated with stmxcsr), unless you unroll a lot.
(ldmxcsr is 4 uops on Skylake / Alder Lake, but only 1 on Zen. Its throughput is a bit lower than you'd expect from the uop count and execution ports, though. See https://uops.info/)
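For reference, the "ldmxcsr twice" approach that the x87 trick is being compared against could look roughly like this with intrinsics. This is only a sketch: cvt_nearest_even is a made-up name, and it assumes the compiler keeps the conversion between the two MXCSR writes. Compilers don't always model that dependency, so real code might need inline asm or other barriers.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_ROUND_MASK, _MM_ROUND_NEAREST */

/* Hypothetical helper: round to nearest-even regardless of the caller's
   MXCSR rounding mode, by switching modes around a single conversion. */
int cvt_nearest_even(float x) {
    unsigned int saved = _mm_getcsr();             /* stmxcsr: save current MXCSR */
    unsigned int nearest =
        (saved & ~(unsigned int)_MM_ROUND_MASK) | _MM_ROUND_NEAREST;
    _mm_setcsr(nearest);                           /* ldmxcsr: force nearest-even */
    int result = _mm_cvtss_si32(_mm_set_ss(x));    /* cvtss2si with that mode */
    _mm_setcsr(saved);                             /* ldmxcsr: restore caller's mode */
    return result;
}
```

In a loop you would hoist the stmxcsr and the computation of the two MXCSR values out of the loop body, leaving the two ldmxcsr per conversion that make this slower than it looks.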