x86 floating-point type-conversion rounding

Does x86 have equivalent of Arm FCVTNS (scalar)?

Arm has the FCVTNS (scalar) instruction, which is described as (emphasis added):

Floating-point Convert to Signed integer, rounding to nearest with ties to even (scalar).

A simple question: Does x86 have equivalent of Arm FCVTNS (scalar)?

I've already quickly gone through the list of x86 instructions but couldn't find what I'm looking for. There is the usual CVTTSS2SI, which rounds toward zero (when a conversion is inexact), but that is not what I'm looking for.

Solution

The non-truncating cvtss2si uses the current rounding mode, which is usually nearest-even but can be changed (in the MXCSR, which is what fenv.h affects on x86-64). Same for packed conversions like cvtps2dq xmm,xmm.
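As a rough illustration of the difference, here is a minimal C sketch using the SSE intrinsics that normally compile to cvtss2si and cvttss2si. It assumes an x86-64 compiler that provides <xmmintrin.h>; the helper names convert_current_mode and convert_truncate are made up for this example.

```c
#include <xmmintrin.h>  /* SSE: _mm_set_ss, _mm_cvtss_si32, _mm_cvttss_si32 */

/* Hypothetical helper: converts using whatever rounding mode is currently
   set in MXCSR (nearest-even by default). Normally compiles to cvtss2si. */
static int convert_current_mode(float f) {
    return _mm_cvtss_si32(_mm_set_ss(f));
}

/* Hypothetical helper: always truncates toward zero, regardless of MXCSR.
   Normally compiles to cvttss2si, the same as a plain (int)f cast. */
static int convert_truncate(float f) {
    return _mm_cvttss_si32(_mm_set_ss(f));
}
```

With the default rounding mode, ties go to the even integer: convert_current_mode(2.5f) gives 2 and convert_current_mode(3.5f) gives 4, while convert_truncate(3.5f) gives 3.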

The truncating versions exist because C specifies that (int)my_float uses truncation. With legacy x87 (before SSE3 fisttp), compilers had to change the x87 rounding mode to truncation and back around every conversion, which sucked a lot.

If you need round-to-nearest-even in code that will run with a different rounding mode in MXCSR, you could use AVX-512 vcvtss2si eax, xmm0, {rn-sae} (NASM syntax) to override the rounding mode for that instruction.

Without AVX-512, you could have different rounding modes set in MXCSR and the x87 control word, if you need different rounding in the same loop. (A movss store / fld dword reload / fistp conversion to integer with the current x87 rounding mode is probably more efficient than running ldmxcsr twice per iteration, from two saved values generated with stmxcsr, unless you unroll a lot.)
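For reference, the "ldmxcsr twice" approach might look something like the following C sketch. It assumes an x86-64 compiler with <xmmintrin.h>; convert_nearest_even is a hypothetical helper name, and compilers do not always treat MXCSR as a dependency of the conversion, so check the generated asm before relying on it.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_ROUND_MASK, _MM_ROUND_NEAREST */

/* Hypothetical helper: round-to-nearest-even conversion that works even when
   the caller has a different rounding mode set in MXCSR. */
static int convert_nearest_even(float f) {
    unsigned int saved = _mm_getcsr();                  /* stmxcsr: save caller's MXCSR */
    /* Clear the rounding-control bits and select nearest-even (ldmxcsr). */
    _mm_setcsr((saved & ~_MM_ROUND_MASK) | _MM_ROUND_NEAREST);
    int result = _mm_cvtss_si32(_mm_set_ss(f));         /* cvtss2si with nearest-even */
    _mm_setcsr(saved);                                  /* ldmxcsr: restore caller's mode */
    return result;
}
```

For example, after _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO), convert_nearest_even(3.5f) should still give 4, and the caller's toward-zero mode is left in place afterwards. In a real loop you would hoist the two MXCSR values out and reload them, rather than calling stmxcsr every time.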

(ldmxcsr is 4 uops on Skylake / Alder Lake, but only 1 on Zen. Its throughput is a bit lower than you'd expect from the uop count and execution ports, though. See https://uops.info/)