I'm wondering if there is any fast bit twiddling trick to do a saturating conversion from a 64-bit unsigned value to a 32-bit unsigned value (it would be nice if it is generalized to other widths but that's the main width I care about). Most of the resources I've been able to find googling have been for saturating arithmetic operations.
A saturating conversion would take a 64-bit unsigned value, and return either the value unmodified as a 32-bit value or 2^32-1 if the input value is greater than 2^32-1. Note that this is not what the default C casting truncating behaviour does.
I can imagine doing something like:
But I don't know how to quickly generate the mask. I tried straightforward branching implementations in Godbolt to see if the compiler would generate a clever branchless implementation for me but no luck.
#include <stdint.h>
#include <limits.h>
// Type your code here, or load an example.
uint32_t square(uint64_t num) {
return num > UINT32_MAX ? UINT32_MAX : num;
}
Edit: my mistake, issue was godbolt not set to use optimizations
You don't need to do any fancy bit twiddling trick to do this. The following function should be enough for compilers to generate efficient code:
uint32_t saturate(uint64_t value) {
return value > UINT32_MAX ? UINT32_MAX : value;
}
This contains a conditional statement, but most common CPUs, like AMD/Intel and Arm ones, have conditional move instructions. So they will test the value for overflowing 32 bits, and based on the test they will replace it with UINT32_MAX
, otherwise leave it alone. For example, on 64-bit Arm processors this function will be compiled by GCC (to:
saturate:
mov x1, 4294967295
cmp x0, x1
csel x0, x0, x1, ls
ret
Note that you must enable compiler optimizations to get the above result.