I want to scale my input values between 0 and 32767. For context, this program runs on an ARM Cortex-M0, and I would like to avoid any float operations. My input type is u32 and my output type is u16. I have this so far, where I explicitly check for overflow in the large multiplication; if there is no overflow, I simply multiply the numbers in 32 bits. Is this method of checking for overflow an efficient way of achieving the desired scaling?
uint16_t scaling(uint32_t in, uint32_t base)
{
    uint32_t new_base = INT16_MAX; // 32767
    if (in >= base) {
        return (uint16_t)new_base;
    }
    if (in != ((in * new_base) / new_base)) {
        // overflow, increase bit width!
        uint64_t tmp = in * (uint64_t)new_base;
        tmp /= base;
        return (uint16_t)(tmp);
    }
    // simply multiply
    uint32_t tmp = (in * new_base) / base;
    if (tmp > new_base) {
        tmp = new_base; // clamp
    }
    return (uint16_t)tmp;
}
For example, for a sample "large" input of (in=200000, base=860000), the expected output is 7620 (which this program does give back).
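As a quick sanity check (on a desktop host rather than the M0 target), here is a minimal harness exercising that example, assuming the scaling() definition above is in scope; the main() wrapper is only for illustration.

#include <stdint.h>
#include <stdio.h>

// assumes scaling() from above is defined in this translation unit
int main(void)
{
    // 200000 * 32767 = 6553400000, which exceeds UINT32_MAX, so the 64-bit branch runs;
    // 6553400000 / 860000 = 7620 with integer division
    printf("%u\n", (unsigned)scaling(200000u, 860000u)); // expected: 7620
    return 0;
}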
The following variation on OP's code precalculates the threshold where in * new_base overflows into 64-bit, and saves the final check if (tmp > new_base), which is redundant once in < base.
uint16_t scaling(uint32_t in, uint32_t base)
{
    static const uint32_t new_base = INT16_MAX; // 2^15 - 1 = 32767
    static const uint32_t in32 = (uint32_t)(((uint64_t)UINT32_MAX + 1) / INT16_MAX); // 2^32 / (2^15 - 1) = 131076
    if (in >= base) {
        return (uint16_t)new_base;
    }
    if (in < in32) {
        return (uint16_t)((in * new_base) / base);
    }
    return (uint16_t)(((uint64_t)in * new_base) / base);
}
As a side comment, at least one compiler (running with the default optimizations) will replace the multiplication in * new_base with the cheaper (in << 15) - in when new_base = INT16_MAX is declared const uint32_t, but not when it is declared as just uint32_t.
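For reference, that strength reduction relies on the identity in * 32767 == (in << 15) - in (modulo 2^32). Written out by hand, purely as a sketch of what the compiler emits rather than something that needs to appear in the source, it looks like this:

#include <stdint.h>

// multiply by 32767 = 2^15 - 1 using a shift and a subtract instead of a hardware multiply;
// for uint32_t this wraps exactly the same way as in * 32767u
static inline uint32_t mul_by_32767(uint32_t in)
{
    return (in << 15) - in;
}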