I am writing a C++ arbitrary-precision integer library as a homework assignment. I represent numbers internally as a vector of unsigned int, in base 10^n, where n is as large as possible while 10^n still fits into a single unsigned int digit (so n = 9 for a 32-bit unsigned int).
I made this choice as a trade-off between space, performance and digit access (it gives me much better performance than base 10, without adding any complexity when converting to a human-readable string).
So for example:
base10(441243123294967295) 18 digits
base1000000000(441243123,294967295) 2 digits (comma separated)
Internal representation with uint32
[00011010 01001100 11010101 11110011] [00010001 10010100 11010111 11111111]
To complete the homework I have to implement bit shifting and other bitwise operators. Does it make sense to implement shift for a number with such an internal representation?
Should I change to a base 2^n so that all bits of the internal representation are significant?
You can, but you do not have to: a left shift by one bit doubles the number regardless of the base you store it in, because shifting is defined on the value of the number, not on its representation. So you can implement shifts arithmetically (multiply or divide by 2 with carry propagation across limbs) without changing your base. Your implementation will have to decide on the trade-off: shifting becomes harder to implement in base 10^9, since you cannot simply shift each limb independently. On the other hand, printing in base 10 will remain simpler.
Another solution favoring the decimal system that you may consider is binary-coded decimal (BCD). Back in the day, hardware (e.g. the 6502, the CPU of the Apple II) included special instructions for adding bytes under the BCD interpretation. With this representation you have to apply a decimal correction after each binary addition, but it may be a worthwhile learning exercise.