My client is a Python programmer and I have created a C++ backend for him which includes license generation and checking. For additional safety, the Python front-end will also perform a validity check of the license.
The license generation and checking algorithm however is based on hashing methods which rely on the fact that an integer is of a fixed byte size and bit-shifting a value will not extend the integers byte count.
This is a simplified example code:
unsigned int HashString(const char* str) {
unsigned int hash = 3151;
while (*str != 0) {
hash = (hash << 3) + (*str << 2) * 3;
str++;
}
return hash;
}
How can this be translated to Python? The direct translation obviously yields a different result:
def hash_string(str):
hash = 3151
for c in str:
hash = (hash << 3) + (ord(c) << 2) * 3
return hash
For instance:
hash_string("foo bar spam") # 228667414299004
HashString("foo bar spam") // 3355459964
Edit: The same would also be necessary for PHP since the online shop should be able to generate valid licenses, too.
The issue here is that C's unsigned int
automatically rolls over when it goes past UINT_MAX
, while Python's int
just keeps getting bigger.
The easiest fix is just to correct at the end:
return hash % (1 << 32)
For very large strings, it maybe a little faster to mask after each operation, to avoid ending up with humongous int
values that are slow to work with. But for smaller strings, that will probably be slower, because the cost of calling %
12 times instead of 1 will easily outweigh the cost of dealing with a 48-bit int.
PHP may have the same problem, or a different one.
PHP's default integer type is a C long. On a 64-bit Unix platform, this is bigger than an unsigned int
, so you will have to use the same trick as on Python (either %
or &
, whichever makes more sense to you.)
But on a 32-bit Unix platform, or on Windows, this is the same size as unsigned int
but signed, which means you need a different trick. You can't actually represent, say, 4294967293
directly (try it, and you'll get -3
instead). You can use a GMP
or BCMath
integer instead of the default type (in which case it's basically the same as in Python), or you can just write custom code for printing, comparing, etc. that will treat that -3
as if it were 4294967293
.
Note that I'm just assuming that int
is 32 bits, and long
is either 32 or 64, because that happens to be true on every popular platform today. But the C standard only requires that int
be at least 16 bits long, and long
be at least 32 bits and no shorter than int
. If you need to deal with very old platforms where int
might be 16 bits (or 18!), or future platforms where it might be 64 or more, you have to adjust your code appropriately.