I have a question about char16_t character handling and SHA-256 hash generation with OpenSSL.
The thing is, I'm currently writing code that deals with password hashing. I've generated a 256-bit hash, and I want to store it in the database in a UTF-16 encoded character field. In my C++ code, I use char16_t
to store such data. However, there is a problem: char16_t
is only guaranteed to be at least 16 bits wide (its underlying type is uint_least16_t), so it can be larger on some machines. And if I use memcpy()
to copy bytes from my SHA-256 hash into a char16_t buffer, the result may be a mess on those machines.
What should I do in this situation? Read bytes differently, store hashes in the database differently, maybe something else?
SHA-256 generates 256 essentially random bits (32 bytes) of data, which will not in general be valid UTF-16.
You need to somehow encode the 32 bytes into more than 32 bytes of UTF-16 data to store in your database. Or you can change the database column to a proper 256-bit binary type.
One of the easier-to-implement ways to store it in your DB as a string would be to map each byte to a character 1-to-1 (and store 32 bytes of data with 32 bytes of zeroes in between):
#include <iterator> // std::size

unsigned char sha256_hash[256 / 8];
get_hash(sha256_hash); // fill with the 32-byte digest (your existing code)

// encoding: widen each byte to one char16_t code unit
char16_t db_data[256 / 8];
for (std::size_t i = 0; i < std::size(db_data); ++i) {
    db_data[i] = char16_t(sha256_hash[i]);
}
write_to_db(db_data);
#include <cassert>  // assert
#include <cstdint>  // std::uint16_t
#include <iterator> // std::size

char16_t db_data[256 / 8];
read_from_db(db_data);

// decoding: narrow each code unit back down to one byte
unsigned char sha256_hash[256 / 8];
for (std::size_t i = 0; i < std::size(sha256_hash); ++i) {
    assert((std::uint16_t)db_data[i] <= 0xFF); // every unit must fit in a byte
    sha256_hash[i] = (unsigned char)db_data[i];
}
Be careful if you are using null-terminated strings, though. You will need an extra character for the null terminator, and you will have to map the 0x00 byte to something else (0x100
would be a good choice, since it still fits in a char16_t).
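A minimal sketch of that remapping (the helper names here are mine, not part of any standard API):

```cpp
#include <cassert>

// Sketch: map the 0x00 byte to the code unit 0x100 so the stored string
// never contains an embedded null, and reverse the mapping on the way back.
char16_t encode_unit(unsigned char b) {
    return b == 0 ? char16_t(0x100) : char16_t(b);
}

unsigned char decode_unit(char16_t c) {
    return c == char16_t(0x100) ? (unsigned char)0 : (unsigned char)c;
}
```

Every value produced by encode_unit is in the range 0x01–0x100, so the string is still valid UTF-16 and can be safely null-terminated.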
But if you have additional requirements (such as the stored value being readable characters), you might consider base64 or hexadecimal encoding instead.
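For instance, hex encoding maps each byte to two ASCII characters, which are always valid, readable UTF-16 (at the cost of 64 code units instead of 32). A sketch, where the helper name to_hex_utf16 is hypothetical:

```cpp
#include <cassert>
#include <string>

// Sketch: hex-encode a 32-byte SHA-256 digest into a 64-character UTF-16
// string. Every output character is an ASCII digit or lowercase letter,
// so the result is unambiguously valid UTF-16.
std::u16string to_hex_utf16(const unsigned char (&hash)[32]) {
    static const char16_t digits[] = u"0123456789abcdef";
    std::u16string out;
    out.reserve(64);
    for (unsigned char byte : hash) {
        out.push_back(digits[byte >> 4]);   // high nibble
        out.push_back(digits[byte & 0x0F]); // low nibble
    }
    return out;
}
```

Decoding is the mirror image: read the string two characters at a time and reassemble each byte from its two nibbles.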