Is it OK to use an encrypted/hashed e-commerce customer email as the Google Analytics User ID?
I found different privacy policy sections about the use of PII in Google Analytics. For example, here it says it is OK to use the encrypted or hashed form of the data. But here, in the caution section, it says we are not allowed to use PII data. I will be using the Measurement Protocol and GTM to send the data to Google Analytics.
If I use a proper level of encryption and hashing, will it be OK to use the customer email address (in hashed, encrypted form) as the User ID in Google Analytics?
Regards, Lina
Yes, it is OK to use SHA-256-hashed PII, as the policy section you pointed out allows. Cryptographic hash functions such as SHA-256 are one-way functions: from the output you can't compute the input, so the email itself is never sent to Google Analytics. One caveat: a hash can still be attacked by guessing. Someone who already has a list of candidate emails can hash each one and compare the results (this is how passwords get cracked, and it is especially cheap with weaker algorithms such as MD5). If that worries you, mix a secret salt or key into the hash before sending it. For the purpose of keeping the raw PII out of Google Analytics, though, hashing does its job.
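As a rough sketch of the hashing step, here is how the User ID could be derived in Python with the standard library. The function name `hash_email`, the `secret_key` parameter, and the trim-and-lowercase normalization are my own assumptions, not something Google Analytics prescribes. The keyed variant uses HMAC-SHA256 rather than plain SHA-256, which is one common way to add a secret so that outsiders can't confirm guessed emails.

```python
import hashlib
import hmac


def hash_email(email: str, secret_key: bytes = b"") -> str:
    """Return a 64-character hex digest to use as a User ID (hypothetical helper).

    The email is trimmed and lowercased first (an assumption) so that
    "Lina@Example.com " and "lina@example.com" map to the same ID.
    """
    normalized = email.strip().lower().encode("utf-8")
    if secret_key:
        # Keyed hash: without the key, hashing a guessed email gives a different value.
        return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    # Plain, unkeyed SHA-256 of the normalized email.
    return hashlib.sha256(normalized).hexdigest()


if __name__ == "__main__":
    print(hash_email("lina@example.com"))
    print(hash_email("lina@example.com", secret_key=b"keep-this-out-of-source-control"))
```

The same key has to be used everywhere the ID is generated (server-side Measurement Protocol hits and whatever feeds GTM), otherwise the same customer ends up with two different User IDs.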
The only theoretical downside of using a hash as the User ID is collision: two different emails could produce the same SHA-256 hash and thus the same User ID, in which case different users would be incorrectly identified as the same user. SHA-256 has 2^256 possible outputs, so by the birthday bound the chance of any collision among n emails is roughly n^2 / 2^257, which is negligible for any realistic number of customers. If you still want to reduce the chance further, you could combine the hash with other attributes, eg {user_signup_timestamp}-{email_hash}, but the only way to rule out collisions entirely is to rely on a database ID for each user, as the DB will ensure each User ID is unique.
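To put a number on the collision risk, here is a small sketch using the standard birthday-bound approximation (number of user pairs divided by the number of possible hash values). The function name `collision_probability` is hypothetical, and the result is an upper-bound estimate that assumes SHA-256 outputs behave like uniformly random values.

```python
from fractions import Fraction


def collision_probability(num_users: int, hash_bits: int = 256) -> Fraction:
    """Approximate chance that any two of num_users share a hash value.

    Birthday bound: (number of distinct pairs) / (number of possible hashes).
    """
    pairs = num_users * (num_users - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


if __name__ == "__main__":
    # One billion customers, far more than most shops will ever have.
    print(float(collision_probability(10**9)))
```

For a billion users this comes out on the order of 10^-60, so in practice a collision is not something you need to design around.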