I am trying to create a UUID based on the md5 hash of a cell value in OpenRefine (using Jython), but I am having trouble passing the value to the function.
I am able to create a UUID using the expression:
import uuid;
return str(uuid.uuid4());
but I want to base it on the md5 hash of the cell's value, so I tried to follow the signature
uuid.uuid3(namespace, name)
However, I am unable to pass the value to the function. The attempt:
import uuid;
return str(uuid.uuid3(uuid.NAMESPACE_DNS, value));
produces the following error:
Error: Traceback (most recent call last):
  File "", line 3, in temp_448166737
  File "/Applications/OpenRefine 3.2b.app/Contents/Resources/webapp/extensions/jython/module/MOD-INF/lib/jython-standalone-2.7.1.jar/Lib/uuid.py", line 528, in uuid3
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
With a literal string in place of the cell's value, the expression works. The example
import uuid;
return str(uuid.uuid3(uuid.NAMESPACE_DNS, 'example'));
uses the string "example" and computes the same UUID, c5e5f349-28ef-3f5a-98d6-0b32ee4d1743, for every cell, which is not the desired result.
Any ideas on how to pass the value of the OpenRefine cell to Jython within an expression?
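The cell's value apparently reaches Jython as a unicode string, and in Jython 2.7 uuid3 concatenates it with the namespace's raw bytes, which forces an implicit ASCII decode of those bytes (0xa7 is the second byte of NAMESPACE_DNS). You just have to encode the unicode string in value with .encode('utf-8'), as explained here: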
import uuid
return str(uuid.uuid3(uuid.NAMESPACE_DNS, value.encode('utf-8')))
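To make it clearer why the encoding step matters, here is a minimal sketch of what uuid3 computes, written so that it should behave the same under Jython 2.7 and Python 3. The helper name uuid3_from_text is hypothetical (it is not part of the uuid module or of OpenRefine); it simply encodes the text to UTF-8 first so that the MD5 input is bytes plus bytes.

```python
import hashlib
import uuid


def uuid3_from_text(namespace, text):
    """Hypothetical helper: name-based (MD5) UUID built from a text value."""
    # Encode text to UTF-8 bytes first, so the concatenation below is
    # bytes + bytes and no implicit ASCII decode of namespace.bytes occurs.
    if not isinstance(text, bytes):
        text = text.encode('utf-8')
    # Version 3 UUIDs are the MD5 digest of namespace bytes followed by the name.
    digest = hashlib.md5(namespace.bytes + text).digest()
    # version=3 overwrites the version and variant bits of the 16-byte digest.
    return uuid.UUID(bytes=digest[:16], version=3)
```

Assuming the function is defined at the top of the OpenRefine expression, the last line would then be return str(uuid3_from_text(uuid.NAMESPACE_DNS, value)). For most cases the one-line .encode('utf-8') fix above is all that is needed; the sketch only spells out the steps.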