I just found a bug where I'm calling
MyJob.perform_later(request.body.read)
with an Active Job job backed by Sidekiq.
The call request.body.read
returns some JSON. In some cases it contains non-ASCII characters (e.g. the € symbol),
and then I get
Encoding::UndefinedConversionError Exception: "\xE2" from ASCII-8BIT to UTF-8
I'm aware that Sidekiq advises against complex or long job parameters. What would be a best-practice workaround?
What I can think of is to Base64-encode the string before passing it to the job (this would make it even longer for Sidekiq, and I'm not sure whether that would be a problem), or to store the actual JSON text in a DB table and pass only the id of the new row to the job. That would definitely work, but it looks like overkill to me.
Any suggestions?
Sidekiq is going to use JSON.generate
to serialize the job arguments, and request.body.read hands you the body as an ASCII-8BIT (binary) string. Here is what happens to such an ASCII-8BIT
string; you can run this in the console:
arg = "Example with € character".force_encoding('ASCII-8BIT')
JSON.generate([arg])
Encoding::UndefinedConversionError ("\xE2" from ASCII-8BIT to UTF-8)
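The bytes themselves are fine here; only the encoding label on the string is binary. A minimal sketch in plain Ruby to check this (the variable names are illustrative and not from your code):

```ruby
# `raw` stands in for what request.body.read returns: UTF-8 bytes
# carrying an ASCII-8BIT (binary) encoding label.
raw  = "Example with € character".dup.force_encoding(Encoding::ASCII_8BIT)

# Relabel a copy as UTF-8. force_encoding changes only the label,
# it does not convert or touch the bytes.
utf8 = raw.dup.force_encoding(Encoding::UTF_8)

label_before = raw.encoding            # the binary label
same_bytes   = raw.bytes == utf8.bytes # identical byte sequence
has_e2_byte  = raw.bytes.include?(0xE2) # first byte of the € sign
valid_utf8   = utf8.valid_encoding?    # bytes form valid UTF-8
```

The 0xE2 byte is the "\xE2" that shows up in the error message: it is the first byte of the UTF-8 encoding of €.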
One option would be to follow this answer and force the encoding to UTF-8 before you pass it into perform_later. Then it will serialize correctly:
arg = "Example with € character".force_encoding('ASCII-8BIT')
arg.force_encoding('UTF-8')
JSON.generate([arg])
=> "[\"Example with € character\"]"
So you'd want something like:
MyJob.perform_later(request.body.read.force_encoding('UTF-8'))
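Note that force_encoding only relabels the string and does not check the bytes, so a body that is not actually valid UTF-8 would still fail later. A slightly more defensive sketch, assuming you prefer to fail early; utf8_body is a hypothetical helper name, not part of Rails or Sidekiq:

```ruby
require "json"

# Hypothetical helper: relabel a binary request body as UTF-8 and
# raise early if the bytes are not valid UTF-8.
def utf8_body(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, "request body is not valid UTF-8" unless text.valid_encoding?
  text
end

# Simulates request.body.read: String#b returns a binary-labelled copy.
body    = "{\"price\":\"5 €\"}".b
payload = utf8_body(body)

# The relabelled string now serializes as a job argument would.
serialized = JSON.generate([payload])
```

In the controller this would be used as MyJob.perform_later(utf8_body(request.body.read)). If replacing invalid bytes is preferable to raising, String#scrub on the relabelled string is an alternative.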