Convert varbinary to varchar with a given encoding in Presto SQL and AWS Athena


I'm using AWS Athena.

I have a string field that holds the base64 encoding of a DOMString produced by JavaScript's btoa (so the decoded bytes are not a UTF-8 string; btoa emits one byte per character, i.e. Latin-1/ISO-8859-1).

So the string Fútbol España is stored as Rvp0Ym9sIEVzcGHxYQ== (and not as RsO6dGJvbCBFc3Bhw7Fh, which is the base64 of its UTF-8 encoding).

How can I decode this string in AWS Athena (Presto) SQL? If I use

select from_utf8(from_base64('Rvp0Ym9sIEVzcGHxYQ=='))

I get F�tbol Espa�a. Is there a from_ascii or something similar that accepts a varbinary and an encoding and performs the decode?


Solution

  • Unfortunately, I don't think there is a way to do this in Presto today, but I filed an issue to add it: https://github.com/prestosql/presto/issues/1035
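
    A possible workaround in the meantime, offered as a sketch rather than a built-in decoder: it assumes the bytes are single-byte Latin-1 (ISO-8859-1), which is what btoa produces, so each byte value equals the Unicode code point of its character. The query hex-encodes the varbinary, splits the hex into byte pairs, converts each pair to a code point with chr, and joins the result. It relies on lambda support (transform), so it also assumes an engine version that has it.

```sql
-- Sketch: decode Latin-1 bytes to varchar without a dedicated decode function.
-- Assumption: every byte is one Latin-1 character (true for btoa output).
SELECT array_join(
         transform(
           -- to_hex yields two hex digits per byte; '..' splits them into pairs
           regexp_extract_all(to_hex(from_base64('Rvp0Ym9sIEVzcGHxYQ==')), '..'),
           -- parse each pair as base 16 and map the byte value to its code point
           h -> chr(from_base(h, 16))
         ),
         ''
       ) AS decoded;
```

    This does not generalize to multi-byte encodings such as UTF-16 or Windows-1252 bytes in the 0x80-0x9F range, where byte values and code points no longer coincide.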