python, google-bigquery, looker-studio

Emoji are corrupted when uploading to BigQuery


I'm having trouble uploading emoji data to BigQuery using Python.

This is a sample of the data I'm trying to upload to BigQuery:

 {"emojiCharts":{"emoji_icon":"\ud83d\udc4d","repost": 4, "doc": 4, "engagement": 0, "reach": 0, "impression": 0}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\udc49","repost": 4, "doc": 4, "engagement": 43, "reach": 722, "impression": 4816}} 
 {"emojiCharts":{"emoji_icon":"\u203c","repost": 4, "doc": 4, "engagement": 0, "reach": 0, "impression": 0}} 
 {"emojiCharts":{"emoji_icon":"\ud83c\udf89","repost": 5, "doc": 5, "engagement": 43, "reach": 829, "impression": 5529}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude34","repost": 5, "doc": 5, "engagement": 222, "reach": 420, "impression": 2805}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude31","repost": 3, "doc": 3, "engagement": 386, "reach": 2868, "impression": 19122}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\udc4d\ud83c\udffb","repost": 5, "doc": 5, "engagement": 43, "reach": 1064, "impression": 7098}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude3b","repost": 3, "doc": 3, "engagement": 93, "reach": 192, "impression": 1283}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude2d","repost": 6, "doc": 6, "engagement": 212, "reach": 909, "impression": 6143}} 
 {"emojiCharts":{"emoji_icon":"\ud83e\udd84","repost": 8, "doc": 8, "engagement": 313, "reach": 402, "impression": 2681}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude18","repost": 7, "doc": 7, "engagement": 0, "reach": 8454, "impression": 56366}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude05","repost": 5, "doc": 5, "engagement": 74, "reach": 1582, "impression": 10550}} 
 {"emojiCharts":{"emoji_icon":"\ud83d\ude04","repost": 5, "doc": 5, "engagement": 73, "reach": 3329, "impression": 22206}}

The issue is that BigQuery cannot display emoji encoded like this (\ud83d\ude04); it only displays those in this format (\u203c).

Even though the field is a STRING, it shows two black diamonds. Why can't BigQuery display the emoji as a string without it first being converted to the actual emoji?

Questions:

Is there any way to upload emoji to BigQuery so that they load correctly? The data will be used in Google Data Studio.

Should I manually replace (hard-code) every emoji code with an accepted one, and if so, which format is accepted?


Solution

  • As user 'numeral' mentions in their comment:

    Check out charbase.com/1f618-unicode-face-throwing-a-kiss. What you want is to convert the JavaScript escape characters to actual Unicode data.

    In other words, you need to change the encoding of the emoji so that each one is represented as a single character:

    SELECT "\U0001f604 \U0001f4b8"
    --   , "\ud83d\udcb8"
    --   , "\ud83d\ude04"
    

    The 2nd and 3rd lines (commented out above) fail with an error like Illegal escape sequence: Unicode value \ud83d is invalid at [2:7], but the first line displays correctly in both BigQuery and Data Studio.
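On the Python side, the conversion described above could look roughly like the sketch below. It is not taken from the original answer: the helper names `combine_surrogates` and `to_bigquery_escape` are hypothetical, and it assumes the rows are newline-delimited JSON like the sample in the question. The idea is to merge each surrogate pair into one code point and then write the file as real UTF-8 rather than as \uXXXX escapes.

```python
import json


def combine_surrogates(text: str) -> str:
    """Merge UTF-16 surrogate pairs held in a str into single code points.

    Raises UnicodeDecodeError if the text contains an unpaired surrogate.
    """
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


def to_bigquery_escape(text: str) -> str:
    """Render non-ASCII characters as 4-digit or 8-digit Unicode escapes,
    the form that worked in the SELECT statement above."""
    parts = []
    for ch in combine_surrogates(text):
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)              # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            parts.append(f"\\u{cp:04x}")  # BMP character, e.g. \u203c
        else:
            parts.append(f"\\U{cp:08x}")  # astral character, e.g. \U0001f604
    return "".join(parts)


# One row from the question, shortened for illustration.
raw_line = '{"emojiCharts":{"emoji_icon":"\\ud83d\\ude04","repost": 5}}'

row = json.loads(raw_line)  # json.loads already joins valid surrogate pairs
row["emojiCharts"]["emoji_icon"] = combine_surrogates(row["emojiCharts"]["emoji_icon"])

# ensure_ascii=False keeps the emoji as a real character instead of
# re-escaping it as a surrogate pair.
utf8_line = json.dumps(row, ensure_ascii=False)
sql_literal = to_bigquery_escape(row["emojiCharts"]["emoji_icon"])
```

When writing `utf8_line` to a file for a load job, open the file with `encoding="utf-8"` so the emoji is stored as UTF-8 bytes. `sql_literal` is only needed if the value is pasted into a query string.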

    Additional thoughts about this: