azure, parquet, azure-synapse, azure-synapse-analytics

Emoticon or emoji display issue in Azure Synapse for Parquet data


I have Parquet data that contains some emojis. When I open it in any online Parquet viewer, the emojis are displayed correctly, but when I query the same data in Synapse, it shows (?? or \uD83E\uDD73) instead of the emojis.

Are there any suggestions for fixing this?

I expect to see the same emojis in the Synapse workspace as well.


Solution

    • You should be able to view emojis stored in Delta and Parquet format in Synapse notebooks.
    • Delta Lake is a storage layer that runs on top of your existing data lake and is fully compatible with Synapse.
    • To make sure the emojis display properly in Synapse notebooks, use a notebook environment that supports Unicode characters and has the fonts needed to render emojis.
    • Also make sure the data is stored correctly in Delta format and that the appropriate encoding is used when reading and writing it.
    • The same applies to Parquet: Azure Synapse Analytics supports storing and processing several data formats, including Parquet, so you can view and work with emojis in Parquet files.
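For reference, the \uD83E\uDD73 shown in the question is not corrupted data: it is the UTF-16 surrogate pair for a single code point, U+1F973 (🥳), printed as escape text. A minimal Python sketch of how such a pair maps back to the emoji is below. The helper name `decode_surrogate_pairs` is hypothetical, and the sketch only illustrates the encoding; it does not establish where in Synapse the escaping happens.

```python
import re

# Matches a literal "\uXXXX\uXXXX" escape where the first value is a high
# surrogate (D800-DBFF) and the second is a low surrogate (DC00-DFFF).
_SURROGATE_PAIR = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
)


def decode_surrogate_pairs(text: str) -> str:
    """Replace escaped UTF-16 surrogate pairs with the character they encode.

    Hypothetical helper for illustration only.
    """
    def _join(match: re.Match) -> str:
        high = int(match.group(1), 16)
        low = int(match.group(2), 16)
        # Standard UTF-16 rule for combining a surrogate pair.
        code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        return chr(code_point)

    return _SURROGATE_PAIR.sub(_join, text)


# The raw string keeps the backslashes literal, as they appear in the question.
decoded = decode_surrogate_pairs(r"\uD83E\uDD73")
print(decoded, f"U+{ord(decoded):04X}")
```

A plain ?? in the output is different: it usually means the character was replaced during a lossy conversion, so it cannot be recovered from the displayed text this way.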

    As an example, I tried the approach below: writing the emoji data to Parquet and Delta format and then reading it.

    from pyspark.sql import SparkSession

    # Reuse the notebook's Spark session, or create one if none exists.
    spark = SparkSession.builder \
        .appName("EmojisToDeltaExample") \
        .getOrCreate()

    # Sample rows: an emoji and a short description.
    data = [('😊', 'This is a smiling emoji'),
            ('❤️', 'This is a heart emoji'),
            ('👍', 'This is a thumbs-up emoji')]
    df = spark.createDataFrame(data, ['Emoji', 'Description'])

    # Write the DataFrame to ADLS Gen2 as Parquet.
    parquet_file_path = "abfss://[email protected]/example.parquet"
    df.write.parquet(parquet_file_path)

    # Write the same DataFrame in Delta format.
    delta_file_path = "abfss://[email protected]/example_delta"
    df.write.format("delta").save(delta_file_path)
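To complete the round trip described above, the written data can be read back and inspected in the same notebook. The following is only a sketch, assuming an active `spark` session and paths like the `parquet_file_path` and `delta_file_path` variables from the example; `read_back` and `emoji_code_points` are hypothetical helper names, not part of Synapse or PySpark. Listing the code points makes it possible to tell whether the stored strings are intact even when the display layer cannot render them.

```python
def read_back(spark, path, fmt="parquet"):
    """Load data previously written with df.write, in the given format.

    Hypothetical helper; `spark` is assumed to be an active SparkSession.
    """
    return spark.read.format(fmt).load(path)


def emoji_code_points(df, column="Emoji"):
    """Collect a string column and list the Unicode code points of each value.

    Hypothetical helper; intended for small DataFrames only, because
    collect() brings every row to the driver.
    """
    return [
        (row[column], [f"U+{ord(ch):04X}" for ch in row[column]])
        for row in df.select(column).collect()
    ]


# Usage in a notebook (assumes the variables from the example exist):
# parquet_df = read_back(spark, parquet_file_path, "parquet")
# delta_df = read_back(spark, delta_file_path, "delta")
# parquet_df.show(truncate=False)
# print(emoji_code_points(delta_df))
```

If the code points come back as expected but the emojis still do not render, the data itself is fine and the issue is on the display side.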
    

    I queried the Parquet data from ADLS as an external table.

    I also used the Parquet file in a Copy activity to preview the data.
