Tags: pyspark, azure-synapse, delta-lake

Synapse Delta tables - reading the latest version of a record


In Synapse, Delta tables are great for ETL because they allow merge statements on Parquet files. I was wondering how to use Delta tables to our advantage for reading the data as well, when we load Silver to Gold. Is there a way, in Synapse, to read the 'latest' version of each key?

The link below describes an approach for Azure Databricks, but I could not get it to work in a PySpark notebook. Is there a similar approach for Synapse Delta tables?

How to fetch the latest version number of a delta table


Solution

  • Yes, I agree with wBob.

    I reproduced the same thing in an Azure Synapse environment with a Delta table.

    Try this code to read the latest changes to a record via the change data feed:

    # In a Synapse notebook the `spark` session is already created for you
    df1 = (spark.read
           .format("delta")
           .option("readChangeFeed", "true")     # requires CDF enabled on the table
                                                 # (delta.enableChangeDataFeed = true)
           .option("startingVersion", "latest")  # only changes from the newest version on
           .table("sample_table"))

    df1.show()
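    The linked Databricks approach (fetching the latest version number from the table history and pinning a read to it) also works on a Synapse Spark pool, since it uses plain Delta Lake APIs. A minimal sketch, assuming the `delta` Python package is available in the pool; the function name and path argument are illustrative, not part of the original answer:

    ```python
    from delta.tables import DeltaTable

    def read_latest_snapshot(spark, path):
        """Read the newest committed version of a Delta table at `path`.

        history(1) returns the most recent commit first, so its
        'version' field is the latest version number; versionAsOf
        then pins the read to exactly that commit.
        """
        latest = DeltaTable.forPath(spark, path).history(1).collect()[0]["version"]
        return (spark.read
                     .format("delta")
                     .option("versionAsOf", latest)
                     .load(path))
    ```

    Pinning to an explicit version this way keeps the Silver-to-Gold read reproducible even if new commits land while the job runs.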
    

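    Separately, if the goal in the Silver-to-Gold load is simply the latest row per business key (rather than the table's change feed), a window function works on any Spark pool, Synapse included. A sketch, assuming hypothetical key and ordering columns (e.g. a load timestamp or version number):

    ```python
    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    def latest_per_key(df, key_col, order_col):
        """Keep only the most recent row for each key.

        Rows are ranked within each key by `order_col` descending,
        so rank 1 is the latest; every older row is dropped.
        """
        w = Window.partitionBy(key_col).orderBy(F.col(order_col).desc())
        return (df.withColumn("_rn", F.row_number().over(w))
                  .filter(F.col("_rn") == 1)
                  .drop("_rn"))
    ```

    This gives one row per key regardless of how many times the record was merged upstream.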