pyspark  azure-cosmosdb  azure-databricks

Batch writes to Cosmos DB from Databricks


Can someone explain what the double asterisk (**) does when writing to Cosmos DB from Databricks?

# Write configuration
writeConfig = {
    "Endpoint": "https://doctorwho.documents.azure.com:443/",
    "Masterkey": "YOUR-KEY-HERE",
    "Database": "DepartureDelays",
    "Collection": "flights_fromsea",
    "Upsert": "true"
}

# Write to Cosmos DB from the flights DataFrame
flights.write.format("com.microsoft.azure.cosmosdb.spark").options(
    **writeConfig).save()

Thanks


Solution

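  • The ** operator unpacks a dictionary into keyword arguments, so every key/value pair in writeConfig is passed to options() as a separate named argument. (A single * does the equivalent for a list or tuple, which is unpacked into positional arguments.)

    So rather than chaining one option() call per setting: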

    flights.write.format("com.microsoft.azure.cosmosdb.spark")\
                 .option("Endpoint", "https://doctorwho.documents.azure.com:443/")\
                 .option("Upsert", "true")\
                 .option("Masterkey", "YOUR-KEY-HERE")\
                 ...etc 
    

    You put all your settings in a dictionary and pass it in a single call:

    flights.write.format("com.microsoft.azure.cosmosdb.spark").options(
        **yourdict).save()
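
To see the unpacking without a Spark cluster, here is a minimal plain-Python sketch. The `options` function below is a hypothetical stand-in that only returns the keyword arguments it receives; it is not the real Spark writer method, and the dictionary is trimmed to two of the settings from the question.

```python
def options(**kwargs):
    """Hypothetical stand-in for a writer's options(); returns the kwargs it was given."""
    return kwargs


# A trimmed copy of the configuration from the question.
write_config = {
    "Endpoint": "https://doctorwho.documents.azure.com:443/",
    "Upsert": "true",
}

# ** turns each key/value pair into a keyword argument ...
unpacked = options(**write_config)

# ... so the call above is equivalent to spelling the arguments out by hand.
explicit = options(
    Endpoint="https://doctorwho.documents.azure.com:443/",
    Upsert="true",
)

print(unpacked == explicit)
```

Both calls hand the function the same named arguments, which is why the dictionary form and the chained option() form configure the write identically.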