Tags: azure, apache-spark, pyspark, azure-synapse, azure-log-analytics

Pyspark - Read Log Analytics table through a Synapse Notebook


I'm trying to create a PySpark DataFrame by reading a Log Analytics table from a Synapse notebook.

I tried this code, but without success:

df_lg_tb = spark.read.format("com.microsoft.kusto.spark.datasource") \
    .option("kustoCluster", "https://<workspace-id>.ods.opinsights.azure.com") \
    .option("kustoDatabase", "<my-log-analytics-database-name>") \
    .option("kustoQuery", "AzureActivity|take 10") \
    .option("kustoAADUserId", "<workspace-id>") \
    .option("kustoAADPassword", "<workspace-key>") \
    .load()

Does anyone know another way to connect?

Thanks a lot!


Solution

  • Before running the above code, make sure you have added the Log Analytics workspace connection

    https://ade.loganalytics.io/subscriptions/<subscription_id>/resourcegroups/<resource_group_name>/providers/microsoft.operationalinsights/workspaces/<workspace_name>

    to the Kusto cluster, as shown below.

    [screenshot: Log Analytics workspace connection added to the Kusto cluster]

    Check that you have all the required permissions and roles, then try running the above code again (a sketch of the adjusted connector call follows this list).

    If that doesn't work, you can try the approaches below as workarounds; sketches of both approaches also follow this list.

    • Use the Python SDK to read the table data by passing the query (second sketch below). Go through this documentation's samples to learn more about it.
    • First export the Log Analytics table data to a storage account, then read it into the Synapse notebook by mounting the storage account (third sketch below). Refer to this blog by @Shemer Steinlauf for the detailed steps.
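
Below is a minimal sketch of the first approach, assuming an AAD app registration (service principal) with read access to the workspace instead of the workspace key, and assuming the database exposed through the ADE proxy is the workspace name. The option names follow the azure-kusto-spark connector; check the connector documentation for your version and replace the <...> placeholders with your own values.

# Hedged sketch: Kusto Spark connector pointed at the Log Analytics ADE proxy.
# <app-id>, <app-secret> and <tenant-id> are placeholders for an AAD app
# registration that has been granted read access to the workspace.
cluster_uri = "https://ade.loganalytics.io/subscriptions/<subscription_id>/resourcegroups/<resource_group_name>/providers/microsoft.operationalinsights/workspaces/<workspace_name>"

df_lg_tb = spark.read.format("com.microsoft.kusto.spark.datasource") \
    .option("kustoCluster", cluster_uri) \
    .option("kustoDatabase", "<workspace_name>") \
    .option("kustoQuery", "AzureActivity | take 10") \
    .option("kustoAadAppId", "<app-id>") \
    .option("kustoAadAppSecret", "<app-secret>") \
    .option("kustoAadAuthorityID", "<tenant-id>") \
    .load()

df_lg_tb.show()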
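
For the Python SDK approach, here is a minimal sketch assuming the azure-monitor-query and azure-identity packages are installed on the Spark pool and that the credential you configure can read the workspace; <workspace-id> is the Log Analytics workspace (customer) ID, not its name.

# Hedged sketch: query Log Analytics with the Azure Monitor Query SDK and
# convert the result to a Spark DataFrame via pandas.
from datetime import timedelta

import pandas as pd
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

credential = DefaultAzureCredential()
client = LogsQueryClient(credential)

response = client.query_workspace(
    workspace_id="<workspace-id>",
    query="AzureActivity | take 10",
    timespan=timedelta(days=1),
)

# Each returned table exposes its rows and column names.
table = response.tables[0]
pdf = pd.DataFrame(data=table.rows, columns=table.columns)
df_lg_tb = spark.createDataFrame(pdf)
df_lg_tb.show()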
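
For the export approach, a rough sketch of the read side, assuming a data export rule already writes the table to a storage account as JSON and that a linked service to that account exists in Synapse; the container, folder and linked-service names below are placeholders.

# Hedged sketch: mount the export container with mssparkutils and read the
# exported JSON files into a Spark DataFrame.
from notebookutils import mssparkutils

mssparkutils.fs.mount(
    "abfss://<container>@<storage-account>.dfs.core.windows.net",
    "/loganalytics",
    {"linkedService": "<linked-service-name>"},
)

# Mounted paths are addressed through the synfs scheme and the current job id.
job_id = mssparkutils.env.getJobId()
df_lg_tb = spark.read.json(f"synfs:/{job_id}/loganalytics/<exported-folder>")
df_lg_tb.show()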