python, databricks, open-telemetry

Events lost when sending them to Application Insights from Databricks notebooks


I have a job in Databricks that runs multiple tasks (some in parallel, others sequentially), and I currently send telemetry using OpenCensus.

Because support for OpenCensus ends on 30 September, I'm migrating to the Azure Monitor OpenTelemetry Python Distro, but with my current changes not all telemetry events are sent.

I've installed the following libraries using pip:

azure-monitor-events-extension==0.1.0
azure-monitor-opentelemetry==1.6.1

In a shared functions notebook, I have the following functions:

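from azure.monitor.events.extension import track_event
from azure.monitor.opentelemetry import configure_azure_monitor


def get_applicationinsights_connection_string():
    # dbutils is provided by the Databricks notebook runtime;
    # get_parameter is a helper defined elsewhere (not shown here).
    return dbutils.secrets.get(get_parameter('subscription_key'), 'application-insights-connection-string')


def create_azure_connection():
    connection_string = get_applicationinsights_connection_string()
    configure_azure_monitor(connection_string=connection_string)


def send_custom_event(event_name, message_dict):
    print(f'Tracking "{event_name}"')
    # Note: this reconfigures Azure Monitor on every call.
    create_azure_connection()
    track_event(event_name, message_dict)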

Then, in each task of the job, I call the function send_custom_event(event_name, message_dict), so that the telemetry of each task is sent to the table customEvents in Application Insights.

The issue I'm facing is that not all events are sent. Sometimes I receive the event for the first task but not for some of the parallel tasks. Other times I receive neither the event for the first task nor the events for some of the parallel tasks.

Why is this happening? Is there a way to call flush() to force the events to be sent? That option was available with OpenCensus, where sending events worked perfectly.


Solution

  • I eventually solved my problem by calling force_flush(), as suggested here: https://github.com/Azure/azure-sdk-for-python/issues/37228
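
For reference, here is a minimal sketch of what the force_flush() fix can look like. It is not copied from the linked issue, so the exact code there may differ. It assumes that track_event from azure-monitor-events-extension emits through the OpenTelemetry logs pipeline, so the provider to flush is the one returned by opentelemetry._logs.get_logger_provider(). The likely cause of the lost events is that records are batched and exported in the background, so a short-lived task can finish before its batch is sent. The names ensure_configured, _configured, connection_string and flush_timeout_ms are hypothetical and only for illustration. Configuring Azure Monitor once per process is an extra precaution on my side, not something the issue is confirmed to require.

```python
# Sketch only: helper names and parameters are illustrative assumptions.
_configured = False  # hypothetical per-process flag


def ensure_configured(connection_string):
    """Call configure_azure_monitor once per Python process."""
    global _configured
    if not _configured:
        # Imported lazily so this cell can be defined before the package is installed.
        from azure.monitor.opentelemetry import configure_azure_monitor

        configure_azure_monitor(connection_string=connection_string)
        _configured = True


def send_custom_event(event_name, message_dict, connection_string,
                      flush_timeout_ms=5000):
    """Track a custom event, then block until pending telemetry is exported.

    Returns True if the flush reported success, False otherwise.
    """
    from azure.monitor.events.extension import track_event
    from opentelemetry._logs import get_logger_provider

    ensure_configured(connection_string)
    print(f'Tracking "{event_name}"')
    track_event(event_name, message_dict)

    # Assumption: the events extension logs through the global logger provider.
    provider = get_logger_provider()
    flush = getattr(provider, "force_flush", None)
    if not callable(flush):
        # No SDK provider is configured, so there is nothing to flush.
        return False
    return bool(flush(timeout_millis=flush_timeout_ms))
```

In each task I would then call send_custom_event(event_name, message_dict, get_applicationinsights_connection_string()) and check the return value; a False result suggests the export did not finish within the timeout.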