Tags: logstash, airflow, airflow-scheduler, airflow-2.x

Remote logging doesn't work after upgrading Airflow version


We are using Airflow v2.1.4 with Kubernetes executors.

Our k8s cluster pods' stdout is automatically shipped via Filebeat -> Logstash -> Elasticsearch (ELK).

In logstash we are creating the log_id field:

 mutate {
    copy => { "[log][offset]" => "offset" }
    # change format from "2020-04-13T17_50_00_plus_00_00" to "2020-04-13T17:50:00+00:00"
    gsub => [ "[kubernetes][labels][execution_date]", "_plus_", "+" ]
    gsub => [ "[kubernetes][labels][execution_date]", "_", ":" ]
    # build the log_id Airflow expects: {dag_id}-{task_id}-{execution_date}-{try_number}
    add_field => [ "log_id", "%{[kubernetes][labels][dag_id]}-%{[kubernetes][labels][task_id]}-%{[kubernetes][labels][execution_date]}-%{[kubernetes][labels][try_number]}" ]
 }
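The two `gsub` steps plus the `add_field` template above can be sketched in Python (a minimal illustration of the transformation, not Airflow or Logstash code; the function names are my own):

```python
def normalize_execution_date(label: str) -> str:
    """Undo the label-safe encoding of execution_date, mirroring the two
    gsub steps: '_plus_' -> '+', then every remaining '_' -> ':'."""
    return label.replace("_plus_", "+").replace("_", ":")

def build_log_id(dag_id: str, task_id: str,
                 execution_date_label: str, try_number: str) -> str:
    # Mirrors the add_field template: {dag_id}-{task_id}-{execution_date}-{try_number}
    return f"{dag_id}-{task_id}-{normalize_execution_date(execution_date_label)}-{try_number}"

print(build_log_id("my_dag", "extract", "2020-04-13T17_50_00_plus_00_00", "1"))
# -> my_dag-extract-2020-04-13T17:50:00+00:00-1
```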

Now the logs are available via the Airflow webserver - so far so good.

Then we upgraded Airflow to the latest version, v2.2.3, and lost our remote-logging capability. Digging around, we found that the execution_date label was removed in recent versions (replaced with run_id). We saw indications of this in the release notes and in the following PRs: https://github.com/apache/airflow/pull/19593/files, https://github.com/apache/airflow/pull/16666.

Seems like the run_id value isn't precise enough to reconstruct the exact execution date (example: scheduled__2022-01-09T0020000000-8c05ec558).

How can I get my logs back in the latest version? Any suggestions for a workaround, or another approach to retrieving the logs?

EDIT: seems like the documentation was updated (but the cfg example was not) to:

log_id_template = {dag_id}-{task_id}-{run_id}-{try_number}

I'll try that and update.
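For reference, the corresponding remote-logging section in airflow.cfg would then look roughly like this (a sketch assuming the `[elasticsearch]` section name from the Airflow docs; verify the option names against your version):

```ini
[elasticsearch]
# Airflow >= 2.2: run_id replaces execution_date in the default template
log_id_template = {dag_id}-{task_id}-{run_id}-{try_number}
```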


Solution

  • I decided to take a different approach and use JSON logs instead. That way I don't have to deal with all the transformations needed to get the log_id ready.

    See more about it in the following answer: Airflow wrong log_id format.
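    A sketch of what enabling JSON logging looks like with Airflow's built-in Elasticsearch options (hedged: option names taken from the Airflow `[elasticsearch]` config reference, so check them against your installed version):

    ```ini
    [elasticsearch]
    # emit task logs to stdout as JSON instead of plain text
    write_stdout = True
    json_format = True
    json_fields = asctime, filename, lineno, levelname, message
    ```

    With json_format enabled, each log line already carries the identifying fields, so no Logstash-side reconstruction of log_id is needed.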