
Clearing the Luigi Task Visualizer Cache


I'm testing a pipeline with Luigi and I've noticed strange caching behavior in the task visualizer. Tasks seem to stay in the cache for a set time, sometimes overlapping with tasks from a second run of the pipeline and cluttering the UI. I've also noticed that when two pipelines are run in succession, it takes a while for tasks from the new pipeline to appear. Is there a way to manually reset the cache before each run? Is there a configuration variable that sets how long tasks are cached before they expire?


Solution

  • You can use the remove_delay setting for the scheduler. In your config file:

    [scheduler]
    remove_delay = 10
    

    This is a scheduler setting, so you need to restart luigid for it to take effect.

    From the doc:

    Number of seconds to wait before removing a task that has no stakeholders. Defaults to 600 (10 minutes).

    In my experience, "stakeholders" here seems to mean workers and upstream/downstream dependencies.
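
If you want to set this from a test script rather than editing the file by hand, something like the following might work. It is only a sketch: `set_remove_delay` is a hypothetical helper (not part of Luigi) that uses the standard-library `configparser` to write the `[scheduler]` `remove_delay` key shown above, and it assumes your Luigi config is a plain INI file at a path you pass in.

```python
import configparser


def set_remove_delay(cfg_path, seconds):
    """Set [scheduler] remove_delay in a Luigi-style INI file.

    Hypothetical helper, not part of Luigi. Existing sections and keys
    are kept, but comments in the file are lost because configparser
    does not preserve them when writing.
    """
    # interpolation=None leaves literal '%' characters in other values alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)  # a missing file is simply treated as empty

    if not parser.has_section("scheduler"):
        parser.add_section("scheduler")
    parser.set("scheduler", "remove_delay", str(int(seconds)))

    with open(cfg_path, "w") as fh:
        parser.write(fh)

    # Return the value now stored under [scheduler].
    return parser.getint("scheduler", "remove_delay")
```

Calling `set_remove_delay("luigi.cfg", 10)` would reproduce the config above. The running scheduler does not pick up the change on its own, so luigid still has to be restarted afterwards.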