
Re-use similar Luigi tasks


I have a luigi task which reads a .sql file and outputs to BigQuery.

Is there any way to reuse that same task with a different .sql file without copying the whole Luigi task? In other words, I want to create instances of a template Luigi task.

import luigi

class run_sql(luigi.Task):
    sql_file = 'path/to/sql/file'  # This is the only bit of code that changes
    def complete(self):
        ...
    def requires(self):
        ...
    def run(self):
        ...

Solution

  • Building on @matagus' answer, you can also subclass RunSql and set the SQL file there; the subclass inherits complete(), requires(), and run() from the parent class.

    class RunSqlFile(RunSql):
        sql_file = '/path/to/file.sql'
    

    Or you can define sql_file as a @property in the subclass, so that it can call helpers defined on the RunSql class. I often do this to set a directory, or other configuration data, in the parent class and then reference it in subclasses.

    import os

    import luigi


    class RunSql(luigi.Task):
        sql_file = luigi.Parameter()
    
        def get_file(self, name):
            # Resolve a file name against the shared SQL directory.
            default_dir = '/path/to/sql/dir'
            return os.path.join(default_dir, name)
    
        def requires(self):
            ...
    
    
    class RunSqlFile(RunSql):
    
        @property
        def sql_file(self):
            return self.get_file("query.sql")
    

    RunSqlFile will then behave as if you had run RunSql with --sql-file /path/to/sql/dir/query.sql