I have a luigi task which reads a .sql file and outputs to BigQuery.
Is there any way to reuse that same task with a different .sql file without copying the whole luigi task? In other words, I want to create instances of a template luigi task.
    class run_sql(luigi.Task):
        sql_file = 'path/to/sql/file'  # This is the only bit of code that changes

        def complete(self):
            ...

        def requires(self):
            ...

        def run(self):
            ...
Building off of @matagus' answer, you can also subclass RunSql to define a sql file, using the complete(), requires(), and run() methods of the parent class.
    class RunSqlFile(RunSql):
        sql_file = '/path/to/file.sql'
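To see why the subclass needs nothing but the attribute, here is a minimal sketch that uses a plain Python stand-in rather than luigi.Task, so it runs without luigi installed. TemplateTask, DailyReport, WeeklyReport, and both paths are hypothetical names chosen for illustration; the real run() would read the file and write to BigQuery.

```python
class TemplateTask:
    """Stand-in for the shared task; this is NOT a real luigi.Task."""

    sql_file = None  # each subclass overrides this one attribute

    def run(self):
        # Shared logic is written once and only ever reads self.sql_file.
        return "would run query from {}".format(self.sql_file)


class DailyReport(TemplateTask):
    sql_file = "/path/to/daily.sql"  # hypothetical path


class WeeklyReport(TemplateTask):
    sql_file = "/path/to/weekly.sql"  # hypothetical path


daily_message = DailyReport().run()
weekly_message = WeeklyReport().run()
```

Both subclasses share the parent's run() unchanged, so a new .sql file costs two lines rather than a copy of the whole task.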
Or you can use the @property decorator to reference attributes of the RunSql class. I often do this to set a directory, or other configuration data, in the parent class, then reference it in subclasses.
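    import os

    import luigi

    class RunSql(luigi.Task):
        sql_file = luigi.Parameter()

        def get_file(self, name):
            # Resolve a file name against the shared SQL directory
            default_dir = '/path/to/sql/dir'
            return os.path.join(default_dir, name)

        def requires(self):
            ...

    class RunSqlFile(RunSql):
        @property
        def sql_file(self):
            # Replaces the parameter with a fixed path for this subclass
            return self.get_file("query.sql")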
And that will act as if you'd instantiated the class with --sql-file /path/to/sql/dir/query.sql
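
The path resolution in that example can be checked without luigi. This is a minimal sketch that assumes a plain Python base class in place of luigi.Task; SqlBase and QueryTask are hypothetical names, and only the get_file() logic is taken from the code above.

```python
import os


class SqlBase:
    """Stand-in parent holding shared configuration; NOT a real luigi.Task."""

    default_dir = "/path/to/sql/dir"

    def get_file(self, name):
        # Join the shared directory with a subclass-specific file name.
        return os.path.join(self.default_dir, name)


class QueryTask(SqlBase):
    @property
    def sql_file(self):
        # Computed on each access from the parent's configuration.
        return self.get_file("query.sql")


resolved_path = QueryTask().sql_file
```

Because the directory lives only in the parent, changing it there updates every subclass that builds its path through get_file().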