I would like to call an API to enrich an existing dataset.
The existing dataset is a CSVDataSet configured in the catalog.
Now I would like to create a node that enriches the CSVDataSet with data from the API, which I have to call for every row in the CSV file, and then saves the result to a database (SQLTableDataSet). My approach is to create an APIDataSet entry in the catalog and provide it as an input to the node, alongside the CSVDataSet.
The issue is that the APIDataSet is static (in general, DataSets seem to be very static). I need to call the load function at runtime, within the node, for every entry in the CSV file.
I didn't find a way to do this. Is it just a bad approach? Do I have to call the API within the node instead of creating an APIDataSet?
I have done this in my GDALRasterDataSet implementation. The idea is that if you need to enrich a dataset on the go, you can override the load() method in a custom dataset and pass additional parameters there.
You can see an implementation here and an example of usage here.
The only extra thing you need to do is to rewrite the load() method to accept kwargs (line 143) and write your own _load method that enriches your dataset. Everything else is boilerplate.
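To illustrate the pattern, here is a minimal sketch, not the GDALRasterDataSet code itself. The names EnrichingDataSet, fetch, key and enrich_rows are hypothetical, and the class is standalone so it runs without Kedro installed; in a real project I would expect it to subclass kedro.io.AbstractDataSet and also implement _save and _describe. The API call is injected as a plain callable so no real endpoint is assumed.

```python
from typing import Any, Callable, Dict, List


class EnrichingDataSet:
    """Standalone stand-in for a custom dataset (hypothetical name).

    Assumption: in a Kedro project this would subclass
    kedro.io.AbstractDataSet and also implement _save and _describe.
    """

    def __init__(self, fetch: Callable[[str], Dict[str, Any]]) -> None:
        # `fetch` is a hypothetical callable that wraps the real API request.
        self._fetch = fetch

    def load(self, **kwargs: Any) -> Dict[str, Any]:
        # Accept runtime parameters and forward them to _load.
        return self._load(**kwargs)

    def _load(self, key: str) -> Dict[str, Any]:
        # Fetch the extra fields for a single record.
        return self._fetch(key)


def enrich_rows(rows: List[Dict[str, Any]], api: EnrichingDataSet) -> List[Dict[str, Any]]:
    """Node-style function: call the dataset once per row and merge the result."""
    return [{**row, **api.load(key=row["id"])} for row in rows]


if __name__ == "__main__":
    # Stub in place of a real API, for demonstration only.
    dataset = EnrichingDataSet(fetch=lambda key: {"score": len(key)})
    print(enrich_rows([{"id": "a"}, {"id": "bb"}], dataset))
```

The sketch calls the dataset object directly; how the instance reaches your node (for example through the catalog) is not shown here and depends on your setup.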