I would like to use Cloud Data Fusion to load data from multiple REST endpoints and store it in BigQuery in multiple tables (one per endpoint).
I got this working using the HTTP plugin as the source and a BigQuery sink. However, I have to define a separate pipeline for each endpoint, which seems like overkill.
I noticed that Data Fusion has a BigQuery Multi Table sink, so I expected I could connect multiple HTTP sources to it and BigQuery would create a table per endpoint and load the data into it. However, when I run the pipeline I get the error "Two different input schema were set for the stage BigQuery Multi Table". Apparently this is because every endpoint has a different schema.
My questions are: Is the BigQuery Multi Table sink appropriate for solving my problem? If yes, how should I configure it to make it work? If not, is there any other way to do this besides defining a pipeline per endpoint?
The BigQuery Multi Table Sink works with two types of inputs:

1. Data coming from a single source that already tags each record with its destination table (for example, the Multiple Database Tables source).
2. Data from multiple sources that all share the same schema, with a Split Field used to determine the destination table.
Since the input schemas are different in your case, this plugin is not guaranteed to work in your scenario.
If there is a common field in both schemas that you can use as the Split Field, you can also try enabling the Allow flexible schemas in Output setting on your BigQuery Multi Table sink. However, as mentioned above, this is not guaranteed to work.
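To make the Split Field behavior concrete, here is a minimal sketch (in plain Python, not plugin configuration) of the routing idea: each record carries a field whose value names the destination table, and the sink groups records by that value. The field name `tablename` and the record shapes are assumptions for illustration, not taken from the Data Fusion plugin itself.

```python
from collections import defaultdict

def route_by_split_field(records, split_field="tablename"):
    """Group records into per-table batches, mimicking how a multi-table
    sink uses a Split Field to pick the destination table."""
    tables = defaultdict(list)
    for record in records:
        payload = dict(record)
        # The value of the split field decides which table the record goes to;
        # the field itself is not written to the destination in this sketch.
        table = payload.pop(split_field)
        tables[table].append(payload)
    return dict(tables)

records = [
    {"tablename": "users", "id": 1, "name": "alice"},
    {"tablename": "orders", "id": 10, "total": 9.99},
    {"tablename": "users", "id": 2, "name": "bob"},
]

batches = route_by_split_field(records)
print(sorted(batches))        # ['orders', 'users']
print(len(batches["users"]))  # 2
```

The key point this illustrates: the split field must exist in every input record with a consistent meaning, which is why the sink struggles when each endpoint produces an entirely different schema.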