I have just spun up a Dataform project that is connected to a Google Cloud project. My question is: how do I structure Dataform to allow for multiple datasets, each containing its own bespoke SQL code?
Example structure:
dataset 1
    sql_1
    sql_2
    sql_3
dataset 2
    sql_1
    sql_2
    sql_3
From what I have read so far, it seems I can only interact with one dataset at a time. Is that correct?
The dataform.json file allows me to set the following:
{
"defaultSchema": "test_datasets",
"assertionSchema": "dataform_assertions",
"warehouse": "bigquery",
"defaultDatabase": "project_name",
"defaultLocation": "EU"
}
However, if we have more than one dataset in a project, do I need to alter the JSON file to set another dataset? Or is there a better way to deal with GCP projects that have multiple datasets?
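To do this, specify a schema in the config block of each SQLX file by adding the key-value pair schema: "your-schema-name". This overrides the defaultSchema set in dataform.json for that file only, so dataform.json itself does not need to change. Refer here for more: https://docs.dataform.co/guides/datasets/publish#overriding-a-datasets-schema-or-name .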
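As a minimal sketch of the override described above, a SQLX file might look like the following. The file path definitions/dataset_1/sql_1.sqlx, the schema name dataset_1, and the placeholder query are all assumptions made for illustration; they are not taken from the question's actual project.

```sqlx
config {
  type: "table",
  // Assumed dataset name; overrides defaultSchema ("test_datasets") for this file only.
  schema: "dataset_1"
}

-- Placeholder query; replace with the bespoke SQL for this table.
SELECT 1 AS example_column
```

Under these assumptions, the table would be published as project_name.dataset_1.sql_1 instead of landing in test_datasets. A second file with schema: "dataset_2" in its config block would publish to the other dataset in the same way. Keeping the files in per-dataset folders under definitions/ is only an organisational choice; the schema value in the config block is what decides the target dataset.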