I have a parent and child pipeline.
My task is to call REST API endpoints such as https://www.alphavantage.co/query?function=INCOME_STATEMENT&symbol=IBM&apikey=demo and store the data in Parquet format in ADLS. For every symbol, such as "IBM", I also need to change the function in the API URL to BALANCE_SHEET and CASH_FLOW, fetch that data, and store it in the respective folders, such as balance sheet and cash flow.
My current pipelines: the parent pipeline has a Lookup activity and a ForEach activity, where the Lookup activity fetches the symbols from a CSV file. The ForEach iterates over the symbols and executes a child pipeline, which has another ForEach activity containing a data flow. I need help iterating the child pipeline over an array such as ["BALANCE_SHEET", "CASH_FLOW", "INCOME_STATEMENT"].
I tried creating an array-type parameter and setting the ForEach activity to iterate over it. However, when I try to pass the current value to the REST dataset, it passes the entire array, and I cannot reference the current item to modify the string and place the Parquet files dynamically, e.g. "BALANCE_SHEET" to "balance sheet".
I also tried creating an array variable, but I cannot reference it in parameterized datasets.
You need to use dataset parameters to achieve your requirement.
Create a string-type parameter named symbol in the source dataset (REST) and use it in the Relative URL of the dataset:
&symbol=@{dataset().symbol}&apikey=demo
Similarly, create a parameter in the Parquet dataset and use it in the folder name.
Use the two datasets as the source and sink of the data flow. Inside the inner ForEach, provide values for these parameters in the data flow activity.
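Here, for the folder name, use the expression @toLower(replace(item(),'_','')). This turns BALANCE_SHEET into balancesheet; use ' ' as the replacement string instead if you want balance sheet.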
The inner ForEach will iterate through the provided array ["BALANCE_SHEET", "CASH_FLOW", "INCOME_STATEMENT"], and the parameters will build the necessary URLs and folders dynamically.
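To check what the two loops should produce before wiring them up in the pipeline, here is a rough Python sketch of the same logic. It is not ADF code. The names folder_name and build_requests are made up for illustration, and it assumes the function value is passed to the REST dataset through a parameter in the same way as symbol.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"
FUNCTIONS = ["BALANCE_SHEET", "CASH_FLOW", "INCOME_STATEMENT"]


def folder_name(function, separator=""):
    """Python equivalent of @toLower(replace(item(),'_','')).

    Pass separator=" " to get "balance sheet" instead of "balancesheet".
    """
    return function.replace("_", separator).lower()


def build_requests(symbols, functions=FUNCTIONS, apikey="demo"):
    """Yield one (url, folder) pair per symbol/function combination."""
    for symbol in symbols:          # outer ForEach in the parent pipeline
        for function in functions:  # inner ForEach in the child pipeline
            query = urlencode(
                {"function": function, "symbol": symbol, "apikey": apikey}
            )
            yield f"{BASE_URL}?{query}", folder_name(function)


# Hypothetical symbol list; in the pipeline this comes from the Lookup activity.
for url, folder in build_requests(["IBM"]):
    print(folder, url)
```

Each pair corresponds to one run of the data flow: the URL is what the REST source should request, and the folder is where the Parquet sink should write.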