I'm trying to import a .csv
file into a Microsoft SQLServer database using Azure Data Factory. The .csv
file headers look like this
VehicleVIN, VehicleMake, VehicleModel, CustomerName, CustomerAddress
I've sinked the VehicleVIN
, VehicleMake
and VehicleModel
into a VehicleTable
. I've sinked the CustomerRef
, CustomerName
and CustomerAddress
into a CustomerTable
.
Since this is a many to many relationship between those 2 entities, I need to add a third table named VehicleCustomerTable
(Id
, VehicleId
, CustomerId
) where I should save the id's of the entities that I've sinked into VehicleTable
and CustomerTable
.
The problem is that I have no idea of how to associate them in the same data flow. Do I need another data flow for this one? If a separate data flow is needed, how do I know what customer belongs to what vehicle? I've already built a data flow that sinks the data to the corresponding tables but the problem is the association of these 2 entities.
Below is the input data.
Here, I added new column CustomerRef using surrogatekey
since you not given in input.
Use customerid if you have in your column.
Next, I have done group by using Aggregate component on VehicleVIN and CustomerRef.
Then for aggregate i just used tmp as column and max(CustomerRef)
as value, later I unselected this column in select component. This done just to avoid error.
Below are the settings.
Under Groub by
Under Aggregates
Next, add surrogatekey
component for creating Id column.
settings:
Preview:
Then unselect the tmp column using select
component as below.
Final data preview you will get.
At the end sink this data to sql table named VehicleCustomerTable.