I'm reading data from Salesforce and performing an incremental upsert using PySpark SQL and ADF pipelines. I want to validate the data between source and destination while the upsert is happening. How can I achieve this?
To validate the row count, you can fetch the rowsRead and rowsCopied attributes from the JSON output if you are using a Copy activity: Get count of records in source and sink in Azure Data Factory
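As a minimal sketch of that check, the snippet below parses a Copy activity's output JSON and compares the two counts. The JSON sample itself is made up; only the rowsRead and rowsCopied field names follow the Copy activity's documented output, and in a real pipeline you would read them with expressions like `@activity('Copy data').output.rowsRead` rather than a file.

```python
import json

# Hypothetical sample of a Copy activity's output JSON; the values are
# invented, but rowsRead/rowsCopied mirror the documented field names.
copy_output = json.loads("""
{
    "rowsRead": 120,
    "rowsCopied": 120,
    "copyDuration": 15
}
""")

rows_read = copy_output["rowsRead"]
rows_copied = copy_output["rowsCopied"]

# Fail the validation step if the source and sink counts disagree.
if rows_read != rows_copied:
    raise ValueError(f"Row count mismatch: read {rows_read}, copied {rows_copied}")
print(f"Validated: {rows_copied} of {rows_read} rows landed in the sink")
```

In ADF itself, the same comparison is usually done with an If Condition activity on the two output properties rather than custom code.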
To check the number of records inserted, you can stage the incremental records from the source and write a stored procedure that counts the records whose primary key is not already present in the sink table. This gives the count of records that will be newly inserted.
Similarly, to check the number of records updated, write a stored procedure that counts the staged records whose primary key is already present in the sink table. This gives the count of records that will be updated.
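The counting logic behind both stored procedures can be sketched with the same EXISTS / NOT EXISTS pattern. The example below uses an in-memory SQLite database as a stand-in for the real sink; the table and column names (stg_incremental, sink, id) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Sink table already holds primary keys 1 and 2.
cur.execute("CREATE TABLE sink (id INTEGER PRIMARY KEY, val TEXT)")
cur.executemany("INSERT INTO sink VALUES (?, ?)", [(1, "a"), (2, "b")])

# Staged incremental batch from the source holds keys 2, 3 and 4.
cur.execute("CREATE TABLE stg_incremental (id INTEGER PRIMARY KEY, val TEXT)")
cur.executemany("INSERT INTO stg_incremental VALUES (?, ?)",
                [(2, "b2"), (3, "c"), (4, "d")])

# Keys NOT already in the sink -> records that will be newly inserted.
inserts = cur.execute(
    "SELECT COUNT(*) FROM stg_incremental s "
    "WHERE NOT EXISTS (SELECT 1 FROM sink k WHERE k.id = s.id)"
).fetchone()[0]

# Keys already in the sink -> records that will be updated.
updates = cur.execute(
    "SELECT COUNT(*) FROM stg_incremental s "
    "WHERE EXISTS (SELECT 1 FROM sink k WHERE k.id = s.id)"
).fetchone()[0]

print(inserts, updates)  # → 2 1
```

Wrapping each of those two SELECT COUNT(*) queries in a stored procedure on the sink database gives you the expected insert and update counts, which you can compare against the actual row counts after the upsert runs.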