Assuming the following data architecture: Source Systems -> Data Warehouse (using the data vault model) -> Data Virtualization -> Consumption Layer (e.g., BI Tools & reporting)
I read that one of the key principles of data vault is to load raw data and keep records from all sources, so no de-duplication or transformations, for traceability/auditing purposes. If this is true, where would the transformations happen?
Yes, it is true: the "raw" data vault keeps records exactly as they were in the source system at the time they were loaded.
But there's another concept, the "business" data vault. This is where all the logic and transformation happens. The business data vault is not a full copy of the raw data vault; instead, you create only the hubs, links, satellites, PIT (point-in-time) tables, and bridge tables you need to implement your logic.
This separation helps you in the long run. If, for example, you need to change a business rule next year, you still have the original data for a particular source system at a particular time in the past. If your logic has a bug, you can fix it and rebuild from the original data.
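To make the split concrete, here is a minimal sketch in plain Python (no database). All names in it (`RawSatRow`, `RAW_SAT_CUSTOMER`, `rule_v1`, `rule_v2`, `build_business_sat`) and the sample rows are hypothetical, and the "latest load wins" de-duplication is just one assumed business rule, not something data vault prescribes. The raw satellite keeps every record from every source untouched; the business satellite is derived from it and can be rebuilt whenever the rule changes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RawSatRow:
    """One row of a hypothetical raw customer satellite, stored as delivered."""
    customer_key: str   # business key of the parent hub
    record_source: str  # which source system delivered the row
    load_date: date     # when the row was loaded
    name: str           # untouched source value


# Raw vault: no cleaning and no de-duplication, both sources are kept.
RAW_SAT_CUSTOMER = (
    RawSatRow("C1", "crm", date(2023, 1, 5), "  alice SMITH"),
    RawSatRow("C1", "billing", date(2023, 2, 1), "Alice Smith "),
    RawSatRow("C2", "crm", date(2023, 1, 7), "BOB JONES"),
)


def rule_v1(name: str) -> str:
    """Assumed original business rule: trim and title-case the name."""
    return name.strip().title()


def rule_v2(name: str) -> str:
    """Assumed changed business rule: trim and upper-case the name."""
    return name.strip().upper()


def build_business_sat(raw_rows, rule):
    """Derive a business satellite: one cleaned name per customer key.

    The most recently loaded raw row wins; the raw rows are only read.
    """
    latest = {}
    for row in sorted(raw_rows, key=lambda r: r.load_date):
        latest[row.customer_key] = row
    return {key: rule(row.name) for key, row in latest.items()}


business_v1 = build_business_sat(RAW_SAT_CUSTOMER, rule_v1)
# The rule changes later: rebuild from the same untouched raw rows.
business_v2 = build_business_sat(RAW_SAT_CUSTOMER, rule_v2)
```

In a real warehouse the business satellite would typically be a table or view built with SQL on top of the raw vault tables, but the idea is the same: the derivation only reads the raw layer, so it can be rerun at any time.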