I have a CSV file with the following data:
Id,Name,Type,date
1,name1,employee,25/04/2017
2,name2,contrator,26/04/2017
3,name3,employee,25/04/2017
4,name4,contrator,26/04/2017
5,name5,employee,24/04/2017
6,name6,contrator,24/04/2017
7,name7,employee,25/04/2017
8,name8,contrator,24/04/2017
9,name9,employee,24/04/2017
10,name10,contrator,26/04/2017
6,name6,employee,27/04/2017
11,name11,employee,27/04/2017
12,name12,contrator,27/04/2017
If two rows have the same Id, only the row with the latest date should be kept; the row with the older date should be removed. For example, the input above has two rows with Id 6, so the row dated 24/04/2017 should be removed. The output should look like this:
Id,Name,Type,date
1,name1,employee,25/04/2017
2,name2,contrator,26/04/2017
3,name3,employee,25/04/2017
4,name4,contrator,26/04/2017
5,name5,employee,24/04/2017
6,name6,employee,27/04/2017
7,name7,employee,25/04/2017
8,name8,contrator,24/04/2017
9,name9,employee,24/04/2017
10,name10,contrator,26/04/2017
11,name11,employee,27/04/2017
12,name12,contrator,27/04/2017
I need to achieve this using DataWeave. Please suggest a solution.
Here is the DataWeave script you are looking for:
%dw 1.0
%output application/csv
%var toDate = (str) -> str as :date { format: "dd/MM/yyyy" }
%var maxDate = (a, b) -> a when toDate(a.date) > toDate(b.date) otherwise b
---
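// Group the rows by Id, take each group's rows with pluck,
// then reduce every group to the single row with the latest date.
payload groupBy $.Id
  pluck $ map ($ reduce ((val, acc) -> maxDate(val, acc)))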
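For anyone on Mule 4, the same idea might look roughly like this in DataWeave 2.0. This is an untested sketch, not a drop-in replacement: it assumes the CSV is read with its header row so that each row exposes Id and date, and the toDate helper name is simply carried over from the script above. Instead of a hand-written reduce, it uses maxBy to pick the row with the latest date in each group.

```dataweave
%dw 2.0
output application/csv
// Parse the dd/MM/yyyy text from the CSV into a Date so rows compare chronologically
fun toDate(str: String): Date = str as Date { format: "dd/MM/yyyy" }
---
// Group rows by Id, then keep only the row with the latest date from each group
(payload groupBy ((row) -> row.Id))
    pluck ((rows) -> rows maxBy ((row) -> toDate(row.date)))
```

Comparing parsed dates rather than the raw strings matters here, because dd/MM/yyyy text does not sort chronologically once the rows span more than one month.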