I have this JSON input:
{
"entityType": "person",
"id": 24285,
"properties": {
"firstName": "Zia",
"lastName": "Rus",
"email": "ziarus@zia.com",
"phoneNumber": null,
"requestingUserGuid": "Cugxxasod Daspohs, Uyowiye",
"generalReportReference": 244,
"employment": [
{
"employerName": "Avature",
"startDate": "01-Jan-2003",
"endDate": "04-Nov-2008",
"jobTitle": ""
},
{
"employerName": "Avature",
"startDate": "06-Jul-2012",
"endDate": "",
"jobTitle": ""
}
],
"supervisor": "Supervisor",
"education": [
{
"institutionName": "Universitate",
"major": "gead",
"degree": "gased",
"startDate": "01-Feb-2022",
"endDate": "01-Oct-2024"
},
{
"institutionName": "College",
"major": "brup",
"degree": "brup",
"startDate": "05-Apr-2016",
"endDate": ""
}
],
"externalIdentifier": 24285,
"clientGuid": "6253d1c2-b02f-4513-8348-89db9b8ba449",
"productGuid": "e2f507b4-025c-439b-9ddb-833f9e537e60",
"applicantGuid": ""
}
}
So far I've managed to get the whole data transform as needed, but I'm missing the way to format the data fields to "YYYY-MM-DD" format. I've seen ways to format the data fields, but most of them are by defaulting values or overwriting data. The problem with those solutions, is that as you can see in the input JSON, I can have either one or multiple objects within the "employment" and "education" arrays, and I can either have an "EndDate" or not. Ergo, using the "defaulting data" solutions doesn't work for me.
This is the JOLT I'm currently using to transform the data:
[
{
"operation": "shift",
"spec": {
"properties": {
"firstName": "firstName",
"lastName": "lastName",
"email": "email",
"phoneNumber": "phoneNumber",
"requestingUserGuid": "requestingUserGuid",
"generalReportReference": "generalReportReference",
"clientGuid": "clientGuid",
"productId": "productId",
"applicantGuid": "applicantGuid",
"externalIdentifier": "externalIdentifier",
"employment": {
"*": {
"employerName|endDate": {
"": null,
"*": {
"@1": "&4[&3].&2"
}
},
"*": {
"*": {
"@1": "&4[&3].&2"
}
}
}
},
"education": {
"*": {
"institutionName|registrarPhone|endDate": {
"": null,
"*": {
"@1": "&4[&3].&2"
}
},
"*": {
"*": {
"@1": "&4[&3].&2"
}
}
}
}
},
"education": {
"*": {
"~institutionName": "N/A",
"~registrarPhone": "(111) 111-1111"
}
},
"employment": {
"*": {
"~employerName": "N/A"
}
}
}
},
{
"operation": "modify-default-beta",
"spec": {
"employment": {
"*": {
"employerName": "N/A",
"firstNameUsed": "@(4,firstName)",
"lastNameUsed": "@(4,lastName)"
}
},
"education": {
"*": {
"institutionName": "N/A",
"registrarPhone": "(111) 111-1111"
}
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"employment": {
"*": {
"supervisor": "Supervisor"
}
}
}
}
]
I can't see the way to be able to transform the "date" fields to the needed format. Does anybody can help me with this one? Is this doable with JOLT or should I make a different approach and try using another processor in NiFi?
Thanks in advance!
You can handle all date arrangements within a JoltTransformJSON
processor with the following transformation specs in which a split function of modify spec converts all date values to arrays to be used within the shift, in which the month abbreviations are matched with their numeric representations ,which follows it, and then combine the components of the date arrays by join function of modify spec again such as
[
{
"operation": "modify-overwrite-beta",
"spec": {
"properties": {
"*": {
"*": {
"*Date": "=split('-',@(1,&))"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&", //the elements other than "properties"
"properties": {
"*": "&1.&", //the elements other than "employment" and "education"
"emp*|edu*": {
"*": {
"*": "&3.&2[&1].&", //the elements other than "...Dates", reduce the levels by 1 w.r.t. the line 2 levels down
"*Date": {
"2": "&4.&3[&2].&1", //[&2] represents the indexes of the "employment" or "education" in order to provide array manner back
"1": { //Months part
"Jan": { "#01": "&6.&5[&4].&3" }, //leaf nodes are 2 levels deeper than the other Date arrays' nodes
"Feb": { "#02": "&6.&5[&4].&3" },
"Mar": { "#03": "&6.&5[&4].&3" },
"Apr": { "#04": "&6.&5[&4].&3" },
"May": { "#05": "&6.&5[&4].&3" },
"Jun": { "#06": "&6.&5[&4].&3" },
"Jul": { "#07": "&6.&5[&4].&3" },
"Aug": { "#08": "&6.&5[&4].&3" },
"Sep": { "#09": "&6.&5[&4].&3" },
"Oct": { "#10": "&6.&5[&4].&3" },
"Nov": { "#11": "&6.&5[&4].&3" },
"Dec": { "#12": "&6.&5[&4].&3" }
},
"0": "&4.&3[&2].&1"
}
}
}
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"properties": {
"*": {
"*": {
"*Date": "=join('-',@(1,&))"
}
}
}
}
}
]
the demo on the site https://jolt-demo.appspot.com/ is :