I have json array response from InvokeHTTP. I am using the below Flow to convert some json info to csv. One of the json info is id which is used to get image and then convert it to base64. I need to add this base64 code to my csv. I don't understand how to save it in an attribute so that it can be put in AttributeToCsv.
Also, I was reading here https://community.cloudera.com/t5/Support-Questions/Nifi-attribute-containing-large-text-value/td-p/190513 that it is not recommended to store large values in attributes due to memory concern. What would be an optimal approach in this scenario.
Json response during first call:
[ {
"fileNumber" : "1",
"uuid" : "abc",
"attachedFiles" : [ {
"id" : "bkjdbkjdsf",
"name" : "image1.png",
}, {
"id" : "xzcv",
"name" : "image2.png",
} ],
"date":null
},
{ "fileNumber" : "2",
"uuid" : "def",
"attachedFiles" : [],
"date":null
}]
Final Csv (after merge or expected output):
Id,File Name, File Data(base64 code)
bkjdbkjdsf,image1.png, iVBORw0KGgo...ji
xzcv,image1.png,ZEStWRGau..74
My approach (will change as per suggestions):
After splitting Json response, I use EvaluateJsonPath to get "attachedFiles".
I find length of array "attachedFiles" and then decide if need to split further if 2 or more files are there. If 0 then do nothing. In second EvaluateJsonPath I add properties Id,File Name and set the values from json using $.id
etc.. I use the Id to invoke other URL which I encode to Base64.
Current output - csv file which needs to be updated with third column File Data(base64 code) and it's value:
Id,File Name
bkjdbkjdsf,image1.png
xzcv,image1.png
as a variant use ExecuteGroovyScript:
def ff=session.get()
if(!ff)return
ff.write{sin, sout->
sout.withWriter('UTF-8'){w->
//write attribute values for names 'Id' and 'filename' delimited with coma
w << ff.attributes.with{a->[a.'Id', a.'filaname']}.join(',')
w << ',' //wtite coma
//sin.withReader('UTF-8'){r-> w << r} //write current content of the file after last coma
w << sin.bytes.encodeBase64()
w << '\n'
}
}
REL_SUCCESS << ff
UPD: i put sin.bytes.encodeBase64()
instead of copying flowfile content. this one creates one-line base64 string for input file. if you are using this option - you should remove Base64EncodeContent to prevent double base64 encoding.