I hope someone will be able to help me. I am trying to learn Apache NiFi by working on a project in which I have JSON files in the following format:
{
    "network": "reddit",
    "posted": "2021-12-24 10:46:51 +00000",
    "postid": "rnjv0z",
    "title": "A gil commission artwork of my friends who are in-game couples!",
    "text": "A gil commission artwork of my friends who are in-game couples! ",
    "lang": "en",
    "type": "status",
    "sentiment": "neutral",
    "image": "https://a.thumbs.redditmedia.com/ShKq9bu4_ZIo4k5QIBYotstmyGidRgn8046RcqPo_p0.jpg",
    "url": "http://www.reddit.com/r/ffxiv/comments/rnjv0z/a_gil_commission_artwork_of_my_friends_who_are/",
    "user": {
        "userid": "Suhteeven",
        "name": "Suhteeven",
        "url": "http://www.reddit.com/user/Suhteeven"
    },
    "popularity": [
        {
            "name": "ups",
            "count": 1
        },
        {
            "name": "comments",
            "count": 0
        }
    ]
}
I want to remove all non-alphanumeric characters from the "text" field. I want only this one field to be modified, while the rest of the file remains the same.
I tried using the EvaluateJsonPath processor, where I added a text attribute. Then I created a ReplaceText processor.
This configuration cleaned the special characters from the text, but as a result the output contains only the value of the text field. I don't want to lose the other information; my goal is to have all fields in the output, with only the value of the text field modified.
I also tried the UpdateAttribute processor, but it didn't do anything to my JSON (the output was the same as the input).
Can you please tell me which processors I should use and with what configuration? I have tried many different things but I am stuck.
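This is possible with the ScriptedTransformRecord processor, configured as follows:
Record Reader: JsonTreeReader
Record Writer: JsonRecordSetWriter
Script Language (default): Groovy
Script Body:
record.setValue("text", attributes['text'])
record
Data flow: EvaluateJsonPath (extract the text field into a flowfile attribute) -> UpdateAttribute (modify the text attribute) -> ScriptedTransformRecord (write the text attribute back into the record)
UpdateAttribute changes only flowfile attributes, never the flowfile content, which is why the JSON came out unchanged when you used it on its own. The script is the step that copies the cleaned attribute value back into the JSON record.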
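If you would rather not pass the value through a flowfile attribute, the cleaning can probably be done inside the script itself, so a single record-transform processor is enough. The following is only a sketch of an alternative Script Body, not a tested flow. It assumes that the text field is read as a string by JsonTreeReader and that you want to keep ASCII letters, digits and spaces; the regular expression is my assumption, so adjust it to your definition of "alphanumeric".

```groovy
// Alternative Script Body for the record-transform processor (Script Language: Groovy).
// 'record' is bound by the processor to the record currently being processed.
def original = record.getValue('text')

if (original != null) {
    // Assumption: keep ASCII letters, digits and spaces only.
    // Remove the space from the character class to strip spaces as well.
    def cleaned = original.toString().replaceAll('[^A-Za-z0-9 ]', '')
    record.setValue('text', cleaned)
}

// Return the record so that it is handed to the Record Writer;
// all other fields are left untouched.
record
```

With the sample file above, this would turn "in-game couples!" into "ingame couples" in the text field only, while title, user, popularity and the rest are written out as they were read.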