I'm new to Pentaho and I'm trying to parse JSON in a Kettle step.
I know how to parse the JSON fields whose structure is fixed, but some fields are arrays whose length I cannot know in advance. In the following example, look at the field "palavras_chave":
{
"identificacao": "Manejo Floinga. ",
"historico": "A técni.",
"descricao": "A.sasasa ",
"objetivos": "Existem. ",
"sustentabilidade": "Co.",
"vantagens_desvantagens": "VANTAGENS: resi",
"custos": "INVESTIMENTOS e CUSTOS: a",
"direitos": "Tecnologia livre. ",
"instituicao": "Tecnologia ",
"assistencia_manutencao_te": " ",
"experiencia": "Existem cerca de ",
"entraves_adocao": "ENTRAção. ",
"condicoes_requeridas": "Aio.",
"fornecedores": "Sódocumentlarizada.",
"usuarios": "Produtecolementar. ",
"programa": "Eme.",
"avaliacao_impacto": " reidos. ",
"transferencia_tecnologia": "públrsos.",
"outros": "Até 1000 cs",
"visualizacao_tecnologia": "Consu",
"palavras_chave": [
"Caaga",
"uso vel",
"padeireiros",
"manrestal"
],
"referencias": "Livro '.pdf",
"replicabilidade": "Atéa. ",
"fonte": "Meiro"
},
Another record, with a different number of keywords:
{
"identificacao": "Manejatinga. ",
"historico": "A técni.",
"descricao": "A.sasasa ",
"objetivos": "Existem. ",
"sustentabilidade": "Co.",
"vantagens_desvantagens": "VANNS: resi",
"custos": "INVESTUSTOS: a",
"direitos": "Tecnologia livre. ",
"instituicao": "Tecnologia ",
"assistencia_manutencao_te": " ",
"experiencia": "Existem cerca de ",
"entraves_adocao": "ENTRAção. ",
"condicoes_requeridas": "Aio.",
"fornecedores": "Sódocumeda.",
"usuarios": "Produtentar. ",
"programa": "Em áre.",
"avaliacao_impacto": " reduzidos. ",
"transferencia_tecnologia": "públicos diversos.",
"outros": "Até 1000 cs",
"visualizacao_tecnologia": "Cong",
"palavras_chave": [
"teste",
"aaaaaaa"
],
How can I parse input with variable length and work with it in Kettle? If I were programming in Python, I would simply iterate over the array with a loop inside a loop.
Is there any way to do this here, or is my approach conceptually wrong?
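To make the question concrete, this is roughly what I mean by "a loop inside a loop" in Python. It is only a sketch: the sample data is trimmed to two fields per record, and the helper name `iter_keywords` is made up for illustration.

```python
import json

# Hypothetical sample: two records shaped like the ones above,
# trimmed to the fields that matter for this illustration.
raw = """
[
  {"identificacao": "Manejo Floinga. ",
   "palavras_chave": ["Caaga", "uso vel", "padeireiros", "manrestal"]},
  {"identificacao": "Manejatinga. ",
   "palavras_chave": ["teste", "aaaaaaa"]}
]
"""


def iter_keywords(records):
    """Yield one (identificacao, keyword) pair per array element."""
    for record in records:  # outer loop: one JSON object at a time
        # inner loop: however many keywords this record happens to have
        for keyword in record.get("palavras_chave", []):
            yield record["identificacao"], keyword


pairs = list(iter_keywords(json.loads(raw)))
for identificacao, keyword in pairs:
    print(identificacao, "|", keyword)
```

The inner loop does not care whether a record has four keywords or two, which is the behaviour I would like to reproduce in Kettle.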
I found the answer. I was searching for a way to normalize my data (I didn't know that this was the term for it), and a member called marabu from the Pentaho forum helped. The solution is simple: since we want to normalize the data, tick the "Rownum in output" option and give the field a name.
After that, I can use the rownum field to keep a reference to the id of each JSON record, so that I can insert the right reference into the relational DBMS.
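For readers who think in code, the idea behind this normalization can be sketched in Python. This is an analogy, not what Kettle runs internally: the function `normalize`, the child column name `palavra_chave`, and the assumption that the row number starts at 1 and counts records are all mine.

```python
def normalize(records, array_field="palavras_chave", rownum_field="rownum"):
    """Split records into parent rows and one child row per array element.

    Each parent gets a row number (assumed to start at 1), and every child
    row carries the same number so it can be joined back to its parent.
    """
    parents, children = [], []
    for rownum, record in enumerate(records, start=1):
        # Parent row: every scalar field plus the generated row number.
        parent = {k: v for k, v in record.items() if k != array_field}
        parent[rownum_field] = rownum
        parents.append(parent)
        # Child rows: one per keyword, however many there are.
        for keyword in record.get(array_field, []):
            children.append({rownum_field: rownum, "palavra_chave": keyword})
    return parents, children


# Hypothetical records, trimmed to two fields each.
records = [
    {"identificacao": "Manejo Floinga. ",
     "palavras_chave": ["Caaga", "uso vel", "padeireiros", "manrestal"]},
    {"identificacao": "Manejatinga. ",
     "palavras_chave": ["teste", "aaaaaaa"]},
]

parents, children = normalize(records)
```

The `parents` list maps to the main table and `children` to a keyword table, with the row number acting as the foreign key between them.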