I am new in Talend and ask for help to more experienced.
I am trying to download data from a webservice RESTFul to a MongoDB database, but the original JSON format received doesn't allow the direct insertion in the tables as desired, I need to convert this JSON before to enter into a collection, so that through the MongoDB I can do searches by CODE and NAME keys.
I accessed some posts on this subject (JSON parser for Talend, tRest and map JSON with tExtractJSONField with TALEND, How to Parse Json in Talend using tExtractJsonField), but I could not make it work. I tried to use the tFileInputJSON, tMap, tExtractJSONFields, but without success.
One of the JSON that I need, I get has the following format:
[
{"CODIGO":"907","NOME_COMPLETO":"Ag\u00eancia do IBGE em Ceres"},
{"CODIGO":"905","NOME_COMPLETO":"Ag\u00eancia do IBGE em Goi\u00e1s"},
{"CODIGO":"910","NOME_COMPLETO":"Ag\u00eancia do IBGE em Ipor\u00e1"},
{"CODIGO":"919","NOME_COMPLETO":"Ag\u00eancia do IBGE em Itumbiara"},
{"CODIGO":"911","NOME_COMPLETO":"Ag\u00eancia do IBGE em Jaragu\u00e1"},
]
I do not know if because of the "[" and " ]" present at the beginning and end of the file, the Talend components do not allow you to view the file schema. For example, getting this JSON by tRest, it transforms this format:
{"data":[
{"Body":"[
{\"CODIGO\":\"907\",\"NOME_COMPLETO\":\"Ag\\u00eancia do IBGE em Ceres\"},
{\"CODIGO\":\"905\",\"NOME_COMPLETO\":\"Ag\\u00eancia do IBGE em Goi\\u00e1s\"},
{\"CODIGO\":\"910\",\"NOME_COMPLETO\":\"Ag\\u00eancia do IBGE em Ipor\\u00e1\"},
{\"CODIGO\":\"919\",\"NOME_COMPLETO\":\"Ag\\u00eancia do IBGE em Itumbiara\"},
{\"CODIGO\":\"911\",\"NOME_COMPLETO\":\"Ag\\u00eancia do IBGE em Jaragu\\u00e1\"}
]",
"ERROR_CODE":null
}
]
}
Thus, the tExtractJSONFields only recognizes the "Body" tag, and can't have access to the "CODIGO" tag, not to mention the double quotes added by component.
When I insert this content in MongoDB (using tMongoDBOutput), it created a unique record, and does not allow me to search by CODIGO, for example
What I have to do to have access to tags CODIGO and NOME_COMPLETO?
First, let me recommend http://jsonpath.com, which I always use to check the correctness of my JSON path query.
Add another tExtractJSONFields after the one you use to extract the Body
tag. Then you can extract the fields CODIGO
and NOME_COMPLETO
. Alternatively, the JSON path loop query "$.[*]"
(which I tried on said web page) in a tFileInputJSON might also be able to do what you want: