yq (https://mikefarah.gitbook.io/yq/) can convert XML files to YAML or JSON.
I have a file, data.xml, which looks like:
<root>
  <heading>
    <member>one</member>
  </heading>
  <heading>
    <member>two</member>
    <member>three</member>
  </heading>
</root>
Running yq --input-format xml . data.xml
outputs:
root:
  heading:
    - member: one
    - member:
        - two
        - three
member is output with two different types: a string in the first heading and an array in the second. After some digging I found:
"yq assumes consecutive nodes with the same name are assumed to be arrays. If there's only one node with a name, yq assumes its a map"
https://github.com/mikefarah/yq/issues/1583
So we can modify the query to be yq --input-format xml '.root.heading.[].member |= [] + .' data.xml
which outputs:
root:
  heading:
    - member:
        - one
    - member:
        - two
        - three
However, this requires passing the exact path and key name for every override, and it assumes you know the hierarchy of nodes in advance.
I am handling files hand-written by users, and the nesting is different in every file. I need a more dynamic yq expression that can match multiple map keys at any level.
yq has a Recursive Descent (Glob) operator https://mikefarah.gitbook.io/yq/operators/recursive-descent-glob
So far I have a query:
yq --input-format xml '(... | select(key == "member")) |= [] + .' data.xml
which outputs:
root:
  heading:
    - ? - member
      : - one
    - ? - member
      : - two
        - three
However it does not output the result I would expect. Where am I going wrong?
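In your example, you only want to glob the value nodes (and then test whether their key matches certain criteria); you don't want to glob the key nodes themselves. The ... operator also visits key nodes, which is why the keys in your output were wrapped in arrays as well. Thus, use .. instead of ...: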
yq --input-format xml '(.. | select(key == "member")) |= [] + .' data.xml
root:
  heading:
    - member:
        - one
    - member:
        - two
        - three
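Since the question also asks about matching multiple map keys, here is a minimal sketch of how the same expression might be extended with yq's or operator. The file name sample.xml and the <item> element are assumptions made up for illustration; they are not part of the original data.xml.

```shell
# Hypothetical sample: same shape as data.xml, plus an assumed <item> element.
cat > sample.xml <<'EOF'
<root>
  <heading>
    <member>one</member>
    <item>a</item>
  </heading>
  <heading>
    <member>two</member>
    <member>three</member>
  </heading>
</root>
EOF

# Prepend an empty array to the value of every "member" or "item" key,
# at any depth. `..` visits value nodes only, so the keys stay untouched.
yq --input-format xml \
  '(.. | select(key == "member" or key == "item")) |= [] + .' sample.xml
```

With this, both member and item should come out as arrays even when an element occurs only once, without spelling out the path to either key.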