I have the following YAML document:
222:
  description:
    en: "124098-en"
    fr: "498438-fr"
  name:
    en: "293878-en"
    fr: "222493878-fr"
  mass: 0.1
  groupID: "24902"
223:
  description:
    en: "124098-en"
    fr: "498438-fr"
  name:
    en: "zz325-en"
    fr: "222493878-fr"
  mass: 0.1
  groupID: "234988"
[many other records]
I would like to construct a CSV that looks like:
222,"293878-en","24902"
223,"zz325-en","234988"
That is, each row is just the record's key, followed by .[].name.en and .[].groupID from the original document. No other fields from the original document are preserved in the CSV.
What's the right way to do this?
Addendum: I'm using the Go version of yq (4.7.1) but either the Go or the Python version is fine, or if that's not the right tool here, I'm happy to use something else.
The Python yq version is much more straightforward to use, because it literally uses jq under the hood to operate on the JSON converted from the YAML.
You can use jq's constructs and get the CSV result as:
yq -r 'keys_unsorted[] as $k | [ ($k|tonumber), (.[$k] | .name.en, .groupID) ] | @csv' yaml
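The @csv filter formats each element of the array according to its type: strings are wrapped in double quotes and numbers are written bare. That is why the key is passed through tonumber, so it appears unquoted. groupID is already a string in this document; if it were stored as a number and you wanted it quoted, you could use .groupID | tostring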
Go yq was quite different prior to v4, when it used its own DSL, but as of v4.8 it is working hard to implement jq's functions. It doesn't have a CSV function out of the box yet.
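Since the question is open to other tools, here is a minimal Python sketch of the same transformation. It is not part of either yq. It assumes the YAML has already been parsed into a dict (for example with PyYAML's yaml.safe_load, which is an assumption and is not used in the snippet itself). The records dict and the to_csv helper are illustrative names only. csv.QUOTE_NONNUMERIC is used to mimic the @csv behaviour described above: strings quoted, numbers bare.

```python
import csv
import io

# Stand-in for the parsed YAML document. With PyYAML installed (an assumption),
# this dict could instead come from yaml.safe_load() on the input file.
records = {
    222: {
        "description": {"en": "124098-en", "fr": "498438-fr"},
        "name": {"en": "293878-en", "fr": "222493878-fr"},
        "mass": 0.1,
        "groupID": "24902",
    },
    223: {
        "description": {"en": "124098-en", "fr": "498438-fr"},
        "name": {"en": "zz325-en", "fr": "222493878-fr"},
        "mass": 0.1,
        "groupID": "234988",
    },
}


def to_csv(records):
    """Return CSV text with one row per record: key, name.en, groupID."""
    buf = io.StringIO()
    # QUOTE_NONNUMERIC quotes strings and leaves numbers unquoted.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for key, rec in records.items():
        writer.writerow([key, rec["name"]["en"], rec["groupID"]])
    return buf.getvalue()


print(to_csv(records), end="")
```

Rows come out in the order the records appear in the dict, which matches the keys_unsorted behaviour of the jq filter.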