
Going back to one document after `.[]` in yq


Case 1

With input

- groupx:
- groupy:

The yq expression `length` evaluates to 2.

Case 2

With input

groupx:
groupy:

The yq expression `length` also evaluates to 2.

Case 3 - using splat (.[]) to transform case1 to case2

First we transform the input of case 1 to case 2 by doing:

$ yq '.[]' case1.yaml

groupx:
groupy:

We note that this is the same output as case2.yaml

If I extend this expression with the expression from case 2, I do not get the same output. Instead, I get the following:

$ yq '.[] | length' case1.yaml
1
1

If I add a second yq process, I do get the same result as in case2:

$ yq '.[]' case1.yaml | yq 'length'
2

Question

Why does yq's splat operator effectively turn the result into a YAML multi-document? This does not make sense to me, because the same output, fed back into yq as input, is not treated as a multi-document.

If this is intended behaviour, can you return to a single document after having used .[] without starting a new yq process?


Solution

  • Case 1 is a single document with a seq (array) that contains two maps (objects), each having exactly one key (field).

    Case 2 is a single document with a map (object) that contains two keys (fields).

    Applying .[] to either case destructures the document's top-level node (the seq in the first case, the map in the second) and, in both cases, results in two documents: two single-key maps in the first case, two null values in the second.

    Thus, applying length to that result in the same filter always yields two numbers (both 1 in the first case as you have two maps containing one key each, and both 0 in the second as you have two scalar values that do not contain anything).

    However, applying length to that result in a second yq process re-evaluates the printed output as fresh input. In the first case, the two single-item maps were printed without a document separator, so they collapse into one map with two items, hence the overall result of 2. In the second case, the scalar values cannot be collapsed further, so the result remains the same (two numbers, both 0).

    I cannot speak for the developer as to whether this is intended. But I can assert that this deviation only applies to mikefarah/yq, not to kislyuk/yq, which keeps treating the input as two documents because it actively prints a document separator. (mikefarah/yq does have a --no-doc (or -N) option to "not print document separators", but this seems to be applied by default; tested with versions v4.34.2 and v4.40.5.)

    "If this is intended behaviour, can you return to a single document after having used .[] without starting a new yq process?"

    Sure. You can collect the items after destructuring into a single item, but its implementation depends on what type it should have.

    If you want it to be a seq of the resulting documents, wrap your filter in brackets to collect the items into a seq. So .[] | … | length would become [.[] | …] | length and should yield 2. In fact, you can shorten [.[] | …] to map(…).

    If you want it to be a map (just like the collapse you get with two invocations), you could use ireduce to iteratively add all items to an initially empty map: (.[] | …) as $i ireduce ({}; . + $i) | length. (As a side note: with kislyuk/yq, you have the add filter, which can collapse a seq of maps into a single map.)
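
The two approaches above can be tried side by side with a short script. This is only a sketch: it assumes mikefarah/yq v4 is installed as `yq`, it recreates case1.yaml from the question so that it is self-contained, and the variable names (seq_len, map_len, merged_len) are made up for illustration. The `…` placeholder from the filters above is left out, so the items are collected unchanged.

```shell
# Assumption: `yq` on the PATH is mikefarah/yq v4 (ireduce does not exist in kislyuk/yq).
# Recreate the case 1 input from the question.
printf -- '- groupx:\n- groupy:\n' > case1.yaml

# Collect the splatted items back into one seq, then take its length.
seq_len=$(yq '[.[]] | length' case1.yaml)

# The same collection written with the map shortcut.
map_len=$(yq 'map(.) | length' case1.yaml)

# Fold the items into a single map with ireduce, then count its keys.
merged_len=$(yq '.[] as $i ireduce ({}; . + $i) | length' case1.yaml)

echo "seq: $seq_len, map(): $map_len, ireduce: $merged_len"
```

All three filters run in a single yq process, and each length is computed on one collected value instead of once per splatted item, which is what the two-process pipeline in the question achieved by re-parsing the output.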