When to nest the mark property in a layer versus at the top level of a Vega-Lite spec?

I am wondering how Vega-Lite ties marks to their associated encodings.

In the example below, both the encoding and the mark are at the "top level" of the spec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "A simple bar chart with embedded data.",
  "data": {
    "values": [
      {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
      {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
      {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal", "axis": {"labelAngle": 0}},
    "y": {"field": "b", "type": "quantitative"}
  }
}

In the simplest layer example, both the bar mark and the text mark are nested in the layer property:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Bar chart with text labels. Apply scale padding to make the frame cover the labels.",
  "data": {
    "values": [
      {"a": "A", "b": 28},
      {"a": "B", "b": 55},
      {"a": "C", "b": 43}
    ]
  },
  "encoding": {
    "y": {"field": "a", "type": "nominal"},
    "x": {"field": "b", "type": "quantitative", "scale": {"padding": 10}}
  },
  "layer": [{
    "mark": "bar"
  }, {
    "mark": {
      "type": "text",
      "align": "left",
      "baseline": "middle",
      "dx": 3
    },
    "encoding": {
      "text": {"field": "b", "type": "quantitative"}
    }
  }]
}

In this case, am I correct to assume that any mark in the layer property automatically inherits the encodings declared at the top level?
Further, I notice that I cannot move the bar mark out of the layer property: the Vega Editor reports that mark is not an allowed property there, and the bars fail to render when it is placed at the top level.
Finally, in a more complicated example (see: https://vega.github.io/vega-lite/examples/layer_line_mean_point_raw.html), the encodings are repeated in each layer, even though the x encoding is redundant. So when is it appropriate to place an encoding at the top level versus in the layer?

The Vega-Lite docs go into a fair bit of detail about the configuration of these properties, but I have not been able to find a conceptual answer to these 3 questions.

Thank you

Solution

Vega-Lite provides a hierarchical chart model, where each level in the hierarchy can override various properties declared in the parent level. In terms of layer specifications, the relevant concepts are these:

A UnitSpec is what you think of as a single chart: in it, you can specify data, mark, encodings, transforms, and other properties.
A LayerSpec is a container that can hold a number of UnitSpec or LayerSpec specifications in the layer property. Additionally, you can specify data, encodings, transforms, and other properties (but not mark).

A UnitSpec that is within a LayerSpec or other top-level object will inherit any properties specified there (such as data, encodings, transforms, etc.), and is also able to override them by specifying its own data, encodings, or transforms.
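
As a minimal sketch of overriding (the color values are made up for illustration and are not part of the question's spec), the top level below declares x, y, and a constant color. The bar layer inherits all three channels. The text layer adds its own text channel and redeclares color, and my understanding is that the layer's own definition wins for that channel while x and y are still inherited:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"a": "A", "b": 28},
      {"a": "B", "b": 55},
      {"a": "C", "b": 43}
    ]
  },
  "encoding": {
    "y": {"field": "a", "type": "nominal"},
    "x": {"field": "b", "type": "quantitative", "scale": {"padding": 10}},
    "color": {"value": "steelblue"}
  },
  "layer": [{
    "mark": "bar"
  }, {
    "mark": {"type": "text", "align": "left", "baseline": "middle", "dx": 3},
    "encoding": {
      "text": {"field": "b", "type": "quantitative"},
      "color": {"value": "black"}
    }
  }]
}
```

This also suggests a rule of thumb for the third question: put an encoding at the top level when every layer should use it unchanged, and put it in the layer when only that layer needs it or needs a different definition.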

Similar hierarchical concepts apply to other compound chart types, such as ConcatSpec, VConcatSpec, HConcatSpec, FacetSpec, etc.
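
For instance, here is a small sketch of the same idea with a horizontal concatenation (the choice of a bar chart next to a point chart is only for illustration). The data is declared once at the top level and used by both sub-charts, while each sub-chart declares its own mark and encoding:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {
    "values": [
      {"a": "A", "b": 28},
      {"a": "B", "b": 55},
      {"a": "C", "b": 43}
    ]
  },
  "hconcat": [{
    "mark": "bar",
    "encoding": {
      "x": {"field": "a", "type": "nominal"},
      "y": {"field": "b", "type": "quantitative"}
    }
  }, {
    "mark": "point",
    "encoding": {
      "x": {"field": "a", "type": "nominal"},
      "y": {"field": "b", "type": "quantitative"}
    }
  }]
}
```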

More concretely, in your example, the data and some encodings are defined at the top level of the LayerSpec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Bar chart with text labels. Apply scale padding to make the frame cover the labels.",
  "data": {
    "values": [
      {"a": "A", "b": 28},
      {"a": "B", "b": 55},
      {"a": "C", "b": 43}
    ]
  },
  "encoding": {
    "y": {"field": "a", "type": "nominal"},
    "x": {"field": "b", "type": "quantitative", "scale": {"padding": 10}}
  },
  "layer": [{
    "mark": "bar"
  }, {
    "mark": {
      "type": "text",
      "align": "left",
      "baseline": "middle",
      "dx": 3
    },
    "encoding": {
      "text": {"field": "b", "type": "quantitative"}
    }
  }]
}

In terms of the inheritance from parent, this is functionally equivalent to the following, where I have moved data and encodings from the top-level into each contained UnitSpec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Bar chart with text labels. Apply scale padding to make the frame cover the labels.",
  "layer": [{
    "data": {
      "values": [
        {"a": "A", "b": 28},
        {"a": "B", "b": 55},
        {"a": "C", "b": 43}
      ]
    },
    "mark": "bar",
    "encoding": {
      "y": {"field": "a", "type": "nominal"},
      "x": {"field": "b", "type": "quantitative", "scale": {"padding": 10}}
    }
  }, {
    "data": {
      "values": [
        {"a": "A", "b": 28},
        {"a": "B", "b": 55},
        {"a": "C", "b": 43}
      ]
    },
    "mark": {
      "type": "text",
      "align": "left",
      "baseline": "middle",
      "dx": 3
    },
    "encoding": {
      "y": {"field": "a", "type": "nominal"},
      "x": {"field": "b", "type": "quantitative", "scale": {"padding": 10}},
      "text": {"field": "b", "type": "quantitative"}
    }
  }]
}

Specifying shared properties at the top level is a way to make chart specifications more concise and understandable.