Understanding the interaction between mark_line point overlay and legend

I have found some unintuitive behavior in the interaction between the point property of mark_line and the appearance of the color legend for Altair/Vega-Lite. I ran into this when attempting to create a line with very large and mostly-transparent points in order to increase the area that would trigger the line's tooltip, but was unable to preserve a visible type=gradient legend.

The following code is an MRE for this problem, showing 6 cases: the use of [False, True, and a custom OverlayMarkDef] for the point property and the use of plain and customized color encoding.

import pandas as pd
import altair as alt

# create data
df = pd.DataFrame()
df['x_data'] = [0, 1, 2] * 3
df['y2'] = [0] * 3 + [1] * 3 + [2] * 3

# initialize
base = alt.Chart(df)
markdef = alt.OverlayMarkDef(size=1000, opacity=.001)
color_encode = alt.Color(shorthand='y2', legend=alt.Legend(title='custom legend', type='gradient'))
marks = [False, True, markdef]
encodes = ['y2', color_encode]

# build one chart per (point setting, color encoding) combination
mark_labels = ['False', 'True', 'markdef']
encode_labels = ['plain encoding', 'custom encoding']

plots = []
for m, m_label in zip(marks, mark_labels):
    for c, c_label in zip(encodes, encode_labels):
        plot = (
            base.mark_line(point=m)
            .encode(x='x_data', y='y2', color=c, tooltip=['x_data', 'y2'])
            .properties(title=f'{m_label}, {c_label}')
        )
        plots.append(plot)
combined = alt.vconcat(
    alt.hconcat(*plots[:2]).resolve_scale(color='independent'),
    alt.hconcat(*plots[2:4]).resolve_scale(color='independent'),
    alt.hconcat(*plots[4:]).resolve_scale(color='independent')
).resolve_scale(color='independent')

The resulting plot (the interactive tooltips work as expected):

The color data is the same for each of these plots, and yet the color legend is all over the place. In my real case, the gradient is preferred (the data is quantitative and continuous).

With no point on the mark_line, the legend is correct.
Adding point=True converts the legend to a symbol type - I'm not sure why this is the case since the default legend type is gradient for quantitative data (as seen in the first row) and this is the same data - but can be forced back to gradient by the custom encoding.
Attempting to make a custom point via OverlayMarkDef however renders the forced gradient colorbar invisible - matching the opacity of the OverlayMarkDef. But it is not simply a matter of the legend always inheriting the properties of the point, because the symbol legend does not attempt to reflect the opacity.

I would like to have the normal gradient colorbar available for the custom OverlayMarkDef, but I would also love to build up some intuition for what is going on here.

Solution

The transparency issue in the bottom-right plot has been fixed since Altair 4.2.0, so every case that includes a point on the line now switches the legend to 'Ordinal' instead of 'Quantitative'.

I believe the reason the legend is converted to a symbol legend instead of a gradient is that you are adding filled points, and the fill channel is not set to a quantitative field, so it defaults to either ordinal or nominal with a sort:

plot = base.mark_line().encode(
    x='x_data',
    y='y2',
    color='y2',
)
plot + plot.mark_circle(opacity=1)

mark_point gives a gradient legend since it has no fill, and if we set the fill for mark_circle explicitly we also get gradient legends (one for fill and one for color):

plot = base.mark_line().encode(
    x='x_data',
    y='y2',
    color='y2',
    fill='y2'
)
plot + plot.mark_circle(opacity=1)

I agree with you that this is a bit unexpected, and it would be more convenient if the encoding type used for point=True were the same as the one used for the lines. You might suggest this as an enhancement in Vega-Lite, together with reporting the apparent bug that you can't override the legend type via type='gradient'.