I am making a PCA biplot in Python with Bokeh. I want to add text to the figure so that it looks more or less like the image (ignoring the red lines). However, I cannot figure out how to do this in Bokeh.
If I try to make one data source like this:
data = {'PC1': x_scale*xs, 'PC2': y_scale*ys , 'cluster': [str(x) for x in comments_df.kmeans_10],
'C1': components[0,:], 'C2': components[1,:], 'words':wordlist}
source = ColumnDataSource(data)
Bokeh complains that the columns have different lengths (which I am aware of), and it will not create a plot:
BokehUserWarning: ColumnDataSource's columns must be of the same length. Current lengths: ('C1', 5775), ('C2', 5775), ('PC1', 16555), ('PC2', 16555), ('cluster', 16555), ('words', 5775)
My other thought was to create two data sources:
data = {'PC1': x_scale*xs, 'PC2': y_scale*ys , 'cluster': [str(x) for x in comments_df.kmeans_10]}
source = ColumnDataSource(data)
data2 = {'C1': components[0,:], 'C2': components[1,:], 'words':wordlist}
source2 = ColumnDataSource(data2)
However, in this case I get an error when plotting. My code:
plot = Plot(plot_width=600, plot_height=600, tools = [PanTool(), WheelZoomTool(), BoxZoomTool(), ResetTool()])
## PCA loadings
glyph = Circle(x="PC1", y="PC2", fill_color=cmap)
plot.add_glyph(source, glyph)
xaxis = LinearAxis()
plot.add_layout(xaxis, 'below')
plot.xaxis.axis_label = 'PC1'
yaxis = LinearAxis()
plot.add_layout(yaxis, 'left')
plot.yaxis.axis_label = 'PC2'
plot.add_layout(Grid(dimension=0, ticker=xaxis.ticker))
plot.add_layout(Grid(dimension=1, ticker=yaxis.ticker))
## PCA components
word_glyph = Text(x="C1", y="C2", text="words", text_color="red")
plot.add_glyph(source2, glyph)
# Show
show(plot)
The error:
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name. This could either be due to a misspelling or typo, or due to an expected column being missing. : key "fill_color" value "cluster", key "x" value "PC1" (closest match: "C1"), key "y" value "PC2" (closest match: "C2") [renderer: GlyphRenderer(id='2028', ...)]
I am probably overlooking something or searching with incorrect terms, so if anyone could tell me what to do that would be great. Thanks!
You are configuring the first glyph twice, the second time with the wrong data source:
plot.add_glyph(source2, glyph)
Presumably you intend:
plot.add_glyph(source2, word_glyph)
As an aside, you are using the very low-level bokeh.models API, which is very verbose. The higher-level bokeh.plotting API generally needs much less code for the equivalent output, and it also makes many errors like this impossible (since it coordinates the glyphs and data sources for you).
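For illustration, here is a minimal sketch of what the biplot might look like with bokeh.plotting. The function name build_biplot and the tiny data dictionaries are made up for this example and stand in for the real scores and loadings from the question. The colour mapping via factor_cmap is also an assumption, since the question does not show how cmap was built.

```python
from bokeh.models import ColumnDataSource
from bokeh.palettes import Category10
from bokeh.plotting import figure
from bokeh.transform import factor_cmap


def build_biplot(scores, loadings):
    """Build a biplot from two column dicts that may have different lengths.

    `scores` holds one row per observation, `loadings` one row per word.
    (Hypothetical helper, not part of Bokeh.)
    """
    # Each source only needs columns of equal length *within itself*.
    source = ColumnDataSource(scores)
    source2 = ColumnDataSource(loadings)

    # Assumed colour mapping: one categorical colour per cluster label.
    clusters = sorted(set(scores["cluster"]))
    cmap = factor_cmap("cluster", palette=Category10[10], factors=clusters)

    plot = figure(
        tools="pan,wheel_zoom,box_zoom,reset",
        x_axis_label="PC1",
        y_axis_label="PC2",
    )
    # Every glyph method takes its own source, so glyph and data stay paired.
    plot.scatter(x="PC1", y="PC2", color=cmap, source=source)
    plot.text(x="C1", y="C2", text="words", text_color="red", source=source2)
    return plot


# Made-up stand-in data; note the two dicts have different lengths.
scores = {
    "PC1": [0.1, -0.4, 0.8, 0.3],
    "PC2": [0.5, 0.2, -0.6, -0.1],
    "cluster": ["0", "1", "0", "2"],
}
loadings = {"C1": [0.5, -0.2], "C2": [0.1, 0.7], "words": ["good", "bad"]}

plot = build_biplot(scores, loadings)
```

Calling show(plot) from bokeh.plotting should then render the figure. Axes, grids and the default tools are created by figure itself, which is where most of the saved code comes from.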