I have a pandas DataFrame containing a distance matrix, and I use PCA for dimensionality reduction. The DataFrame has a label for each point, and a size.
How can I make each scattered point a circle whose size depends on the size in the DataFrame?
````
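from sklearn.decomposition import PCA
import plotly.graph_objects as go

# dist: the square distance matrix, stored as a pandas DataFrame
pca = PCA(n_components=2)
pca.fit(dist)
mds5 = pca.components_

fig = go.Figure()
fig.add_scatter(x=mds5[0],
                y=mds5[1],
                mode='markers+text',
                marker=dict(size=8,
                            color='blue'),
                text=dist.columns.values,
                textposition='top right')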
````
I need the scatter plot to look something like this example. However, when I add the size for each point as in related answers, I can't get the circles to overlap, and when they do, they no longer overlap once I zoom in.
It sounds strange, but I need to create logic such that if two circles overlap, the one with the smaller radius disappears, so:
I'm still not sure which PCA parameter you want to be reflected in the circle size, but either you want to

- use a scatter plot (i.e. `ax.scatter()`) whose marker size (`s=`) reflects your chosen PCA parameter; this size will not (and should not) rescale when you rescale the figure, and it is not given in (x,y)-units; or
- use multiple `plt.Circle((x,y), radius=radius, **kwargs)` patches, whose radii are given in (x,y)-units; the point overlap is then consistent on rescale, but this will likely cause deformed points.

The following animation will emphasise the issue at hand:
I suppose you want the `plt.Circle`-based solution, as it keeps the distance static. You then need to work out beforehand whether two points overlap and delete the smaller one yourself. You should be able to do this automatically by comparing the point sizes (i.e. `radius`, your PCA parameter) with the Euclidean distance between your data points (i.e. `np.sqrt(dx**2 + dy**2)`).
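The overlap check described above could be sketched roughly as follows. This is only one possible rule, and it is an assumption on my part: circles are visited from largest to smallest, and a circle is kept only if it does not overlap any circle already kept (two circles overlap when their centre distance is smaller than the sum of their radii). The function name `keep_largest_non_overlapping` and the sample data are made up for illustration.

```python
import math

def keep_largest_non_overlapping(xs, ys, radii):
    """Return the indices of the circles to keep.

    Circles are visited from largest to smallest radius; a circle is
    dropped if it overlaps a circle that has already been kept.
    """
    order = sorted(range(len(radii)), key=lambda i: radii[i], reverse=True)
    kept = []
    for i in order:
        overlaps = any(
            # overlap: centre distance smaller than the sum of the radii
            math.hypot(xs[i] - xs[j], ys[i] - ys[j]) < radii[i] + radii[j]
            for j in kept
        )
        if not overlaps:
            kept.append(i)
    return sorted(kept)

# Hypothetical data: circle 1 overlaps the larger circle 0, circle 2 is far away
xs = [0.0, 1.0, 5.0]
ys = [0.0, 0.0, 0.0]
radii = [1.0, 0.5, 0.3]
print(keep_largest_non_overlapping(xs, ys, radii))  # [0, 2]
```

The returned indices can then be used to select the coordinates, radii and labels that actually get drawn. Note that the pairwise comparison is quadratic in the number of points, which should be fine for a typical distance matrix.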
To use Circles, you could e.g. define a shorthand function:
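````
import matplotlib.pyplot as plt

def my_circle_scatter(ax, x_array, y_array, radius=0.5, **kwargs):
    # Draw one circle patch per point; radius is in (x, y) data units
    for x, y in zip(x_array, y_array):
        circle = plt.Circle((x, y), radius=radius, **kwargs)
        ax.add_patch(circle)
    return True
````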
and then call it with optional parameters (i.e. the x- and y-coordinates, colors, and so on):
````
my_circle_scatter(ax, xs, ys, radius=0.2, alpha=.5, color='b')
````
where I've used `fig, ax = plt.subplots()` to create the figure and subplot individually.
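Since the question asks for a different size per point, here is a hedged variant that takes one radius per point instead of a single `radius`. The name `circle_scatter`, the sample coordinates and the labels are made up for illustration. It also sets an equal aspect ratio so the circles are not drawn as ellipses, and calls `ax.autoscale_view()`, because adding patches does not rescale the view on its own.

```python
import matplotlib.pyplot as plt

def circle_scatter(ax, xs, ys, radii, **kwargs):
    """Draw one circle per point; radii are given in (x, y) data units."""
    for x, y, r in zip(xs, ys, radii):
        ax.add_patch(plt.Circle((x, y), radius=r, **kwargs))
    ax.set_aspect("equal")  # keep circles round instead of elliptical
    ax.autoscale_view()     # fit the view limits to the added patches

# Hypothetical data standing in for the PCA coordinates and sizes
xs = [0.0, 1.0, 3.0]
ys = [0.0, 0.5, 1.0]
radii = [0.8, 0.4, 0.6]
labels = ["a", "b", "c"]

fig, ax = plt.subplots()
circle_scatter(ax, xs, ys, radii, alpha=0.5, color="b")
for x, y, label in zip(xs, ys, labels):
    ax.annotate(label, (x, y))  # put each label at its circle centre
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to see the result. Because the radii are in data units, the overlap between circles stays the same when you zoom.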