python-3.x pandas matplotlib scikit-learn pca

How to add annotated labels to a 3D matplotlib scatter plot?


I have run scikit-learn's Principal Component Analysis (PCA) on my data with 3 principal components (PC1, PC2, PC3). The data is in a pandas DataFrame (data_df_3dx in the code below).

Here is the code for plotting the principal components:

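from mpl_toolkits.mplot3d import axes3d  # registers the 3D projection; not needed on recent Matplotlib
import matplotlib.pyplot as plt
# IPython/Jupyter magic that enables an interactive backend; remove it in a plain script
%matplotlib
fig = plt.figure()
# fig.gca(projection='3d') no longer accepts keyword arguments in recent Matplotlib
ax = fig.add_subplot(projection='3d')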
ax.set_title('3D Scatter Plot')
ax.set_xlabel('PC1')
ax.set_ylabel('PC2')
ax.set_zlabel('PC3')

ax.view_init(elev=12, azim=40)  # elevation and azimuth angles
ax.dist = 10                    # camera distance
ax.scatter(
    data_df_3dx['PC1'], data_df_3dx['PC2'], data_df_3dx['PC3'],  # data
    # color='purple',  # marker colour
    # marker='o',      # marker shape
    s=60,              # marker size
)

plt.show()

My problem: how do I add labels (like 'GER, medium') to the points? I hope someone can help me :)


Solution

  • One way would be to plot each point individually inside a for loop. That way you know the coordinates of each point and can add text at that position.

    for i in range(len(data_df_3dx)):
        row = data_df_3dx.iloc[i]  # look the row up once instead of three times
        x, y, z = row['PC1'], row['PC2'], row['PC3']
        ax.scatter(x, y, z)
        # Now that you have the coordinates, you can add whatever text you need.
        # I'm assuming you want the index, but you could also use a column value instead.
        ax.text(x, y, z, str(data_df_3dx.index[i]), size=5)
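
A possible variation on the loop above, as a minimal sketch rather than a definitive version: draw all points with a single scatter call so they share one style, then loop only to place the text. The DataFrame `demo_df`, its random values, and the labels in its index are hypothetical stand-ins for `data_df_3dx`; it assumes the labels (like 'GER, medium') live in the index.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for data_df_3dx: three PC columns, labels in the index.
rng = np.random.default_rng(0)
labels = ["GER, medium", "GER, high", "FRA, medium", "FRA, high"]
demo_df = pd.DataFrame(
    rng.normal(size=(len(labels), 3)),
    columns=["PC1", "PC2", "PC3"],
    index=labels,
)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_zlabel("PC3")

# One scatter call draws every point with the same colour and size.
ax.scatter(demo_df["PC1"], demo_df["PC2"], demo_df["PC3"], s=60)

# Place one text label at the coordinates of each point.
for label, x, y, z in zip(demo_df.index, demo_df["PC1"], demo_df["PC2"], demo_df["PC3"]):
    ax.text(x, y, z, str(label), size=8)
```

Call `plt.show()` afterwards to display the figure. If the labels are stored in a column rather than the index, iterating over that column in the `zip` call instead of `demo_df.index` should work the same way.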