python-3.x, jupyter-notebook, 3d, plotly

Plotly not showing correct colors when using Jupyter Notebook


For my dataset, I wanted to change the colors of the "Labels" data that is shown in the 3d scatter plot, but I have been unsuccessful.

I keep getting these default colors:

[Image: 3D scatter plot rendered with the default colors]

This is the code that I am using:

import numpy as np
import os
import pandas as pd
from matplotlib import pyplot
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns

# Data
data = pd.read_csv('SamplePlotlyData.csv')
labels = data['Label'].values
data = data.drop(columns=['Label']).values

fig = px.scatter_3d(data,
            x= data[:,0], y= data[:,1], z = data[:,2], 
            labels={'x':'PCA-1', 'y':'PCA-2','z':'PCA-3'},
            color=labels,
            color_discrete_sequence=["blue", "goldenrod", "magenta"],
            title='3d Plot of Top 3 PCA components')
fig.show()

Can you assist me in correctly changing the color palette of the 3d scatter plot?

I am using Jupyter Notebook 6.0.3 with seaborn version 0.11.2.

Here is my dataset:

36  37  38  39  Label
0.22717583  -0.1028256  -0.041157354    0.047657568 0
-1.242205   2.611936    1.5563084   -0.64137465 0
0.39261582  0.40208274  0.2835228   0.26541463  0
-4.296567   -1.3980201  -0.67690927 -0.941123   0
-1.5278594  1.103121    -1.4688232  -1.139884   0
2.35497 -1.3783572  0.4808609   -1.4851115  1
-0.055658106    -0.19007513 -0.40134305 -0.34722504 1
0.051404    -0.6016376  0.26404122  -0.42829922 1
-0.47935575 -0.049984064    0.67335206  0.123305336 1
0.57357675  0.9523434   -0.05714764 -0.6305638  1
0.1044371   1.2541072   0.1957058   0.083972946 2
0.47575372  0.18598396  0.069036044 0.63252586  2
-0.7613742  0.81920165  0.43508404  0.280004    2
-0.16776349 0.9296196   -1.1710609  0.86310846  2
-0.20844702 0.3536006   0.01729327  -0.28363776 2

Solution

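  • You are seeing the default colors because your Label column is an integer. Plotly Express (not Seaborn, which is not used for this plot) treats a numeric color column as continuous and applies a continuous color scale, so color_discrete_sequence has no effect. Convert the column to strings with .astype(str) so the labels are treated as discrete categories. You also do not need to move Label into a separate labels variable and drop the column: pass the DataFrame to px.scatter_3d and refer to the columns by name. The updated code is below, followed by the output plot.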

    import numpy as np
    import os
    import pandas as pd
    from matplotlib import pyplot
    import matplotlib.pyplot as plt
    import plotly.express as px
    import seaborn as sns
    
    # Data
    data = pd.read_csv('SamplePlotlyData.csv')
    
    data['Label'] = data['Label'].astype(str)  # Strings are treated as discrete categories
    # No need to drop the Label column or convert the DataFrame to a NumPy array
    
    fig = px.scatter_3d(data,
                x='36', y='37', z='38',  # Column names; read_csv returns header names as strings
                labels={'36':'PCA-1', '37':'PCA-2', '38':'PCA-3'},  # Keys must match the column names
                color='Label',  # Name of the label column
                color_discrete_sequence=["blue", "goldenrod", "magenta"],
                title='3d Plot of Top 3 PCA components')
    fig.show()
    

    Output

    [Image: the resulting 3D scatter plot]