A simple call to plotly's figure_factory routine to create a scatter matrix:
import pandas as pd
import numpy as np
from plotly import figure_factory
df = pd.DataFrame(np.random.randn(40,3))
fig = figure_factory.create_scatterplotmatrix(df, diag='histogram')
fig.show()
yields a 3x3 scatter plot matrix with histograms on the diagonal.
My questions are:
1. How can I specify a single color for all the plots?
2. How can I set the axes ranges for each of the three variables on the scatter plot?
3. Is there a way to create a density (normalized) version of the histogram?
4. Is there a way to include the correlation coefficient (say, computed from df.corr()) in the upper right corner of the non-diagonal plots?

For the first question, to use a single color, update the marker color attribute of every trace in the generated figure data.
For the second, to set the axis ranges of the scatter plots, update the generated layout in the same way. The code only modifies the x-axes, so use the same technique for the y-axes if necessary.
For the third, to get a normalized version of the histogram, set histnorm on the histogram traces; the value used is the one from the example in the reference. If this does not give the normalization you want, I believe it is possible to replace the trace data with values obtained from np.histogram() or similar.
The fourth is an annotation: I added the values obtained with df.corr() to the figure, specifying the target subplot by its axis names.
import pandas as pd
import numpy as np
import plotly.express as px  # needed for px.histogram below
from plotly import figure_factory
np.random.seed(20220529)
df = pd.DataFrame(np.random.randn(40,3))
density = px.histogram(df, x=[0,1,2], histnorm='probability density')
df_corr = df.corr()
fig = figure_factory.create_scatterplotmatrix(df, diag='histogram', height=600, width=600)
# 1.How can I specify a single color for all the plots?
for i in range(9):
    fig.data[i]['marker']['color'] = 'blue'
# 2.How can I set the axes ranges for each of the three variables on the scatter plot?
# x-axes of the off-diagonal (scatter) subplots; 1, 5 and 9 are the histograms
for axes in ['xaxis2','xaxis3','xaxis4','xaxis6','xaxis7','xaxis8']:
    fig.layout[axes]['range']=(-4,4)
# 3.Is there a way to create a density (normalized) version of the histogram?
fig['data'][0]['histnorm'] = 'probability density'
fig['data'][4]['histnorm'] = 'probability density'
fig['data'][8]['histnorm'] = 'probability density'
# 4.Is there a way to include the correlation coefficient (say, computed from df.corr())
# in the upper right corner of the non-diagonal plots?
for r,x,y in zip(df_corr.values.flatten(),
                 ['x1','x2','x3','x4','x5','x6','x7','x8','x9'],
                 ['y1','y2','y3','y4','y5','y6','y7','y8','y9']):
    # skip the diagonal subplots, where the correlation is 1
    if r != 1.0:
        fig.add_annotation(x=3.3, y=2, xref=x, yref=y, showarrow=False, text='R:'+str(round(r,2)))
fig.show()
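
If histnorm does not give the normalization you need, the np.histogram() route mentioned above could look roughly like the sketch below. This is only an illustration, not part of the code above: the bin count of 10 and the variable names heights, edges, widths, centers and area are my own assumptions.

```python
import numpy as np
import pandas as pd

np.random.seed(20220529)
df = pd.DataFrame(np.random.randn(40, 3))

# Assumption: 10 bins, chosen only for illustration.
# density=True scales the bar heights so that sum(height * bin_width) is 1.
heights, edges = np.histogram(df[0], bins=10, density=True)

widths = np.diff(edges)              # width of each bin
centers = edges[:-1] + widths / 2    # x position of each bar
area = float(np.sum(heights * widths))

print(centers)
print(heights)
print(area)
```

The centers and heights arrays could then be drawn as a bar trace (for example with plotly.graph_objects.Bar) in place of the diagonal histogram, though I have not tried swapping them into the scatter matrix figure.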