How does Plotly draw a scatter plot? What is the logic behind it? Suppose I have a NumPy array like this:
import numpy as np
arr = np.array([[1,2,3,1,2,3,4],[1,1,1,1,1,-6,4],[0,0,0,0,0,0,4],[-3,-2,-1,1,2,3,4],[1,1,15,1,2,-999,4]])
and I am using the following code to plot the array:
import plotly.express as px
fig = px.scatter(arr, len(arr), width=1000, height=500)
fig.update_layout(xaxis_title='array value', yaxis_title='index')
fig.show()
The plot shows the following values for each point (from top to bottom):
(-999, index = 4) (3, index = 3) (0, index = 2) (-6, index = 1) (3, index = 0)
I understand the index values (the y-axis coordinates): they are the array's row indices, going from 0 to 4 because its length is 5. But how is the x coordinate defined? Can someone explain this?
Plotly Express converts the NumPy array to a pandas DataFrame. So your first argument becomes a DataFrame, and your second positional argument is the 'x' kwarg, i.e. the name of the x column. len(arr) is 5 (the number of rows), so you picked the column labelled 5, which holds the sixth value of each row: 3, -6, 0, 3, -999. If you leave out the 'y' kwarg, it defaults to the index of the DataFrame.
So your code is equivalent to this:
import plotly.express as px
import pandas as pd
import numpy as np
arr = np.array([[1,2,3,1,2,3,4],[1,1,1,1,1,-6,4],[0,0,0,0,0,0,4],[-3,-2,-1,1,2,3,4],[1,1,15,1,2,-999,4]])
df = pd.DataFrame(arr)
df  # in a notebook, this displays the DataFrame (columns 0..6, rows 0..4)
fig = px.scatter(df, x=5, y=None, width=1000, height=500)
fig.update_layout(xaxis_title='Column: 5')
fig.show()
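If you want to check which values land on each axis without drawing anything, a minimal sketch like the following may help. It only uses NumPy and pandas and mimics the column lookup described above; the variable names x_col, x_values and y_values are made up for illustration and are not part of Plotly.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3, 1, 2, 3, 4],
                [1, 1, 1, 1, 1, -6, 4],
                [0, 0, 0, 0, 0, 0, 4],
                [-3, -2, -1, 1, 2, 3, 4],
                [1, 1, 15, 1, 2, -999, 4]])

# A DataFrame built from a 2-D array gets integer labels:
# columns 0..6 and rows (the index) 0..4.
df = pd.DataFrame(arr)

# len(arr) is the number of rows (5); passed as x, it is read as a column label.
x_col = len(arr)

x_values = df[x_col].tolist()   # values of the column labelled 5
y_values = df.index.tolist()    # the default y when none is given

for x, y in zip(x_values, y_values):
    print(f"x={x}, index={y}")
```

The printed pairs should match the points you see in the plot, read from bottom to top.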
I recommend using pandas with Plotly Express and referring to columns by name. For example, you could do something like this instead:
# name your columns
df.columns = ['var_a','var_b','speed', 'time','whatever','example','lastcolumn']
# specify the columns to plot on the x and y and any other kwargs you want.
fig = px.scatter(df, x='var_a', y='example', size='whatever', width=1000, height=500)
fig.show()