
Seaborn Scatterplot is using multiple different markers instead of dots


I am running seaborn 0.13.2 and trying to make a scatterplot that uses only dots. When I call sns.scatterplot(end_of_year_2018), it shows all kinds of different symbols, from triangles to +, * and so on (see picture below).

So I tried to make a scatterplot like this:

end_of_year_2018 = yf.download(
    ["ABBN.SW", "ADEN.SW", "CFR.SW", "SGSN.SW", "HOLN.SW", "NESN.SW", "NOVN.SW",
     "ROG.SW", "SREN.SW", "SCMN.SW", "UHR.SW", "UBSG.SW", "ZURN.SW"],
    start='2018-12-28', end='2018-12-29',
)['Adj Close']

sns.scatterplot(end_of_year_2018, marker='o')

but still all of these symbols are present.

Everything else works just fine.

The plot

I already tried updating seaborn (pip install --upgrade seaborn) and resetting its settings (sns.set(rc=None)).

The rest of the plot works just fine, what should I do?


Solution

    • See this answer for the correct way to pass positional arguments to the seaborn plotting API.
    • DataFrames passed to seaborn should be tidy (long-form), not wide-form. A wide DataFrame can be converted with pandas.DataFrame.melt, as shown in the answers to this question.
      • Seaborn’s preferred approach is to work with data in a 'long-form' or 'tidy' format. This means that each variable is a column and each observation is a row. This format is more flexible because it makes it easier to subset the data and create complex visualizations.
    • General use cases:
      • Line plot: continuous data, such as a value against dates
      • Bar chart: categorical data, such as a value against a category
      • Scatter plot: bivariate plots showing the relationship between two variables measured on a single sample of subjects
    • Tested in python v3.12.0, pandas v2.2.1, matplotlib v3.8.1, seaborn v0.13.2.
    import yfinance as yf
    import seaborn as sns
    import matplotlib.pyplot as plt
    
    # download the data
    end_of_year_2018 = yf.download(['ABBN.SW', 'ADEN.SW', 'CFR.SW', 'SGSN.SW', 'HOLN.SW', 'NESN.SW', 'NOVN.SW', 'ROG.SW', 'SREN.SW', 'SCMN.SW', 'UHR.SW', 'UBSG.SW', 'ZURN.SW'], start='2018-12-28', end='2023-12-29')['Adj Close']
    
    # convert the data to a long form
    end_of_year_2018_long = end_of_year_2018.melt(ignore_index=False).reset_index()
    
    # plot
    plt.figure(figsize=(10, 8))
    ax = sns.scatterplot(data=end_of_year_2018_long, x='Date', y='value', hue='Ticker', marker='.')
    sns.move_legend(ax, bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
    


    • Optionally, use sns.relplot, which removes the need to separately set the figure size and legend position.
    g = sns.relplot(data=end_of_year_2018_long, x='Date', y='value', hue='Ticker', marker='.', height=7)
    



    Notes

    • If many dates are being compared, a line chart, not a scatter plot, should be used.
    g = sns.relplot(kind='line', data=end_of_year_2018_long, x='Date', y='value', hue='Ticker', aspect=3)
    


    • To compare the values for a single date, as suggested by the use of start='2018-12-28', end='2018-12-29', a bar chart, not a scatter plot, should be used.
    end_of_year_2018 = yf.download(['ABBN.SW', 'ADEN.SW', 'CFR.SW', 'SGSN.SW', 'HOLN.SW', 'NESN.SW', 'NOVN.SW', 'ROG.SW', 'SREN.SW', 'SCMN.SW', 'UHR.SW', 'UBSG.SW', 'ZURN.SW'], start='2018-12-28', end='2018-12-29')['Adj Close']
    
    end_of_year_2018_long = end_of_year_2018.melt(ignore_index=False).reset_index()
    
    g = sns.catplot(kind='bar', data=end_of_year_2018_long, x='Ticker', y='value', aspect=3)
    _ = g.figure.suptitle('Adjusted Close for 2018-12-28')  # .figure replaces the deprecated .fig attribute
    



    end_of_year_2018.head()

    Ticker        ABBN.SW    ADEN.SW     CFR.SW    HOLN.SW    NESN.SW    NOVN.SW      ROG.SW     SCMN.SW    SGSN.SW    SREN.SW    UBSG.SW      UHR.SW     ZURN.SW
    Date                                                                                                                                                         
    2018-12-28  15.066603  34.481403  58.147148  32.570705  70.373329  55.331772  208.348633  381.062653  74.905602  64.147781  10.002420  254.402374  222.631409
    2019-01-03  14.885272  32.702148  56.541176  31.991671  71.643211  55.305443  213.356201  389.256622  75.007278  64.133545  10.010596  245.528900  222.555435
    2019-01-04  15.260020  34.211132  58.664013  33.704643  72.577995  55.845329  214.854187  389.500031  76.939247  65.130074  10.317168  254.047409  225.745667
    2019-01-07  15.151222  34.714130  59.033199  33.350792  71.537399  54.712891  212.200623  389.743408  77.007027  64.930763  10.337606  254.047409  223.846725
    2019-01-08  15.360760  35.795193  60.048466  33.817234  71.872498  55.753159  216.138184  386.336060  77.515434  65.001945  10.398920  260.081360  226.505249
    

    end_of_year_2018_long.head()

            Date   Ticker      value
    0 2018-12-28  ABBN.SW  15.066603
    1 2019-01-03  ABBN.SW  14.885272
    2 2019-01-04  ABBN.SW  15.260020
    3 2019-01-07  ABBN.SW  15.151222
    4 2019-01-08  ABBN.SW  15.360760
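
The wide-to-long conversion shown above can be tried without downloading anything. The following is a minimal sketch on a small, made-up DataFrame shaped like the yfinance output; the prices and the names `wide` and `long_df` are illustrative assumptions, not part of the original data.

```python
import pandas as pd

# Hypothetical prices for two tickers, shaped like the yfinance output above:
# a 'Date' index and one column per ticker, with the column axis named 'Ticker'.
wide = pd.DataFrame(
    {"ABBN.SW": [15.07, 14.89], "ADEN.SW": [34.48, 32.70]},
    index=pd.DatetimeIndex(["2018-12-28", "2019-01-03"], name="Date"),
)
wide.columns.name = "Ticker"

# ignore_index=False keeps the Date index during the melt;
# reset_index() then turns it into an ordinary 'Date' column.
# The variable column takes its name from wide.columns.name ('Ticker'),
# and the value column uses the default name 'value'.
long_df = wide.melt(ignore_index=False).reset_index()
print(long_df)
```

Each row of `long_df` is now one observation (one date, one ticker, one value), which is the shape that the x=, y=, and hue= parameters expect.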
    

    end_of_year_2018

    • With start='2018-12-28', end='2018-12-29', which is only one day of data.
    Ticker        ABBN.SW  ADEN.SW     CFR.SW    HOLN.SW    NESN.SW    NOVN.SW      ROG.SW     SCMN.SW    SGSN.SW    SREN.SW    UBSG.SW      UHR.SW     ZURN.SW
    Date                                                                                                                                                       
    2018-12-28  15.066599  34.4814  58.147148  32.570705  70.373329  55.331787  200.049286  381.062653  74.905609  64.147789  10.002419  254.402344  222.631409
    

    • As noted by @JohanC, markers=['.']*len(end_of_year_2018.columns) can work.
    • markers=False results in a bug.
    ax = sns.scatterplot(data=end_of_year_2018, markers=['.']*len(end_of_year_2018.columns))
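
For background on why the symbols differ: when a wide-form DataFrame is passed, seaborn treats each column as a level of both the hue and the style semantic, so each column gets its own color and its own marker. The sketch below reproduces the markers workaround on a small, made-up wide DataFrame; the prices and the names `wide` and `same_markers` are assumptions for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import pandas as pd
import seaborn as sns

# Hypothetical wide-form data: one column per ticker, one row per date
wide = pd.DataFrame(
    {"ABBN.SW": [15.07, 14.89, 15.26], "ADEN.SW": [34.48, 32.70, 34.21]},
    index=pd.DatetimeIndex(["2018-12-28", "2019-01-03", "2019-01-04"], name="Date"),
)

# One '.' per column, so every style level is drawn with the same marker
same_markers = ["."] * len(wide.columns)
ax = sns.scatterplot(data=wide, markers=same_markers)
```

The columns are still distinguished by color. For anything beyond a quick look, the long-form approach above gives more direct control over x, y, and hue.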