Plotly: How to make a jagged line plot look better?


I made a chart showing the quantity of items purchased over a period of time. The graph is hard to read, and it is difficult to see the overall trend. My code is below:

import numpy as np
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

trace1 = go.Scatter(x=df_temp['Date'],
                    y=df_temp['Quantity'],
                    line = dict(color = 'blue'),
                    opacity = 0.3)

layout = dict(title='Purchases of NC coin',)

fig = dict(data=[trace1], layout=layout)
iplot(fig)

And some of my data:

Id  Date                Quantity
8   2022-01-16 19:14:56 50814.040553
15  2022-01-12 09:18:01 2563.443420
17  2022-01-11 13:52:38 33055.752836
18  2022-01-11 11:49:54 6483.182959
19  2022-01-11 11:07:48 13005.174783
21  2022-01-11 10:50:20 19605.381370
23  2022-01-11 10:15:30 6561.223602
24  2022-01-11 10:14:44 19762.821100
28  2022-01-07 15:56:50 3307.607665
29  2022-01-07 15:54:30 66868.030051
30  2022-01-07 12:27:07 42683.069577
31  2022-01-07 12:20:51 3423.618394
34  2022-01-05 12:11:57 69607.963793
35  2022-01-05 10:41:48 20370.090947
37  2022-01-05 10:21:22 72415.914082
38  2022-01-05 10:05:04 20687.003754
39  2022-01-05 09:36:53 37410.532342
40  2022-01-05 08:35:06 43815.009603
41  2022-01-04 19:27:27 30581.795021
44  2022-01-03 16:34:41 14290.644375

My plot currently looks like this: [image: current line plot]

Do you have any ideas for making it more readable?


Solution

    In my opinion, you've got three options:

    1. If no aggregation is desired, use a barplot with px.bar

    2. Aggregate by day and use a line plot

    3. Aggregate by day and use a bar plot

    Since you're asking about aesthetics rather than specific Plotly code, I'm going to use Plotly Express instead of iplot. I'd suggest you do the same! If for some reason you can't, just let me know.

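    Complete code (the aggregated bar plot is active; to switch, comment it out and uncomment one of the other plot blocks):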

    import pandas as pd
    import plotly.express as px
    import plotly.graph_objects as go
    
    df_temp = pd.DataFrame({'Id': {0: 8,
                              1: 15,
                              2: 17,
                              3: 18,
                              4: 19,
                              5: 21,
                              6: 23,
                              7: 24,
                              8: 28,
                              9: 29,
                              10: 30,
                              11: 31,
                              12: 34,
                              13: 35,
                              14: 37,
                              15: 38,
                              16: 39,
                              17: 40,
                              18: 41,
                              19: 44},
                             'Date': {0: '2022-01-16',
                              1: '2022-01-12',
                              2: '2022-01-11',
                              3: '2022-01-11',
                              4: '2022-01-11',
                              5: '2022-01-11',
                              6: '2022-01-11',
                              7: '2022-01-11',
                              8: '2022-01-07',
                              9: '2022-01-07',
                              10: '2022-01-07',
                              11: '2022-01-07',
                              12: '2022-01-05',
                              13: '2022-01-05',
                              14: '2022-01-05',
                              15: '2022-01-05',
                              16: '2022-01-05',
                              17: '2022-01-05',
                              18: '2022-01-04',
                              19: '2022-01-03'},
                             'Time': {0: '19:14:56',
                              1: '09:18:01',
                              2: '13:52:38',
                              3: '11:49:54',
                              4: '11:07:48',
                              5: '10:50:20',
                              6: '10:15:30',
                              7: '10:14:44',
                              8: '15:56:50',
                              9: '15:54:30',
                              10: '12:27:07',
                              11: '12:20:51',
                              12: '12:11:57',
                              13: '10:41:48',
                              14: '10:21:22',
                              15: '10:05:04',
                              16: '09:36:53',
                              17: '08:35:06',
                              18: '19:27:27',
                              19: '16:34:41'},
                             'Quantity': {0: 50814.040553,
                              1: 2563.44342,
                              2: 33055.752836,
                              3: 6483.182959,
                              4: 13005.174783,
                              5: 19605.38137,
                              6: 6561.223602,
                              7: 19762.8211,
                              8: 3307.607665,
                              9: 66868.030051,
                              10: 42683.069577,
                              11: 3423.618394,
                              12: 69607.963793,
                              13: 20370.090947,
                              14: 72415.914082,
                              15: 20687.003754,
                              16: 37410.532342,
                              17: 43815.009603,
                              18: 30581.795021,
                              19: 14290.644375}})
    trace1 = go.Scatter(x=df_temp['Date'],
                        y=df_temp['Quantity'],
                        line = dict(color = 'blue'),
                        opacity = 0.3)
    
    layout = dict(title='Purchases of NC coin',)
    
    # build pandas datetime series
    df_temp['DateTime'] = pd.to_datetime(df_temp.Date+' '+df_temp.Time)
    
    # # unaggregated barplot
    # fig = px.bar(df_temp, x = 'DateTime', y = 'Quantity')
    # fig.update_traces(marker_line_color = 'blue')
    # fig.update_layout(title='Purchases of NC coin')
    
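    # aggregate by day (daily mean of Quantity); selecting the column first keeps
    # pandas >= 2.0 from raising on the non-numeric Date and Time columns
    df_temp = df_temp.groupby(df_temp['DateTime'].dt.date)['Quantity'].mean().reset_index()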
    
    # # aggregated lineplot
    # fig = px.line(df_temp, x = 'DateTime', y = 'Quantity')
    # fig.update_traces(line_color = 'blue')
    # fig.update_layout(title='Purchases of NC coin')
    
    # aggregated barplot
    fig = px.bar(df_temp, x = 'DateTime', y = 'Quantity')
    fig.update_traces(marker_line_color = 'blue')
    fig.update_layout(title='Purchases of NC coin')
    
    fig.show()
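The code above aggregates with the daily mean. If the daily total is closer to what you want to show, the aggregation step can be swapped out. Here is a minimal sketch of that step on its own, assuming a `DateTime` column that is already a pandas datetime; `aggregate_by_day` and the `Day` column name are made up for this illustration.

```python
import pandas as pd

def aggregate_by_day(df, how="mean"):
    """Collapse per-purchase rows into one row per calendar day."""
    # Group on the calendar date only, dropping the time of day
    day = df["DateTime"].dt.date.rename("Day")
    return df.groupby(day)["Quantity"].agg(how).reset_index()

# Three rows copied from the question's data
sample = pd.DataFrame({
    "DateTime": pd.to_datetime(["2022-01-04 19:27:27",
                                "2022-01-05 08:35:06",
                                "2022-01-05 09:36:53"]),
    "Quantity": [30581.795021, 43815.009603, 37410.532342],
})

daily_mean = aggregate_by_day(sample, "mean")  # average purchase size per day
daily_sum = aggregate_by_day(sample, "sum")    # total quantity per day
```

Either result can then be passed to `px.bar` or `px.line` with `x='Day'` and `y='Quantity'`.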