
How can I combine all the data inside a month into a single column representing one month in Plotly?


I am creating a graph using Plotly to show how much a company spent each month according to the date of spending and the value. The code I'm using is below. (I entered example data)

import plotly.graph_objects as go

dates = ["2020-01-02","2020-01-14","2020-07-29","2020-12-12","2020-11-08","2020-12-19"]
values = [5, 6, 15, 22, 24, 8]

data = [
            go.Bar(
                x=dates,
                y=values,
                name='Revenue',
                marker={'color': '#3FC1C9'}
            )
        ]

layout = go.Layout(
    title='Values',
    xaxis={'title': 'Month'},
    yaxis={'title': 'Total'}
)
fig = go.Figure(data=data, layout=layout)
fig.show()

It produces this graph: [image]

It plots each value at its exact date, so values from the same month end up spread across separate bars. What I want is to find the total value for each month and treat the whole month as one value on the x-axis. My goal is something like this: [image]

Can I do that in any way?


Solution

  • To draw a month-by-month graph, you first need to organize your data by month. Convert the data into a data frame and add a year-month column, then group by that column and sum the values. The graph below is drawn from the aggregated data. By default, the x-axis ticks were placed every two months, so I set 'dtick' to 'M1' to get one tick per month.

    import pandas as pd
    import plotly.graph_objects as go
    
    dates = ["2020-01-02","2020-01-14","2020-07-29","2020-12-12","2020-11-08","2020-12-19"]
    values = [5, 6, 15, 22, 24, 8]
    
    df = pd.DataFrame({'dates':dates,'values':values})
    df['dates'] = pd.to_datetime(df['dates'])
    df['year_month'] = df['dates'].apply(lambda x: str(x.year) + '-' + str(x.month))
    df['year_month'] = pd.to_datetime(df['year_month'], format='%Y-%m')
    df = df.groupby(df['year_month'])['values'].sum()
    
    # df now holds the monthly totals:
    # year_month           values
    # 2020-01-01 00:00:00  11
    # 2020-07-01 00:00:00  15
    # 2020-11-01 00:00:00  24
    # 2020-12-01 00:00:00  30
    
    data = [
        go.Bar(
            x=df.index,
            y=df.values,
            name='Revenue',
            marker={'color': '#3FC1C9'}
        )
    ]
    
    layout = go.Layout(
        title='Values',
        xaxis={'title': 'Month','tick0':df.index[0], 'dtick':'M1'},
        yaxis={'title': 'Total'}
    )
    
    fig = go.Figure(data=data, layout=layout)
    fig.show()
    

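The year-month step in the answer can also be done without building and re-parsing strings. The following is a minimal sketch, not the answer's own method: it assumes the same example data, and the names `month_start`, `monthly` and `monthly_all` are introduced here for illustration only. It truncates each date to the start of its month with `dt.to_period("M")`, and also shows `resample("MS")`, which keeps empty months as zero totals.

```python
import pandas as pd

dates = ["2020-01-02", "2020-01-14", "2020-07-29",
         "2020-12-12", "2020-11-08", "2020-12-19"]
values = [5, 6, 15, 22, 24, 8]

df = pd.DataFrame({"dates": pd.to_datetime(dates), "values": values})

# Truncate every date to the first day of its month, then sum per month.
# Only months that actually contain data appear in the result.
month_start = df["dates"].dt.to_period("M").dt.to_timestamp()
monthly = df.groupby(month_start)["values"].sum()

# Alternative: resample on a DatetimeIndex at month-start frequency.
# Months between the first and last date that have no rows get a total of 0.
monthly_all = df.set_index("dates").sort_index()["values"].resample("MS").sum()

print(monthly)
print(monthly_all)
```

Either series can be passed to `go.Bar` in the same way as in the answer (`x=monthly.index, y=monthly.values`). Which one fits depends on whether months without spending should show up as empty slots on the axis or be left out of the data entirely.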