Tags: python, pandas, matplotlib, scatter

How to create a scatter plot with timestamps


How to produce a scatter plot using timestamps?

Below is an example, but it raises ValueError: First argument must be a sequence

import pandas as pd
import matplotlib.pyplot as plt

d = ({
    'A' : ['08:00:00','08:10:00'],
    'B' : ['1','2'],           
    })

df = pd.DataFrame(data=d)

fig = plt.figure()

x = df['A']
y = df['B']

plt.scatter(x,y)

If I convert the timestamps to total seconds it works:

df['A'] = pd.to_timedelta(df['A'], errors="coerce").dt.total_seconds()

But I'd like 24-hour times on the x-axis rather than total seconds.


Solution

  • Use x-ticks:

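    import pandas as pd
    import matplotlib.pyplot as plt

    d = {
        'A': ['08:00:00', '08:10:00'],
        'B': [1, 2],
    }

    df = pd.DataFrame(data=d)

    fig = plt.figure()

    x = df['A']
    y = df['B']

    # Convert the time strings to seconds so matplotlib gets numeric x values
    x_numbers = pd.to_timedelta(x, errors="coerce").dt.total_seconds().tolist()
    plt.scatter(x_numbers, y)
    # Put one tick at each data point and label it with the original time string
    plt.xticks(x_numbers, x)
    plt.show()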
    

    This places one tick at each data point on the x-axis and labels it with the original timestamp.

    Note that I changed the 'B' column in your DataFrame from strings to numbers. If the values have to start out as strings, for example because they are read from a file, convert them to numbers before plotting, e.g. via float() or int() on each value, or with df['B'].astype(float) for the whole column.
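
As a minimal sketch of that conversion step, assuming the data arrives exactly as in the question (both columns as strings), the snippet below uses pd.to_numeric for 'B' and the same pd.to_timedelta call as above for 'A'. The use of pd.to_numeric with errors="coerce" is a choice made here for illustration, not part of the original answer; it turns unparseable entries into NaN instead of raising.

```python
import pandas as pd

# Same data as in the question: both columns hold strings.
df = pd.DataFrame({
    'A': ['08:00:00', '08:10:00'],
    'B': ['1', '2'],
})

# Convert the whole 'B' column to numbers in one call.
# errors="coerce" (an assumption for this sketch) maps bad values to NaN.
df['B'] = pd.to_numeric(df['B'], errors="coerce")

# Numeric x positions in seconds, computed the same way as in the answer.
x_numbers = pd.to_timedelta(df['A'], errors="coerce").dt.total_seconds().tolist()

print(df['B'].tolist())  # [1, 2]
print(x_numbers)         # [28800.0, 29400.0]
```

After this, x_numbers and df['B'] can be passed to plt.scatter, and plt.xticks(x_numbers, df['A']) labels the ticks with the original time strings.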