Tags: python, matplotlib, time-series, axis-labels

Time series plot with strings on the y-axis


I have a small time series of the form

                    created    current_page
0  2020-02-29T19:26:00.000Z        SOMMARIO
1  2020-02-29T19:25:00.000Z  DATI PERSONALI
2  2020-02-29T19:25:00.000Z    OFFERTA FULL
3  2020-02-29T19:24:00.000Z       DATI BENE
4  2020-02-29T19:23:00.000Z        HOMEPAGE

How can I plot it with the dates from the "created" column on the x-axis and the "current_page" values on the y-axis? So far I have tried encoding the y-axis values with LabelEncoder, but when I do that, floating-point numbers appear on the x-axis instead of the dates. My code is

import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
series["current_page"] = le.fit_transform(series["current_page"].values)
series.plot()
plt.gcf().autofmt_xdate()

and I get

[image: the resulting plot, with numbers instead of dates on the x-axis]


Solution

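  • Convert "created" to datetime and pass it explicitly as the x values. series.plot() plots against the DataFrame index (0 to 4 here), which is why numbers appear on the x-axis instead of dates. I have checked this on my local machine and it worked: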

    # imports
    import pandas as pd
    import matplotlib.pyplot as plt
    # %matplotlib inline  (Jupyter/IPython magic; uncomment only in a notebook)
    
    # data
    created = ["2020-02-29T19:26:00.000Z", 
               "2020-02-29T19:25:00.000Z", 
               "2020-02-29T19:25:00.000Z", 
               "2020-02-29T19:24:00.000Z",
               "2020-02-29T19:23:00.000Z"]
    
    current_page = ["SOMMARIO", "DATI PERSONALI", "OFFERTA FULL", "DATI BENE", "HOMEPAGE"]
    
    # create df from data
    df = pd.DataFrame({"created": created, "current_page": current_page})
    
    # convert to datetime
    df["created"] = pd.to_datetime(df["created"])
    
    # encode the page names as integers
    from sklearn.preprocessing import LabelEncoder
    le = LabelEncoder()
    df["current_page"] = le.fit_transform(df["current_page"].values)
    
    # separate x and y
    x = df["created"]
    y = df["current_page"]
    
    # plot data
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(x, y)
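
The plot above shows the encoded integers on the y-axis, not the page names the question asks for. One possible way to put the names back is to use the integer codes only for positioning and set the tick labels from the original strings. The sketch below is an assumption, not part of the original answer: it swaps LabelEncoder for pd.factorize to avoid the scikit-learn dependency, sorts by time so the line runs left to right, and saves to a hypothetical file name (pages_over_time.png).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# same data as in the question
df = pd.DataFrame({
    "created": ["2020-02-29T19:26:00.000Z",
                "2020-02-29T19:25:00.000Z",
                "2020-02-29T19:25:00.000Z",
                "2020-02-29T19:24:00.000Z",
                "2020-02-29T19:23:00.000Z"],
    "current_page": ["SOMMARIO", "DATI PERSONALI", "OFFERTA FULL",
                     "DATI BENE", "HOMEPAGE"],
})

# parse the ISO strings into real datetimes and order the rows by time
df["created"] = pd.to_datetime(df["created"])
df = df.sort_values("created")

# pd.factorize (used here instead of LabelEncoder) returns integer codes
# plus the unique labels, numbered in order of first appearance
codes, labels = pd.factorize(df["current_page"])

fig, ax = plt.subplots()
ax.plot(df["created"], codes, marker="o")

# place one tick per code and show the original page name at each tick
ax.set_yticks(list(range(len(labels))))
ax.set_yticklabels(list(labels))

fig.autofmt_xdate()  # rotate the date labels so they do not overlap
fig.savefig("pages_over_time.png")  # hypothetical output file name
```

If you keep LabelEncoder as in the answer, le.classes_ holds the original strings in code order and could be passed to set_yticklabels in the same way.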