Tags: python, matplotlib, graph, real-time, series

How to plot a real-time graph from a pandas Series while reading intermittently from a file?


I have a file that I loaded as a pandas Series. However, the file is too big to load on a single-node machine, so I'd like to read it a few lines at a time and update a graph with the new values as they are read.

A sample of the file data is shown below:

ip,date,time,zone,cik,accession,extention,code,size,idx,norefer,noagent,find,crawler,browser
101.81.76.dii,2016-03-31,00:00:00,0.0,1283497.0,0001209191-16-111028,-index.htm,200.0,14926.0,1.0,0.0,0.0,10.0,0.0,
104.40.128.jig,2016-03-31,00:00:00,0.0,1094392.0,0001407682-16-000270,.txt,200.0,5161.0,0.0,0.0,0.0,10.0,0.0,

A sample of my code is shown below:

import time as tm
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv('filepath')
data2 = data[['ip','time','date','size']]
data2['size/MB']= data2['size']/1024
data3 = data2[['ip','time','date','size/MB']]
gr = data3.groupby(['date','time']).sum()
GB = gr['size/GB']= gr['size/MB']/1024

columns = ["size/MB"]
df=GB[0:0]
"""plt.ion()"""
plt.figure()
i=10
while i<len(GB):
    df = df.append(GB[0:i])
    ax = df.plot(secondary_y=['prex'])
    plt.show()
    tm.sleep(0.5)
    i+=10

This, however, creates multiple windows. I tried to use plt.draw() in place of plt.show(), but it doesn't work. Thanks


Solution

  • (1) If you want to plot to the same axes instead of a new figure, you need to pass an existing matplotlib axes object, ax, to the dataframe's plot method:

    DataFrame.plot(..., ax=ax)
    

    (2) In the default (non-interactive) mode, plt.show() opens a window and takes over the event loop, so the rest of the script does not run until you close that window. Calling it inside a loop therefore needs to be avoided. To redraw inside a loop, turn on interactive mode with plt.ion() before the loop and call plt.draw().

    (3) Using time.sleep() is a bad idea when working with GUI elements like the matplotlib plotting window. It blocks the whole application, including the GUI event loop, so the window becomes unresponsive. Use plt.pause() instead, which keeps the event loop running while it waits.

    (4) You need to tell the dataframe's plot method which columns to plot. You also need to clear the axes before each redraw; otherwise the old lines stay on the plot.

    Now, here is a working script, which animates a dataframe.

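    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    
    x = np.arange(100)
    y = np.random.rand(100)
    df = pd.DataFrame({"x": x, "y": y})
    df2 = df[0:0]                # empty frame with the same columns
    
    plt.ion()                    # interactive mode: drawing does not block
    fig, ax = plt.subplots()     # one figure and one axes, reused in every iteration
    i = 0
    while i < len(df):
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
        df2 = pd.concat([df2, df[i:i+1]])
        ax.clear()               # remove the previous line before redrawing
        df2.plot(x="x", y="y", ax=ax)
        plt.draw()
        plt.pause(0.2)           # keeps the GUI event loop running while waiting
        i += 1
    plt.ioff()                   # back to blocking mode so the final window stays open
    plt.show()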
    

    This is not the most efficient method of animating matplotlib graphs, but it's close to your code.
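
A more efficient variant, which also covers reading the file a few lines at a time, might look like the sketch below. It is only an illustration under some assumptions: the function name animate_chunks and its chunksize and pause parameters are made up for this example, and it plots the raw size column divided by 1024 (the conversion used in the question) per row rather than the grouped sums. It relies on the chunksize argument of pd.read_csv, which makes it return an iterator of small dataframes, and on updating a single line with set_data instead of clearing and replotting the axes.

```python
import matplotlib.pyplot as plt
import pandas as pd


def animate_chunks(source, chunksize=10, pause=0.2):
    """Read `source` in chunks of `chunksize` rows and grow one line on one axes.

    `animate_chunks`, `chunksize` and `pause` are hypothetical names for this sketch.
    """
    plt.ion()                        # interactive mode, so drawing does not block
    fig, ax = plt.subplots()
    (line,) = ax.plot([], [])        # a single Line2D artist, reused for every update
    ax.set_xlabel("row")
    ax.set_ylabel("size / 1024")

    sizes = []                       # only the plotted values are kept in memory
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows,
    # so the whole file is never loaded at once.
    for chunk in pd.read_csv(source, usecols=["size"], chunksize=chunksize):
        sizes.extend(chunk["size"] / 1024)
        line.set_data(list(range(len(sizes))), sizes)
        ax.relim()                   # recompute the data limits from the line
        ax.autoscale_view()          # rescale the axes to the new limits
        plt.pause(pause)             # redraw and keep the window responsive

    plt.ioff()
    return fig, sizes
```

Calling animate_chunks('filepath', chunksize=10) followed by plt.show() should keep the final window open. Because only one line object is updated, each step costs far less than clearing the axes and replotting the whole dataframe.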