
Plotting a Baseball Field Animation


I have a dataframe that includes position data for all 9 players on a baseball field, including the hitter, throughout a given play, as well as the ball trajectory. I need some help figuring out why my animation is not working. The code below plots a single instance of the plot, but it doesn't show a continuous animation. In other words, it should show dots moving continuously. Here is my code:

import pandas as pd
from sportypy.surfaces import MiLBField
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
import numpy as np

# The dimensions are not exactly like this but this is an example if you need something to go off of

num_rows = 50

data = {
    'game_str': ['game_01'] * num_rows,
    'play_id': [10] * num_rows,
    'timestamp': np.random.randint(180000, 181000, size=num_rows),
    'player_position': np.random.randint(1, 11, size=num_rows),
    'field_x': np.random.uniform(-150, 150, size=num_rows),
    'field_y': np.random.uniform(-150, 150, size=num_rows),
    'ball_position_x': np.random.uniform(0.0, 2.0, size=num_rows),
    'ball_position_y': np.random.uniform(0.0, 300.0, size=num_rows),
    'ball_position_z': np.random.uniform(0.0, 10.0, size=num_rows)
}


df = pd.DataFrame(data).sort_values(by='timestamp')


field = MiLBField()

def update(frame):
    frame_data = df[df['timestamp'] <= frame]
    players = frame_data[['field_x', 'field_y']]
    balls = frame_data[['ball_position_x', 'ball_position_y']]
    
    plt.clf() 
    field.draw(display_range='full') 
    p = field.scatter(players['field_x'], players['field_y'])
    b = field.scatter(balls['ball_position_x'], balls['ball_position_y'])
    
    return p, b


fig = plt.figure()

ani = FuncAnimation(fig, update, frames=np.linspace(df['timestamp'].min(), df['timestamp'].max(), num=100), blit=True)

plt.show()

I would like it to output a baseball field with the scatter points moving as time increases.


Solution

  • There are a few problems in the code:

    Calling field.draw() inside update() tries to draw the field again on every frame, which slows the whole program down. In matplotlib you only need to draw it once, outside update().

    The code ends up with two plots: one with the field, and a second one with the data. Even the example on the sportypy homepage (in the section Adding Analyses and Plotting Data) uses

    fig, ax = plt.subplots(1, 1)
    phf.draw(ax=ax)
    

    which creates fig and ax before draw(), so draw() can use this ax. But for me this gives a white background with a green triangle instead of a green background with a green triangle. The solution was plt.gcf() (get current figure), which gets the fig created automatically by field.draw():

    field.draw(display_range='full') 
    fig = plt.gcf()
    

    The last problem is plt.clf(), which removes everything from the figure. It removes the field, so the scatter is drawn as a normal plot with a white background and axes. I removed it and now the data is shown on the field.
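
As a side note, a common matplotlib pattern for this kind of animation is to create the scatter artist once and only change its points inside update(), so nothing is cleared or redrawn per frame. The sketch below is not the code from this answer: it uses a plain matplotlib Axes as a stand-in for the field (sportypy is not used, and whether the artist returned by field.scatter() can be updated the same way is an assumption I have not checked), the axis limits are made up, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to see a window
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
import numpy as np
import pandas as pd

# Random example data, similar in shape to the question's dataframe
rng = np.random.default_rng(0)
num_rows = 50
df = pd.DataFrame({
    "timestamp": rng.integers(180000, 181000, size=num_rows),
    "field_x": rng.uniform(-150, 150, size=num_rows),
    "field_y": rng.uniform(-150, 150, size=num_rows),
}).sort_values(by="timestamp")

# Plain Axes used as a stand-in for the field (assumption, not sportypy)
fig, ax = plt.subplots()
ax.set_xlim(-160, 160)
ax.set_ylim(-160, 160)

# Create the artist once, with no points yet
players_scatter = ax.scatter([], [])

def update(frame):
    frame_data = df[df["timestamp"] <= frame]
    # Replace the points of the existing artist instead of adding a new scatter
    players_scatter.set_offsets(frame_data[["field_x", "field_y"]].to_numpy())
    return (players_scatter,)

frames = np.linspace(df["timestamp"].min(), df["timestamp"].max(), num=10)
ani = FuncAnimation(fig, update, frames=frames, blit=True)
```

With this pattern the Axes holds a single scatter collection no matter how many frames are rendered, which is also what blit=True expects: update() returns the artists that changed.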


    Full working code:

    I added ani.save('animation.gif', writer='imagemagick', fps=2)
    to write it as an animated GIF. It needs the external program imagemagick.

    I added np.random.seed(0) so everyone can test the code on the same values.

    I added colors to the players using player_position, and red for all balls.

    For testing I used num=10 instead of num=100 in FuncAnimation() so it runs faster.

    import pandas as pd
    from sportypy.surfaces import MiLBField
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation
    import numpy as np
    
    # The dimensions are not exactly like this but this is an example if you need something to go off of
    
    num_rows = 50
    
    np.random.seed(0)  # it will always generate the same data - so it is simpler to compare them 
    
    data = {
        'game_str': ['game_01'] * num_rows,
        'play_id': [10] * num_rows,
        'timestamp': np.random.randint(180000, 181000, size=num_rows),
        'player_position': np.random.randint(1, 11, size=num_rows),
        'field_x': np.random.uniform(-150, 150, size=num_rows),
        'field_y': np.random.uniform(-150, 150, size=num_rows),
        'ball_position_x': np.random.uniform(0.0, 2.0, size=num_rows),
        'ball_position_y': np.random.uniform(0.0, 300.0, size=num_rows),
        'ball_position_z': np.random.uniform(0.0, 10.0, size=num_rows)
    }
    
    df = pd.DataFrame(data).sort_values(by='timestamp')
    
    field = MiLBField()
    
    # it shows white background with green triangle 
    #fig, ax = plt.subplots(1, 1)  # get figure before drawing
    #field.draw(display_range='full', ax=ax)
    
    # it shows green background with green triangle 
    field.draw(display_range='full')  # without ax=
    fig = plt.gcf()  # get figure after drawing
    
    def update(frame):
        print(f'frame: {frame:.2f}')
    
        frame_data = df[ df['timestamp'] <= frame ]
        #frame_data = df[ df['timestamp'] <= frame ].drop_duplicates(subset=['player_position'], keep='last')
        print('len(frame_data):', len(frame_data))
        
        players = frame_data  # no need [['field_x', 'field_y']]
        balls   = frame_data  # no need [['ball_position_x', 'ball_position_y']]
        #players = frame_data.drop_duplicates(subset=['player_position'], keep='last')
        print('len(players), len(balls):', len(players), len(balls))
    
        players_colors = players['player_position']
        balls_colors   = ['red'] * len(balls)
        
        p = field.scatter(players['field_x'], players['field_y'], c=players_colors)
        b = field.scatter(balls['ball_position_x'], balls['ball_position_y'], c=balls_colors)
        
        return p, b
    
    ani = FuncAnimation(fig, update, frames=np.linspace(df['timestamp'].min(), df['timestamp'].max(), num=10), blit=True)
    
    ani.save('animation.gif', writer='imagemagick', fps=2)
    
    plt.show()
    



    But this has another "problem": df['timestamp'] <= frame shows all positions from the beginning. It may be better to use previous_frame < df['timestamp'] <= frame to show only the latest positions. But this removes all objects when there is no data between previous_frame and frame.

    previous_frame = None
    
    def update(frame):
        global previous_frame 
        
        print(f'frame: {frame:.2f}')
    
        if previous_frame is None or previous_frame > frame:
            # first frame, or animation restarted: take everything up to frame
            mask1 = (df['timestamp'] <= frame)
            frame_data = df[ mask1 ]
        else:
            # take only rows with previous_frame < timestamp <= frame
            mask1 = (df['timestamp'] <= frame)
            mask2 = (df['timestamp'] > previous_frame)
            frame_data = df[ mask1 & mask2 ]
        
        previous_frame = frame    
        
        players = frame_data  # no need [['field_x', 'field_y']]
        balls   = frame_data  # no need [['ball_position_x', 'ball_position_y']]
        
        p = field.scatter(players['field_x'], players['field_y'])
        b = field.scatter(balls['ball_position_x'], balls['ball_position_y'])
        
        return p, b
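
To make the empty-window case concrete, here is a small pandas-only sketch of the same filtering on made-up timestamps. The helper name select_window and the "keep the previous points when the window is empty" fallback are my own assumptions for illustration, not part of the code above.

```python
import pandas as pd

# Tiny made-up dataset with a gap between timestamps 110 and 150
df = pd.DataFrame({
    "timestamp": [100, 110, 150, 160],
    "field_x": [1.0, 2.0, 3.0, 4.0],
})

def select_window(df, previous_frame, frame):
    """Rows with previous_frame < timestamp <= frame.

    On the first call (previous_frame is None), or when the animation
    restarts (previous_frame > frame), all rows up to frame are returned.
    """
    mask = df["timestamp"] <= frame
    if previous_frame is not None and previous_frame <= frame:
        mask &= df["timestamp"] > previous_frame
    return df[mask]

last_shown = df.iloc[0:0]  # empty dataframe with the same columns
previous_frame = None
shown_per_frame = []

for frame in [120, 140, 160]:
    window = select_window(df, previous_frame, frame)
    if not window.empty:
        # hypothetical fallback: only replace the points when there is new data
        last_shown = window
    shown_per_frame.append(last_shown["field_x"].tolist())
    previous_frame = frame

print(shown_per_frame)  # [[1.0, 2.0], [1.0, 2.0], [3.0, 4.0]]
```

The window for frame 140 contains no rows, so without the fallback nothing would be drawn for that frame; with it, the points from frame 120 stay visible until new data arrives.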
    

    It may be necessary to filter the data by play_id or player_position and keep only the last position for every play_id or player_position.

    Something like:

    frame_data = df[ df['timestamp'] <= frame ].drop_duplicates(subset=['player_position'], keep='last')
    

    Or maybe filter only the players but keep all balls:

    frame_data = df[ df['timestamp'] <= frame ]
    
    players = frame_data.drop_duplicates(subset=['player_position'], keep='last')  # no need [['field_x', 'field_y']]
    
    balls   = frame_data  # no need [['ball_position_x', 'ball_position_y']]
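
A quick way to check what drop_duplicates(..., keep='last') does here is to run it on a few hand-written rows. This is only a sketch with made-up values, and the helper name latest_positions is mine; it relies on the dataframe being sorted by timestamp, as in the code above.

```python
import pandas as pd

# Two players (1 and 2) with several recorded positions each
df = pd.DataFrame({
    "timestamp":       [100, 105, 110, 120, 130],
    "player_position": [1, 2, 1, 2, 1],
    "field_x":         [0.0, 10.0, 1.0, 11.0, 2.0],
    "field_y":         [0.0, 20.0, 1.0, 21.0, 2.0],
}).sort_values(by="timestamp")

def latest_positions(df, frame):
    """Most recent row per player_position among rows with timestamp <= frame."""
    seen = df[df["timestamp"] <= frame]
    # rows are sorted by timestamp, so keep='last' keeps the newest row per player
    return seen.drop_duplicates(subset=["player_position"], keep="last")

players = latest_positions(df, 115)
latest_x = players.set_index("player_position")["field_x"].to_dict()
print(latest_x)
```

At frame 115 player 1 has been seen at timestamps 100 and 110 and player 2 only at 105, so the result holds one row per player: the row from 110 for player 1 and the row from 105 for player 2.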