python pandas dataframe matplotlib analytics

Understanding a lambda function in football analytics code [Python]

I have been learning to code in Python in order to apply it to football data analytics.

I have asked before and all the other issues have been solved, but I don't know how the lambda function works in the code linked below.

https://stackoverflow.com/a/62039153/13621874

The issue is with this lambda function. I have tried it and it's not working, and I don't know how to fix it. Without it, the filters don't work.

Could someone please help? Here is the code:

## pass_comp: completed pass
## pass_no: unsuccessful pass

## iterating through the pass dataframe
for row_num, passed in pass_df.iterrows():   

    if passed['player_name'] == player_name:
        ## for away side
        x_loc = passed['location'][0]
        y_loc = passed['location'][1]

        pass_id = passed['id']
        summed_result = sum(breceipt_df.iloc[:, 14].apply(lambda x: pass_id in x))

        if summed_result > 0:
            ## if pass made was successful
            color = 'blue'
            label = 'Successful'
            pass_comp += 1
        else:
            ## if pass made was unsuccessful
            color = 'red'
            label = 'Unsuccessful'
            pass_no += 1

        ## plotting circle at the player's position
        shot_circle = plt.Circle((pitch_length_X - x_loc, y_loc), radius=2, color=color, label=label)
        shot_circle.set_alpha(alpha=0.2)
        ax.add_patch(shot_circle)

        ## parameters for making the arrow
        pass_x = 120 - passed['pass_end_location'][0]
        pass_y = passed['pass_end_location'][1] 
        dx = ((pitch_length_X - x_loc) - pass_x)
        dy = y_loc - pass_y

        ## making an arrow to display the pass
        pass_arrow = plt.Arrow(pitch_length_X - x_loc, y_loc, -dx, -dy, width=1, color=color)

        ## adding arrow to the plot
        ax.add_patch(pass_arrow)

Thanks in advance for your help!

Solution

The lambda function checks whether the pass_id of a pass by the selected player (Messi in the linked example) also appears in the 'related_events' column of breceipt_df. If it does, at least one row returns True, so the sum of the True values is greater than 0, which indicates a successful pass. If no row returns True, the sum is 0 and the pass is recorded as unsuccessful.
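
To see what the lambda does in isolation, here is a minimal sketch on made-up data. The IDs, the two-row frame, and the helper name count_receipts are invented for illustration; only the column name related_events comes from the StatsBomb data. The sketch selects the column by name, because a positional lookup such as iloc[:, 14] depends on the column order left after dropna and may end up pointing at a different column. That is one possible reason the original line fails, not a confirmed diagnosis.

```python
import pandas as pd

# Toy stand-in for breceipt_df; the IDs are invented for illustration.
breceipt_df = pd.DataFrame({
    "id": ["r1", "r2"],
    "related_events": [["p1", "x9"], ["p3"]],
})

def count_receipts(pass_id, receipts):
    # For each row, test whether pass_id is in that row's list of related events.
    matches = receipts["related_events"].apply(lambda events: pass_id in events)
    # True counts as 1 and False as 0, so the sum is the number of matching rows.
    return int(matches.sum())

print(count_receipts("p1", breceipt_df))  # 1 -> a receipt exists, pass completed
print(count_receipts("p2", breceipt_df))  # 0 -> no receipt, pass not completed
```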

So it's just checking whether the pass ID appears among the events related to a ball receipt. I changed it slightly: instead of using a lambda function, the code simply checks whether pass_id is in the list built from the related_events column. The column holds nested lists, so it needs to be flattened first (which I do in the code).

So try putting this in its place:

## pass_comp: completed pass
## pass_no: unsuccessful pass

## iterating through the pass dataframe
for row_num, passed in pass_df.iterrows():   

    if passed['player_name'] == player_name:
        ## for away side
        x_loc = passed['location'][0]
        y_loc = passed['location'][1]

        pass_id = passed['id']
       
        ######### ALTERED CODE ###################
        events_list = [item for sublist in breceipt_df['related_events'] for item in sublist]
        if pass_id in events_list:
            ## if pass made was successful
            color = 'blue'
            label = 'Successful'
            pass_comp += 1
        else:
            ## if pass made was unsuccessful
            color = 'red'
            label = 'Unsuccessful'
            pass_no += 1
        ########################################

        ## plotting circle at the player's position
        shot_circle = plt.Circle((pitch_length_X - x_loc, y_loc), radius=2, color=color, label=label)
        shot_circle.set_alpha(alpha=0.2)
        ax.add_patch(shot_circle)

        ## parameters for making the arrow
        pass_x = 120 - passed['pass_end_location'][0]
        pass_y = passed['pass_end_location'][1] 
        dx = ((pitch_length_X - x_loc) - pass_x)
        dy = y_loc - pass_y

        ## making an arrow to display the pass
        pass_arrow = plt.Arrow(pitch_length_X - x_loc, y_loc, -dx, -dy, width=1, color=color)

        ## adding arrow to the plot
        ax.add_patch(pass_arrow)
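
The flattening step can be tried on its own. This is a minimal sketch, assuming related_events holds one list of event IDs per row (the IDs below are made up). It also builds a set once, outside the loop, which is an optional variation on the code above: the comprehension does not depend on the current pass, so it does not need to be rebuilt on every iteration, and membership tests on a set are faster than on a list.

```python
# Stand-in for breceipt_df['related_events']: one list of event IDs per row.
# The IDs are invented for illustration.
related_events = [["p1", "x9"], ["p3"], ["p4", "p5"]]

# Same comprehension as in the answer: walk each sublist, then each item in it.
events_list = [item for sublist in related_events for item in sublist]
print(events_list)  # ['p1', 'x9', 'p3', 'p4', 'p5']

# Optional: build a set once, before the loop over passes.
received_ids = set(events_list)

for pass_id in ["p1", "p2"]:
    label = "Successful" if pass_id in received_ids else "Unsuccessful"
    print(pass_id, label)
```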

Full Code

import matplotlib.pyplot as plt
import json
from pandas import json_normalize  ## pandas >= 1.0; older versions: from pandas.io.json import json_normalize
from FCPython import createPitch

## Note Statsbomb data uses yards for their pitch dimensions
pitch_length_X = 120
pitch_width_Y = 80

## match id for our El Clasico
#match_list = [16205, 16131, 16265]
match_list = ['16157']
teamA = 'Barcelona'  #<--- adjusted here

for match_id in match_list:
    ## calling the function to create a pitch map
    ## yards is the unit for measurement and
    ## gray will be the line color of the pitch map
    (fig,ax) = createPitch(pitch_length_X, pitch_width_Y,'yards','gray') #< moved into for loop

    player_name = 'Lionel Andrés Messi Cuccittini'

    ## this is the name of our event data file for
    ## our required El Clasico
    file_name = str(match_id) + '.json'

    ## loading the required event data file
    ## Adjust path to your events folder
    ## the with-block closes the file once the data is loaded
    with open('Statsbomb/open-data-master/data/events/' + file_name, 'r', encoding='utf-8') as f:
        my_data = json.load(f)


    ## get the nested structure into a dataframe 
    ## store the dataframe in a dictionary with the match id as key
    df = json_normalize(my_data, sep='_').assign(match_id = file_name[:-5])
    teamB = [x for x in list(df['team_name'].unique()) if x != teamA ][0] #<--- get other team name

    ## making the list of all column names
    column = list(df.columns)

    ## all the type names we have in our dataframe
    all_type_name = list(df['type_name'].unique())

    ## creating a data frame for pass
    ## and then removing the null values
    ## only listing the player_name in the dataframe
    pass_df = df.loc[df['type_name'] == 'Pass', :].copy()
    pass_df.dropna(inplace=True, axis=1)
    pass_df = pass_df.loc[pass_df['player_name'] == player_name, :]

    ## creating a data frame for ball receipt
    ## removing all the null values
    ## and only listing teamA (Barcelona) players in the dataframe
    breceipt_df = df.loc[df['type_name'] == 'Ball Receipt*', :].copy()
    breceipt_df.dropna(inplace=True, axis=1)
    breceipt_df = breceipt_df.loc[breceipt_df['team_name'] == teamA, :]

    pass_comp, pass_no = 0, 0
    ## pass_comp: completed pass
    ## pass_no: unsuccessful pass
    
    ## iterating through the pass dataframe
    for row_num, passed in pass_df.iterrows():   
    
        if passed['player_name'] == player_name:
            ## for away side
            x_loc = passed['location'][0]
            y_loc = passed['location'][1]
    
            pass_id = passed['id']
           
            events_list = [item for sublist in breceipt_df['related_events'] for item in sublist]
            if pass_id in events_list:
                ## if pass made was successful
                color = 'blue'
                label = 'Successful'
                pass_comp += 1
            else:
                ## if pass made was unsuccessful
                color = 'red'
                label = 'Unsuccessful'
                pass_no += 1
    
            ## plotting circle at the player's position
            shot_circle = plt.Circle((pitch_length_X - x_loc, y_loc), radius=2, color=color, label=label)
            shot_circle.set_alpha(alpha=0.2)
            ax.add_patch(shot_circle)
    
            ## parameters for making the arrow
            pass_x = 120 - passed['pass_end_location'][0]
            pass_y = passed['pass_end_location'][1] 
            dx = ((pitch_length_X - x_loc) - pass_x)
            dy = y_loc - pass_y
    
            ## making an arrow to display the pass
            pass_arrow = plt.Arrow(pitch_length_X - x_loc, y_loc, -dx, -dy, width=1, color=color)
    
            ## adding arrow to the plot
            ax.add_patch(pass_arrow)

    ## computing pass accuracy
    pass_acc = (pass_comp / (pass_comp + pass_no)) * 100
    pass_acc = str(round(pass_acc, 2))

    ## adding text to the plot
    plt.suptitle('{} pass map vs {}'.format(player_name, teamB), fontsize=15) #<-- make dynamic and change to suptitle
    plt.title('Pass Accuracy: {}'.format(pass_acc), fontsize=15) #<-- change to title

    ## handling labels
    handles, labels = plt.gca().get_legend_handles_labels()
    by_label = dict(zip(labels, handles))
    plt.legend(by_label.values(), by_label.keys(), loc='best', bbox_to_anchor=(0.9, 1, 0, 0), fontsize=12)

    ## editing the figure size and saving it
    fig.set_size_inches(12, 8)
    fig.savefig('{} passmap.png'.format(match_id), dpi=200)  #<-- dynamic file name

    ## showing the plot
    plt.show()
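
One edge case in the accuracy calculation: if the player has no passes in a match, pass_comp + pass_no is zero and the division raises ZeroDivisionError. A small guard avoids that. It is sketched here as a hypothetical helper; pass_accuracy is not part of the original code.

```python
def pass_accuracy(completed, missed):
    """Return pass accuracy as a percentage rounded to 2 decimals,
    or None if there were no passes."""
    total = completed + missed
    if total == 0:
        return None  # no passes recorded, so accuracy is undefined
    return round(completed / total * 100, 2)

print(pass_accuracy(45, 5))  # 90.0
print(pass_accuracy(0, 0))   # None
```

In the full code this would replace the two pass_acc lines, with the plot title falling back to something like 'n/a' when the helper returns None.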