Search code examples
pythonpandasdataframecomparison

Find and compare the last and the row before in a group it if it matches the criteria in python


I have a dataframe that contains game data. For this game 2 players play a 90 minute game. However the game can last longer than 90 minutes for various reasons. What I want is to find games that player 1 won after the 90th minute mark. So I'd like the compare the end score for a game that: - lasted more than 90 minutes, - Player 1 won with the previous score for the time<90 minutes.

games dataframe
(source: imggmi.com)

  # Games dataframe
  games = pd.DataFrame({'game_id': {0: 1,1: 1,2: 1,3: 2,4: 2,5: 3,6: 3,7: 
                     3,8: 4,9: 4,10: 4,11: 5,12: 5,13: 5},
                    'time': {0: 1,1: 45,2: 95,3: 56,4: 80,5: 1,6: 95,7: 
                     95,8: 96,9: 107,10: 108,11: 15,12: 95,13: 97},
                    'player 1': {0: 1,1: 1,2: 2,3: 1,4: 1,5: 0,6: 1,7: 2,8: 
                     0,9: 1,10: 2,11: 1,12: 1,13: 1},
                    'player 2': {0: 0,1: 1,2: 1,3: 0,4: 1,5: 1,6: 1,7: 1,8: 
                     1,9: 1,10: 1,11: 0,12: 1,13: 2}})

  # Find the rows with the ending scores of games
  a = games.drop_duplicates(["game_id"],keep='last')

  #Find games that player 1 wins and time>90
  b = games[((games["player 1"] - games["player 2"]) >0) (games["time"]>90)]

For example for game 1 Player 1 won at 95th minute before that the scores were even. Overall if the situation was a tie between 2 players before 90th minute or Player 1 was losing but after 90 minute mark the final situation is that Player 1 won. How can I filter this case ?


Solution

  • Edit: How about this?

    endScores = games.loc[(games['time'] > 90) & (games['player 1'] > games['player 2'])].groupby('game_id').nth(-1)
    beforeScores = games.loc[(games['time'] <= 90) & (games['player 1'] <= games['player 2'])].groupby('game_id').nth(-1)
    compareGames = beforeScores.join(endScores, rsuffix='_end').dropna()