Search code examples
pythonsqlalchemybulk

SQLAlchemy Bulk Select to find existing records


I'm playing with some sports data and have a dictionary of matches whose keys are unique identifiers of the match (date, home team id, away team id) (all cols of the matches table). I want to query the matches table in bulk and find all the keys from the dict that already are in the database. I want to do this, but for all the keys in the dict.

[(date, home_team, away_team), (date, home_team, away_team), ... N] = matches.keys()

*** only one match
Session.query(Match).filter_by(date=date, home_team=home_team, away_team=away_team)

edit* So, as suggested I tried the following query:

ins_matches = Session.query(Match).filter(tuple_(Match.date, Match.home_team, Match.away_team
                    ).in_(unq_keys)).all()

unq_keys is a list of tuples with (date, home_team_id, away_team_id)

ProgrammingError: (psycopg2.ProgrammingError) operator does not exist: boolean = integer
LINE 3: ...es.home_team_id, teams.id = matches.away_team_id) IN (('2014...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
 [SQL: 'SELECT matches.id AS matches_id, matches.date AS matches_date, matches.time AS matches_time, matches.league_id AS matches_league_id, matches.type AS matches_type, matches.home_team_id AS matches_home_team_id, matches.away_team_id AS matches_away_team_id, matches.stadium_id AS matches_stadium_id, matches.attendance AS matches_attendance, matches.duration AS matches_duration, matches.home_team_odds AS matches_home_team_odds, matches.away_team_odds AS matches_away_team_odds, matches.result AS matches_result \nFROM matches, teams \nWHERE (matches.date, teams.id = matches.home_team_id, teams.id = matches.away_team_id) IN ((%(param_1)s, %(param_2)s, %(param_3)s), (%(param_4)s, %(param_5)s, %(param_6)s), (%(param_7)s, %(param_8)s, %(param_9)s), (%(param_10)s, %(param_11)s, %(param_12)s), (%(param_13)s, %(param_14)s, %(param_15)s))'] [parameters: {'param_15': 17, 'param_11': 14, 'param_6': 18, 'param_5': 12, 'param_1': '2014-10-29', 'param_8': 11, 'param_4': '2014-10-28', 'param_13': '2014-10-28', 'param_7': '2014-10-28', 'param_14': 19, 'param_9': 20, 'param_10': '2014-10-29', 'param_2': 16, 'param_3': 15, 'param_12': 13}]

Solution

  • You can use a tuple_ expression combined with in_:

    session.query(Match).filter(tuple_(Match.date, Match.home_team, Match.away_team).in_(list(matches.keys())))
    

    Note that this depends on RDBMS support for composite IN constructs.