Search code examples
sqlpostgresqlpsycopg2python-3.7postgresql-12

Select rows from database using varying multiple WHERE conditions


I have a map with lots drawn on it, when a person select one or more lots I want to grab the information of those lots from the database and return it. The lots are identified by IDs like the ones in "lots_list". At the moment I'm using a for loop to iterate over the list and fetch the data, passing the ID to a where clause with a placeholder, but the execution is fairly slow this way.

def getLotInfo(lots_list):

    lots = []
    for lot in lots_list:
        try:
            connection = psycopg2.connect(user=user,
                                          password=password,
                                          host=host,
                                          port=port,
                                          database=database)
            cursor = connection.cursor()

            Psql_Query = '''SELECT setor, quadra, lote, area_ocupada FROM iptu_sql_completo WHERE sql 
                            LIKE %s'''

            cursor.execute(Psql_Query, (lot,))
            lots.append(cursor.fetchone())
            print(lots)

        except (Exception, psycopg2.Error) as error:
            print("Error fetching data from PostgreSQL table", error)

        finally:
            # closing database connection.
            if (connection):
                cursor.close()
                connection.close()
                print("PostgreSQL connection is closed")

    return lots


lots_list = ["0830480002", "0830480003", "0830480004"]

Lots = getLotInfo(lots_list)

I tried to use psycopg2 execute_batch command

        Psql_Query = '''SELECT setor, quadra, lote, area_ocupada FROM 
                        iptu_sql_completo WHERE sql LIKE %s'''

        ppgextra.execute_batch(cursor, Psql_Query, SQLs)

        lots.append(cursor.fetchall())
        print(lots)

but I get the following error "not all arguments converted during string formatting" I am imagining that's because I should use a placeholder in the query for every item in the list, but if the list is ever changing in size, would there be a way to fix this? The IDs are not always sequential.

My question is: Is there a way to achieve better performance than using the for loop?


Solution

  • Your current code is pretty much the worst case I had in mind here:

    Maurice already mentioned the repeated connection overhead. But even with a single connection, this is far from ideal. Instead, run a single query and pass the whole list lots_list as Postgres array:

    SELECT setor, quadra, lote, area_ocupada
    FROM   iptu_sql_completo
    WHERE  sql = ANY (%s);
    

    Related: