Search code examples
pythonpymysql

python: how to query multiple mysql databases using pandas.read_sql_query


As mentioned above I need to query several databases with all the same schema using pandas.read_sql_query. I've tried to create loop over all the dbs and run on the fly the sql statement that would result in something like this:

USE db_test_1; SELECT * from test

That's what I've done so far:

cursor = conn.cursor()
cursor.execute("SHOW DATABASES LIKE '%test_%'")
cursor.close()
dbs = [v.replace("'", "") for (v, ) in cursor]

for db in dbs[:100]:
    temp = "USE " + db + ";"
    fd = open('my_query.sql')
    query = fd.read()
    fd.close
    sql = temp + query
    data = pd.read_sql_query(sql, conn)
    print(data)

Gives an error saying that the mysql syntax is wrong. Do you have any idea how to handle it or point me to the error?

Many thanks


Solution

  • Your problem lies with your my_query.sql file.

    SELECT (SELECT * from tab1), (SELECT * from tab2)
    

    The above is not valid SQL; a subselect can only return a single column. To do this, you would need to join the two subselects in the FROM clause. Which columns you do this on will be entirely dependent on your schema and the needed relation.

    Update:

    Okay, so the problem here seems to be more about how you're dealing with the query. Your cursor object is connected to a single database. Not the entire database server.

    That means that your cursor object cannot use the use keyword here. You need to create a new connection and cursor object for each database you want to connect to.