What is the best way to sanitize a SQL to prevent injection when using python? I'm using mysql-connector. I have read that I should use a structure similar to:
import mysql.connector
connection = mysql.connector.connect(host="", port="", user="", password="", database="")
cursor = connection.cursor( buffered = True )
sql = "INSERT INTO mytable (column1, column2) VALUES (%s, %s)"
val = (myvalue1, myvalue2)
cursor.execute(sql, val)
connection.commit()
However, I don't understand why this can prevent an injection. Is this sufficient? A user could introduce me anything on myvalue1 or myvalue2, even if it is not suposed to. Is there any useful library?
SQL injection works when untrusted input is interpolated into an SQL query and the input contains characters that change the syntax of the query.
Query parameters are kept separate from the SQL query, never interpolated into it. The values of the parameters are combined with the SQL query after it is parsed, so there is no longer any opportunity to change the syntax. The parameter is guaranteed to be treated as a single scalar value (i.e. as if it's just a string literal in an SQL expression).
This is the way the Python connector works if you use the MySQLCursorPrepared
cursor subclass. See https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursorprepared.html
Otherwise, the Python connector "simulates" prepared queries. It actually does interpolate parameters into the SQL query before it is parsed, but it does so safely, by escaping special characters that would cause SQL injection. It is well-tested so it's reliable.
Both cursor types are used the same way, passing an SQL query string with %s
placeholders, and another argument with a tuple of parameter values. You are using it correctly.
Re comment from @Learningfrommasters:
Yes, a string stored in your database can be used unsafely in another SQL query, and cause SQL injection. Some people think that only user input must be treated safely, but this is not true. Any variable should be treated as a query parameter, whether the value for that variable comes from user input, or read from a file, or even pulled out of your own database.
Example: Suppose my name is Bill O'Karwin. It has an apostrophe in it, which you know is a special character to SQL because it terminates a string literal.
If my name were stored in the database and then fetched into an application into a variable userlastname
, then I could search for other people with the same last name:
sql = f"SELECT * FROM Users WHERE lastname = '{userlastname}'"
That is unsafe because the apostrophe would cause SQL injection. Even though the value didn't come directly from user input, it came from my own database.
So use parameters for all variables. Then you don't have to think about whether the source is safe or not.
sql = "SELECT * FROM Users WHERE lastname = %s"
cur.execute(sql, (userlastname,))