Search code examples
pythonsqlitesql-injectionsqlcipher

How do you safely pass values to SQLite PRAGMA statements in Python?


I'm currently writing an application in Python that stores its data in a SQLite database. I want the database file to be stored encrypted on disk, and I found the most common solution for doing this to be SQLCipher. I added sqlcipher3 to my project to provide the DB-API, and got started. With SQLCipher, the database encryption key is provided in the form of a PRAGMA statement which must be provided before the first operation on the database is executed.

PRAGMA key='hunter2';  -- like this

When my program runs, it prompts the user for the database password. My concern is that since this is a source of user input, it's potentially vulnerable to SQL injection. For example, a naive way to provide the key might look something like this:

from getpass import getpass
import sqlcipher3

con = sqlcipher3.connect(':memory:')
cur = con.cursor()
password = getpass('Password: ')
cur.execute(f"PRAGMA key='{password}';")

### do stuff with the unencrypted database here

If someone was to enter something like "hunter2'; DROP TABLE secrets;--" into the password prompt, the resulting SQL statement would look like this after substitution:

PRAGMA key='hunter2'; DROP TABLE secrets;--';

Typically, the solution to this problem is to use the DB-API's parameter substitution. From the sqlite3 documentation:

An SQL statement may use one of two kinds of placeholders: question marks (qmark style) or named placeholders (named style). For the qmark style, parameters must be a sequence whose length must match the number of placeholders, or a ProgrammingError is raised. For the named style, parameters must be an instance of a dict (or a subclass), which must contain keys for all named parameters; any extra items are ignored. Here’s an example of both styles:

con = sqlite3.connect(":memory:")
cur = con.execute("CREATE TABLE lang(name, first_appeared)")

# This is the named style used with executemany():
data = (
    {"name": "C", "year": 1972},
    {"name": "Fortran", "year": 1957},
    {"name": "Python", "year": 1991},
    {"name": "Go", "year": 2009},
)
cur.executemany("INSERT INTO lang VALUES(:name, :year)", data)

# This is the qmark style used in a SELECT query:
params = (1972,)
cur.execute("SELECT * FROM lang WHERE first_appeared = ?", params)
print(cur.fetchall())

This works as expected in the sample code from the docs, but when using placeholders in a PRAGMA statement, we get an OperationalError telling us there's a syntax error. This is the case for both types of parameter substitution.

# these will both fail

cur.execute('PRAGMA key=?;', (password,))
cur.execute('PRAGMA key=:pass;', {'pass': password})

I'm not sure where to go from here. If we actually enter our malicious string at the password prompt, it won't work, producing the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlcipher3.ProgrammingError: You can only execute one statement at a time.

So is the "naive" code from earlier safe? I'm not confident saying the answer is "yes" just because the one malicious string I could come up with didn't work, but there doesn't seem to be a better way of doing this. The answers to the only other person on here asking this question that I could find suggested equivalent solutions (python + sqlite insert variable into PRAGMA statement). I'd also rather not use an ORM, especially if it's just for this one case. Any suggestions would be appreciated, thanks.


Solution

  • According to the accepted answer to “Python sqlite3 string variable in execute”, there are limitations on where DB-API substitutions can be used:

    Parameter markers can be used only for expressions, i.e., values. You cannot use them for identifiers like table and column names.

    Seeing this, I figured that arguments to PRAGMA must fall into the same category as “table and column names”. In fact, my specific use case was PRAGMA table_info, where the argument is a table name.

    On digging into it further, I found that Python’s sqlite3 module relies on SQLite’s own sqlite3_bind_* functions to do parameter substitutions. For example, here is the code for substituting string values. And I found further confirmation that substitution won’t work for PRAGMA arguments.

    “But wait,” I thought. “Sam’s argument is a key, not a table name.” Without digging even deeper, I can only conjecture that it doesn’t matter, and SQLite (or SQLCipher) just doesn’t allow binding values to PRAGMA statements.

    Maybe you can supply the key via SQLCipher’s C API instead of through SQL? It doesn’t fix my use case, but it might help with yours!

    For me, and for anyone else trying to programmatically provide a table name to PRAGMA table_info, I guess the official solution is to double- and triple-check that the variable cannot possibly contain user input, validate and escape it anyway just in case, cross fingers, toes, knees and nose, and do a string substitution! What could possibly go wrong…