Search code examples
phpmysqlsqlsql-injection

Is there a better way to combat SQL Injection?


I've watched Computerphile's video many times on this subject(for any of you who want, this is the link: https://www.youtube.com/watch?v=_jKylhJtPmI). He provides some really good advice on how to combat SQL Injection and make your app more effective. These are the key points from his video:

  1. Don't use straight and unprotected SQL commands because this is the way hackers can perform a SQL Injection, stealing, modifying, or even deleting your data.
  2. A good approach is to use the mysql_real_escape_string(String s) function. This basically places on the start of every dangerous character (/,", {, }, etc) a slash (/). So basically this makes the quote or slash inside the string useless.
  3. The best thing to do is to use prepared statements. So, you basically say:

    SELECT * FROM USERS WHERE username = ?
    

    Later you go and replace the question mark with the string you want to input as the user name. This has the advantage of not confusing PHP or any other fault-tolerant language, and using this simple and (kind of, hacky) elegant solution to just say replace this with the string and tell the language that what is given is just a string and nothing more than that.

That is good and all, but this video is really outdated. It was made way back in 2013 and since then a lot of new technology has emerged. So, I tried to search the internet to find if there were any new approaches or if this is the one. But the problem was that either I couldn't find it or either I found something that was super confusing.

So, my question is: Is there a better and enhanced way to combat SQL Injection that has been introduced, or if prepared statements are still the norm and if they are vulnerable to any kind of attack?


Solution

  • Parameter binding is still the best solution in most examples of combining dynamic data with an SQL query.

    You should understand why. It's NOT just doing a string substitution for you. You could do that yourself.

    It works because it separates the dynamic value from the SQL-parsing step. The RDBMS parses the SQL syntax during prepare():

    $stmt = $pdo->prepare("SELECT * FROM USERS WHERE username = ?");
    

    After this point, the RDBMS knows that the ? must only be a single scalar value. Not anything else. Not a list of values, not a column name, not an expression, not a subquery, not a UNION to a second SELECT query, etc.

    Then you send the value to be bound to that placeholder in the execute step.

    $stmt->execute( [ "taraiordanov" ] );
    

    The value is sent to the RDBMS server, and it takes its place in the query but only as a value and then the query can be executed.

    This is allows you to execute the query multiple times with different values plugged in. Even though the SQL parser only needed to parse the query once. It remembers how to plug a new value into the original prepared SQL query, so you can execute() as many times as you want:

    $stmt->execute( [ "hpotter" ] );
    $stmt->execute( [ "hgranger" ] );
    $stmt->execute( [ "rweasley" ] );
    ...
    

    Are prepared statements the best? Yes, they are. It doesn't matter that the advice comes from 2013, it's still true. Actually, this feature about SQL dates back a lot further than that.

    So are query parameters the foolproof way of defending against SQL injection? Yes they are, if you need to combine a variable as a value in SQL. That is, you intend for the parameter to substitute in your query where you would otherwise use a quoted string literal, a quoted date literal, or a numeric literal.

    But there are other things you might need to do with queries too. Sometimes you need to build an SQL query piece by piece based on conditions in your application. Like what if you want to do a search for username but sometimes also add a term to your search for last_login date? A parameter can't add a whole new term to the search.

    This isn't allowed:

    $OTHER_TERMS = "and last_login > '2019-04-01'";
    $stmt = $pdo->prepare("SELECT * FROM USERS WHERE username = ? ?");
    $stmt->execute( [ "taraiordanov", $OTHER_TERMS ] ); // DOES NOT WORK
    

    What if you want to allow the user to request sorting a result, and you want to let the user choose which column to sort by, and whether to sort ascending or descending?

    $stmt = $pdo->prepare("SELECT * FROM USERS WHERE username = ? ORDER BY ? ?");
    $stmt->execute( [ "taraiordanov", "last_login", "DESC" ] ); // DOES NOT WORK
    

    In these cases, you must put the column names and syntax for query terms into your SQL string before prepare(). You just have to be extra careful not to let untrusted input contaminate the dynamic parts you put in the query. That is, make sure it's based on string values you have complete control over in your code, not anything from outside the app, like user input or a file or the result of calling an API.


    Re comments:

    The idea Martin is adding is sometimes called whitelisting. I'll write out Martin's example in a more readable manner:

    switch ($_GET['order']) {
    case "desc": 
      $sqlOrder = "DESC"; 
      break; 
    default: 
      $sqlOrder = "ASC"; 
      break;
    }
    

    I replaced Martin's case "asc" with default because then if the user input is anything else -- even something malicious -- the only thing that can happen is that any other input will default to SQL order ASC.

    This means there are only two possible outcomes, ASC or DESC. Once your code has complete control over the possible values, and you know both values are safe, then you can interpolate the value into your SQL query.

    In short: always keep in your mind an assumption that $_GET and $_POST may contain malicious content. It's easy for a client to put anything they want into the request. They are not limited by the values in your HTML form.

    Write your code defensively with that assumption in mind.


    Another tip: Many people think that client input in $_GET and $_POST are the only inputs you need to protect against. This is not true! Any source of input can contain problematic content. Reading a file and using that in your SQL query, or calling an API, for example.

    Even data that has previously been inserted in your database safely can introduce SQL injection if you use it wrong.