False MIME type detection in PHP


I have this code:

<?php
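    // $_POST is always set, so check that a file actually arrived without errors.
    if (isset($_FILES["file"]) && $_FILES["file"]["error"] === UPLOAD_ERR_OK) {
        // Guess the MIME type from the file's content, not from its name.
        $finfo = finfo_open(FILEINFO_MIME_TYPE);
        $fileMime = finfo_file($finfo, $_FILES["file"]["tmp_name"]);
        finfo_close($finfo);
        echo $fileMime;
    }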
?>
<form method="post" enctype="multipart/form-data">
    <input type="file" name="file">
    <button type="submit">Upload</button>
</form>

In some cases, people have managed to fake the MIME type: a .php file sent as application/octet-stream was detected by the script as image/jpeg.

Is there any way to fix this and detect the correct MIME type?

Thanks!


Solution

  • it was a .php file

    Microsoft have you well conditioned. The file name is not determined by the content of the file. It's just a convention that people often use the former to indicate the latter. The same convention is used within most webservers to map handlers to files. Hence, if you are daft enough to trust the file name supplied by your users and store the files within the document root, then users can upload arbitrary scripts to your server and have the webserver execute them.
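
A minimal sketch of the point above, assuming the application only wants JPEG and PNG uploads: the stored name is chosen by the server from the detected type, and the client-supplied name is never used. The `STORED_EXTENSIONS` map, the `storedFileName()` helper and the upload directory are hypothetical names introduced for illustration.

```php
<?php
// Hypothetical mapping from accepted MIME types to the extension we store under.
const STORED_EXTENSIONS = ['image/jpeg' => 'jpg', 'image/png' => 'png'];

// Builds a server-chosen file name; the client-supplied name is never used,
// so an uploaded "shell.php" can never end up with a .php extension.
function storedFileName(string $detectedMime): ?string
{
    if (!isset(STORED_EXTENSIONS[$detectedMime])) {
        return null; // type not accepted
    }
    return bin2hex(random_bytes(16)) . '.' . STORED_EXTENSIONS[$detectedMime];
}

// Hypothetical directory outside the document root (an assumption, adjust to taste).
$uploadDir = '/var/app-data/uploads';
$name = storedFileName('image/jpeg');
// In a real handler:
// move_uploaded_file($_FILES['file']['tmp_name'], $uploadDir . '/' . $name);
```

Keeping the directory outside the document root means the webserver has no URL that maps to the file, so no handler is ever chosen for it.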

    As Marc B says, finfo_file() reads the file and tries to match a variable number of bytes at the beginning of the file against a database of known file formats. It looks neither at the file name nor at the Content-Type header sent by the browser.
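
The matching described above can be sketched in a few lines. This is a deliberately simplified stand-in, not how libmagic is actually implemented: `SIGNATURES` and `sniffMime()` are hypothetical names, and the real database has far richer rules. It does show why a PHP script with a JPEG-looking prefix is reported as an image.

```php
<?php
// Simplified stand-in for a magic database: leading bytes => MIME type.
const SIGNATURES = [
    "\xFF\xD8\xFF"      => 'image/jpeg',
    "\x89PNG\r\n\x1A\n" => 'image/png',
    "GIF87a"            => 'image/gif',
    "GIF89a"            => 'image/gif',
];

// Returns the MIME type of the first signature that prefixes $bytes.
function sniffMime(string $bytes): string
{
    foreach (SIGNATURES as $magic => $mime) {
        if (strncmp($bytes, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return 'application/octet-stream'; // nothing matched
}

// A JPEG-looking prefix followed by PHP source.
$polyglot = "\xFF\xD8\xFF\xE0" . "\x00\x10JFIF\x00" . "<?php echo 'hello'; ?>";

echo sniffMime($polyglot), "\n";                 // image/jpeg
echo sniffMime("<?php echo 'hello'; ?>"), "\n";  // application/octet-stream
```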

    I would guess that this file probably starts off looking like a JPEG.

    Once the webserver has decided to push the content through the PHP handler, PHP will pass everything outside the <?php ... ?> tags straight through as output and execute everything within them.

    Have a look at the file in a hex editor and see what you find.
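
If no hex editor is to hand, the same inspection can be roughed out in PHP. This is a sketch only; `hexHead()` and `containsPhpTag()` are hypothetical helper names, and the sample file is built in place purely for demonstration.

```php
<?php
// Returns the first $length bytes of a file as space-separated hex pairs.
function hexHead(string $path, int $length = 16): string
{
    $bytes = (string) file_get_contents($path, false, null, 0, $length);
    return implode(' ', str_split(bin2hex($bytes), 2));
}

// Reports whether a PHP opening tag appears anywhere in the file.
function containsPhpTag(string $path): bool
{
    $data = (string) file_get_contents($path);
    return stripos($data, '<?php') !== false || strpos($data, '<?=') !== false;
}

// Demonstration with a throwaway file: JPEG magic bytes, then PHP source.
$tmp = tempnam(sys_get_temp_dir(), 'upl');
file_put_contents($tmp, "\xFF\xD8\xFF\xE0<?php echo 1; ?>");
echo hexHead($tmp, 4), "\n";     // ff d8 ff e0
var_dump(containsPhpTag($tmp));  // bool(true)
unlink($tmp);
```

Scanning for an opening tag is a diagnostic aid, not a defence: a match can occur by chance in binary data, and a missing match does not make a file safe to execute.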

    Is there any way to fix this and detect the correct MIME type?

    To what end? finfo_file() provides a best guess. If you want to ensure the file is of the expected format, then you need code specifically designed to recognise and process that format (e.g. if the user asserts the file is a JPEG, load it with imagecreatefromjpeg()). If your system depends on file extensions mapping to a specific MIME type in uploaded content, reject the content if the two do not match. As Marc B has already pointed out, most file formats are composite structures which can contain just about anything (even an ASCII text file can have embedded PHP), i.e. a file can be both a valid JPEG and a valid PHP script.
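
The two checks suggested above might look roughly like this. It is a sketch under assumptions: only JPEG is accepted, the GD extension may or may not be installed, and `ALLOWED_TYPES`, `extensionMatchesMime()` and `reencodeJpeg()` are hypothetical names. Re-encoding through GD goes one step beyond merely loading the file: the stored copy is written from the decoded pixels rather than from the uploaded bytes.

```php
<?php
// Hypothetical allow-list: extension => the only MIME type accepted for it.
const ALLOWED_TYPES = ['jpg' => 'image/jpeg', 'jpeg' => 'image/jpeg'];

// True only if the client's extension is allowed AND agrees with the detected type.
function extensionMatchesMime(string $clientName, string $detectedMime): bool
{
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    return isset(ALLOWED_TYPES[$ext]) && ALLOWED_TYPES[$ext] === $detectedMime;
}

// Decodes the upload as a JPEG and writes a fresh copy from the pixel data.
// Returns false if GD is missing or the file does not decode as a JPEG.
function reencodeJpeg(string $srcPath, string $destPath): bool
{
    if (!function_exists('imagecreatefromjpeg')) {
        return false; // GD extension not available
    }
    $image = @imagecreatefromjpeg($srcPath); // false when decoding fails
    if ($image === false) {
        return false;
    }
    return imagejpeg($image, $destPath, 90);
}

var_dump(extensionMatchesMime('photo.JPG', 'image/jpeg')); // bool(true)
var_dump(extensionMatchesMime('shell.php', 'image/jpeg')); // bool(false)
```

Neither check is a guarantee on its own. A file that decodes cleanly can still carry a payload, so the result is best combined with a server-chosen file name and a storage location that the webserver never hands to the PHP handler.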