ruby-on-rails, security, mime-types

MIME type spoofing detection


I am working on a content type spoof detector for a web application. Any developer with experience in this area should be able to answer my question.

My input is an object that exposes its filename, content_type, and io. The object's content_type is determined by a library called Marcel, which picks the most specific MIME type it can guess from the io, the filename, and the file extension.

The issue is that, with Marcel used this way, the content_type can be spoofed (which is why I am building this detector). A spoofed file with text/plain content, but a declared image/jpeg content_type and a .jpg extension, will still be reported as image/jpeg.

To solve this, I analyze the object's io with the Linux file command to determine the 'real' content_type. But there is a problem with this approach: the file command sometimes returns a content_type that is less precise than the one the object provides, or that is an alias for it.
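A minimal sketch of this sniffing step, assuming the `file` binary is installed and supports the `--brief` and `--mime-type` flags. The helper name `sniff_mime_type` is hypothetical, and copying the io to a temporary file is just one way to hand the content to the command:

```ruby
require "open3"
require "tempfile"

# Hypothetical helper: asks the `file` command for the MIME type of an IO's content.
# `--brief` omits the filename from the output, `--mime-type` prints only the type.
# Returns nil when the command exits with a non-zero status.
def sniff_mime_type(io)
  io.rewind if io.respond_to?(:rewind)
  Tempfile.create("sniff") do |tmp|
    tmp.binmode
    IO.copy_stream(io, tmp)
    tmp.flush
    # Array-style arguments avoid shell interpolation of the path.
    out, status = Open3.capture2("file", "--brief", "--mime-type", tmp.path)
    status.success? ? out.strip : nil
  end
ensure
  # Leave the io where other consumers expect it.
  io.rewind if io.respond_to?(:rewind)
end
```

Because only the content is written to the temporary file, the client-supplied filename and extension cannot influence the result.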

For example, for .wmv files, Marcel (using the io + filename + extension) is able to determine a video/x-ms-wmv content_type, whereas the file command returns video/x-ms-asf, which is a kind of parent of video/x-ms-wmv. Second example: for .avi files, Marcel returns video/vnd.avi whereas the file command returns video/x-msvideo, which is an alias for that content_type. In both cases the content_types are not equal, but both could be deemed 'valid' pairs.

The thing is, doing things this way, I need some kind of mapping of these value pairs. What I am asking Stack Overflow is: has this content_type mapping already been built somewhere? If not, does anyone know whether building it is a complex task? I guess so, since there are thousands of content_types nowadays...

Depending on your answers, I might switch to a less precise method that only compares the top-level type (i.e. image/video/application/...) rather than the whole MIME type. This might be enough: if the client is expected to send a .jpg, receiving a .png is not much of an issue, whereas the detector would still block .exe files, since their type is application and not image.
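A rough sketch of that fallback, comparing only the part before the slash. Both method names are hypothetical, and the inputs are assumed to be plain MIME strings, optionally followed by parameters such as a charset:

```ruby
# Top-level media type ("image" in "image/png").
# Parameters such as "; charset=utf-8" are ignored; returns nil for blank input.
def top_level_type(mime)
  mime.to_s.split(";").first.to_s.strip.downcase.split("/").first
end

# True when both types share the same non-empty top-level type.
def same_top_level_type?(declared, detected)
  a = top_level_type(declared)
  b = top_level_type(detected)
  !a.nil? && !a.empty? && a == b
end

same_top_level_type?("image/jpeg", "image/png")            #=> true
same_top_level_type?("image/jpeg", "application/x-dosexec") #=> false
```

The trade-off is that any legitimate pair that crosses top-level types would be rejected, so a short list of exceptions may still be needed.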

If anyone has experience with this kind of problem, please let me know.


Solution

  • Marcel already includes many of these type mappings. For instance:

    require 'marcel'

    ext = 'wmv'
    # Types that Marcel registers for this extension
    types_by_extension = Marcel::TYPE_EXTS.filter_map { |type, exts| type if exts.include?(ext) }
    #=> ["video/x-ms-wmv"]
    # Append the parent types of each match
    types_by_extension.concat(
      *types_by_extension.filter_map { |type| Marcel::TYPE_PARENTS[type] }
    )
    #=> ["video/x-ms-wmv", "video/x-ms-asf"]
    

    If you have more types you'd like to add, you can use the interface provided by Marcel::MimeType.

    The signature for Marcel::MimeType.extend is:

    extend(type, extensions: [], parents: [], magic: nil)
    

    So, for instance, when I run the above with 'avi' I only receive ["video/x-msvideo"]. To add 'video/vnd.avi', a simple extension-only option would be:

    Marcel::MimeType.extend "video/vnd.avi", extensions: %w(avi)
    

    Or possibly even using the "magic" parameter:

    Marcel::MimeType.extend "video/vnd.avi", extensions: %w(avi), magic: [[0, "RIFF", [[8, "AVI LIST"]]]]
    

    Out of the box MimeType definitions can be found Here, Here and Here
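Putting this together, one possible shape for the "valid pair" check is sketched below. It is not part of Marcel: `ancestors_of`, `compatible_types?` and the two `SAMPLE_` tables are hypothetical, and the tables only contain the entries discussed here so the snippet runs without the gem. In a real application you would presumably pass `Marcel::TYPE_PARENTS` and `Marcel::TYPE_EXTS` instead, assuming they keep the type => array shape used above.

```ruby
# Illustrative tables in the same shape as Marcel::TYPE_PARENTS and
# Marcel::TYPE_EXTS; only the entries discussed in this thread are listed.
SAMPLE_PARENTS = {
  "video/x-ms-wmv" => ["video/x-ms-asf"]
}.freeze

SAMPLE_EXTS = {
  "video/x-ms-wmv"  => ["wmv"],
  "video/x-msvideo" => ["avi"],
  "video/vnd.avi"   => ["avi"] # as if registered through MimeType.extend
}.freeze

# All ancestors of a type, following parent links transitively.
def ancestors_of(type, parents)
  seen = []
  queue = Array(parents[type]).dup
  until queue.empty?
    current = queue.shift
    next if seen.include?(current)
    seen << current
    queue.concat(Array(parents[current]))
  end
  seen
end

# True when the types are equal, one descends from the other,
# or both are registered for at least one common extension (alias-like pairs).
def compatible_types?(declared, detected, parents: SAMPLE_PARENTS, exts: SAMPLE_EXTS)
  return true if declared == detected
  return true if ancestors_of(declared, parents).include?(detected)
  return true if ancestors_of(detected, parents).include?(declared)
  (Array(exts[declared]) & Array(exts[detected])).any?
end

compatible_types?("video/x-ms-wmv", "video/x-ms-asf") #=> true (parent)
compatible_types?("video/vnd.avi", "video/x-msvideo") #=> true (shared extension)
compatible_types?("image/jpeg", "text/plain")         #=> false
```

One caveat with the parent rule: a very generic ancestor (a container format, for example) would accept every type that descends from it, so it may be worth excluding such ancestors explicitly.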