Tags: c++, ruby, bash, content-type, mime

How do I detect plaintext in a MIME file?


I have a large set of MIME files, which contain multiple parts. Many of the files contain parts labelled with the following headers:

Content-Type: application/octet-stream

Content-Transfer-Encoding: Binary

However, sometimes the contents of these parts are some form of binary code, and sometimes they are plaintext.

Is there a clever way in C++, Bash, or Ruby to detect whether the contents of a MIME part labelled as application/octet-stream are binary data or plaintext?


Solution

  • The simplest method is to split each MIME file into separate files, one per component part. Tools such as grep can then be run on each part to determine whether it contains plaintext or binary data.
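
Once the parts have been split out, the text-or-binary check itself could be sketched in Ruby roughly as follows. This is a heuristic, not grep itself and not a definitive test: it assumes that plaintext contains no NUL bytes, is valid UTF-8, and has few control characters. The method name `plaintext?` and the 5% threshold are illustrative choices, not part of any library.

```ruby
# Heuristic check: does this byte string look like plaintext?
# Assumptions (illustrative, tune for your data):
#   - plaintext never contains NUL bytes
#   - plaintext is valid UTF-8 (which includes plain ASCII)
#   - at most max_control_ratio of the characters are control characters
def plaintext?(bytes, max_control_ratio: 0.05)
  # An empty part is treated as text here; this is an arbitrary choice.
  return true if bytes.empty?
  return false if bytes.include?("\x00")

  # Reinterpret a copy of the raw bytes as UTF-8 and check that they decode.
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  return false unless text.valid_encoding?

  # Count control characters, excluding tab, LF, FF, and CR.
  control = text.count("\x01-\x08\x0B\x0E-\x1F\x7F")
  control.to_f / text.length <= max_control_ratio
end
```

For a part saved to disk, the bytes can be read with `File.binread`, for example `plaintext?(File.binread("part-001.bin"))`, where `part-001.bin` is a hypothetical file name. Text in legacy encodings such as Latin-1 can fail the UTF-8 check, so the rules may need loosening for such data.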