I have a large set of MIME files, which contain multiple parts. Many of the files contain parts labelled with the following headers:
Content-Type: application/octet-stream
Content-Transfer-Encoding: Binary
However, sometimes the contents of these parts are some form of binary code, and sometimes they are plaintext.
Is there a clever way in C++, Bash or Ruby to detect whether the contents of a MIME part labelled as application/octet-stream are binary data or plaintext?
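The simplest method is to split each MIME file into separate files, one per part. You can then run standard tools on each extracted part to decide whether it is text. For example, file reports a guessed type for its input, and GNU grep with the -I option treats a file that looks binary as containing no matches.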
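If you would rather do the check in Ruby once a part has been extracted, a rough heuristic along the following lines may be enough. This is only a sketch, not a definitive test: the method name probably_text? and the 5% control-character threshold are my own assumptions, and it assumes any text parts are ASCII or UTF-8. UTF-16 text contains NUL bytes, so it would be classified as binary here.

```ruby
# Heuristic sketch: treat a byte string as text when it has no NUL bytes,
# is valid UTF-8, and contains few control characters.
# The name and the default threshold are illustrative assumptions.
def probably_text?(bytes, max_control_ratio: 0.05)
  return true if bytes.empty?

  # Work on a binary-encoded copy so the caller's string is left untouched.
  data = bytes.dup.force_encoding(Encoding::BINARY)

  # NUL bytes almost never appear in ASCII or UTF-8 text.
  return false if data.include?("\x00".b)

  # Arbitrary binary data rarely forms valid UTF-8 sequences.
  return false unless data.dup.force_encoding(Encoding::UTF_8).valid_encoding?

  # Count control bytes other than tab, line feed and carriage return.
  control = data.each_byte.count do |b|
    (b < 0x20 && ![0x09, 0x0A, 0x0D].include?(b)) || b == 0x7F
  end

  control.to_f / data.bytesize <= max_control_ratio
end
```

For a part saved to disk (the filename part.bin is hypothetical), you could call probably_text?(File.binread("part.bin")). For very large parts, checking only the first few kilobytes is usually sufficient.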