Tags: php, http, mime, rfc2231

How can I encode a filename in PHP according to RFC 2231?


How can I encode a filename value using the scheme defined in "MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations" (RFC 2231)?


Solution

  • I think this should do it:

    function rfc2231_encode($name, $value, $charset='', $lang='', $ll=78) {
        // Bytes that may not appear literally in a parameter name or an extended value.
        $special = '/[\x00-\x20*\'%()<>@,;:\\\\"\/[\]?=\x7F-\xFF]/';
        if (strlen($name) === 0 || preg_match($special, $name)) {
            // invalid parameter name
            return false;
        }
        // Charset names such as UTF-8 or ISO-8859-1 contain digits, so allow them.
        if (strlen($charset) !== 0 && !preg_match('/^[A-Za-z0-9][A-Za-z0-9._:+-]*$/', $charset)) {
            // invalid charset
            return false;
        }
        if (strlen($lang) !== 0 && !preg_match('/^[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*$/', $lang)) {
            // invalid language
            return false;
        }
        // Extended value: charset'language'percent-encoded-text
        $value = "$charset'$lang'".preg_replace_callback($special, function($match) { return rawurlencode($match[0]); }, $value);
        $nlen = strlen($name);
        $vlen = strlen($value);
        if ($nlen + $vlen > $ll-3) {
            // Too long for one line: split into numbered continuation sections.
            $sections = array();
            for ($i=0, $section=0; $i<$vlen; $i+=$j, $section++) {
                // Room left after " name*N*=" (the trailing ";" is not counted).
                $j = $ll - $nlen - strlen((string)$section) - 4;
                if ($j < 3) {
                    // line length too small for this parameter name
                    return false;
                }
                // Do not split a %XX triplet across two sections.
                if ($i + $j < $vlen) {
                    if ($value[$i+$j-1] === '%') { $j -= 1; }
                    elseif ($value[$i+$j-2] === '%') { $j -= 2; }
                }
                $sections[] = " $name*$section*=".substr($value, $i, $j);
            }
            return implode(";\r\n", $sections);
        } else {
            return " $name*=$value";
        }
    }
    

    Note that the function returns the parameter as one or more continuation lines, each starting with a space. Its output must therefore follow a header line that ends with ";" and a line break (CRLF), e.g.:

    "Content-Type: application/x-stuff;\r\n".rfc2231_encode('title', 'This is even more ***fun*** isn\'t it!', 'us-ascii', 'en', 48)
    

    The output is:

    Content-Type: application/x-stuff;
     title*0*=us-ascii'en'This%20is%20even%20more%20;
     title*1*=%2A%2A%2Afun%2A%2A%2A%20isn%27t%20it!
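
    To see how a receiver would read continuation sections like these, here is a minimal sketch of the reverse step. The function name rfc2231_decode_sections and its input shape (an array mapping section number to the raw section value) are assumptions made for illustration. It also assumes every section is percent-encoded, as the encoder above emits them, and it does no charset conversion.

    ```php
    <?php
    // Hypothetical helper: reassemble RFC 2231 continuation sections.
    // $sections maps the section number to the text after "name*N*=".
    function rfc2231_decode_sections(array $sections) {
        // Sections may arrive in any order; the number defines the sequence.
        ksort($sections, SORT_NUMERIC);
        $joined = implode('', $sections);
        // Only the first section carries the charset'language' prefix.
        // Literal quotes inside the value are encoded as %27, so a limit of 3 is safe.
        $parts = explode("'", $joined, 3);
        if (count($parts) !== 3) {
            return false; // prefix missing or malformed
        }
        list($charset, $lang, $encoded) = $parts;
        return array(
            'charset'  => $charset,
            'language' => $lang,
            'value'    => rawurldecode($encoded),
        );
    }

    $decoded = rfc2231_decode_sections(array(
        1 => "%2A%2A%2Afun%2A%2A%2A%20isn%27t%20it!",
        0 => "us-ascii'en'This%20is%20even%20more%20",
    ));
    echo $decoded['value'], "\n"; // This is even more ***fun*** isn't it!
    ```

    Joining the sections before decoding means a %XX triplet that was split across two sections would still decode correctly.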
    

    See also Test Cases for HTTP Content-Disposition header field and the Encodings defined in RFC 2047 and RFC 2231/5987.
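
    Since the question is about filenames specifically: in HTTP, RFC 5987 uses the same charset'language'value syntax but without line continuations. The following is a minimal sketch under that assumption, for a filename that is already UTF-8. The function name content_disposition_filename and the underscore-based ASCII fallback are illustrative choices, not part of the function above.

    ```php
    <?php
    // Hypothetical helper: build a Content-Disposition header with both a plain
    // ASCII fallback (filename=) and an extended parameter (filename*=).
    function content_disposition_filename($filename) {
        // Fallback for clients that ignore filename*: replace every byte
        // outside a conservative safe set with "_".
        $fallback = preg_replace('/[^A-Za-z0-9._-]/', '_', $filename);
        // rawurlencode() leaves only A-Z a-z 0-9 - _ . ~ unencoded,
        // all of which are allowed literally in an extended value.
        $extended = "UTF-8''" . rawurlencode($filename);
        return 'Content-Disposition: attachment; filename="' . $fallback
            . '"; filename*=' . $extended;
    }

    echo content_disposition_filename("na\xC3\xAFve file.txt"), "\n";
    // Content-Disposition: attachment; filename="na__ve_file.txt"; filename*=UTF-8''na%C3%AFve%20file.txt
    ```

    Sending both parameters lets clients that understand filename* use the full name, while older ones fall back to the ASCII version.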