
What does mbstring.strict_detection do?


The mbstring PHP module has a strict_detection setting, documented in the PHP manual. Unfortunately, the manual is completely useless; it only says that this option "enables the strict encoding detection".

I did a few tests and could not find how any of the mbstring functions are affected by this. mb_check_encoding() and mb_detect_encoding() give exactly the same result for both valid and invalid UTF-8 input.

(edit:) The mbstring.strict_detection option was added in PHP 5.1.2.


Solution

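  • The mbstring.strict_detection setting supplies the default value for the strict parameter of mb_detect_encoding() when that argument is omitted. Without strict detection, the encoding detection is faster but less accurate. For example, suppose you have a UTF-8 string that ends in a partial UTF-8 sequence, like this: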

    $s = "H\xC3\xA9ll\xC3";
    $encoding = mb_detect_encoding($s, mb_detect_order(), false);
    

    The result of the mb_detect_encoding call would still be "UTF-8" even though it's not valid UTF-8 (the last character is incomplete).

    But if you set the strict parameter to true...

    $s = "H\xC3\xA9ll\xC3";
    $encoding = mb_detect_encoding($s, mb_detect_order(), true);
    

    It would perform a more thorough check, and the result of that call would be FALSE, provided the string is not valid in any encoding in the detection order.
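
The strict case above can be reproduced with a small script. This is only a sketch: the candidate list `['ASCII', 'UTF-8']` is an assumption chosen so the result does not depend on the local mbstring.detect_order, and the variable names are made up for illustration. The non-strict call is left out because its result may differ between PHP versions.

```php
<?php
// Byte string from the answer: "Héll" followed by a truncated UTF-8 lead byte.
$s = "H\xC3\xA9ll\xC3";

// Assumed candidate list, passed explicitly instead of mb_detect_order().
$candidates = ['ASCII', 'UTF-8'];

// Strict detection: the string is invalid in every candidate, so this is false.
$strictResult = mb_detect_encoding($s, $candidates, true);
var_dump($strictResult);

// A well-formed UTF-8 string is still detected in strict mode.
$validResult = mb_detect_encoding("H\xC3\xA9llo", $candidates, true);
var_dump($validResult);

// mb_check_encoding() validates against one named encoding.
$isValidUtf8 = mb_check_encoding($s, 'UTF-8');
var_dump($isValidUtf8);

// Current value of the ini setting, which is used when the third
// argument of mb_detect_encoding() is omitted.
var_dump(ini_get('mbstring.strict_detection'));
```

Passing the third argument explicitly, as here, makes the behaviour independent of how mbstring.strict_detection is configured on a given server.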