Tags: php, special-characters, strcmp

Compare strings with middle dot not working in PHP


I get a string from the database, where it is stored with the utf8_unicode_ci collation. It might contain the middle dot character (⋅), and I have to find out whether it does using strcmp. If I show the string in the HTML directly, the character is displayed without a problem, but when I do the comparison, the result is not what I expect.

For example:

$string = "⋅⋅⋅ This string starts with middle dots";
$result = strcmp(substr($string , 0, 2), "⋅⋅");

The result is not 0, as I think it should be. The PHP file is saved with UTF-8 encoding. What am I missing here? This happens even if I take the string from a variable instead of the database.


Solution

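  • PHP's substr operates on bytes, not characters, so it does not treat a multibyte UTF-8 character as a single character.

    The dot you're using is actually 3 bytes in UTF-8: 0xE2 0x8B 0x85 (U+22C5, DOT OPERATOR). substr($string, 0, 2) therefore returns only the first two bytes of the first dot, which can never equal "⋅⋅".

    So either use mb_substr, which counts characters, or pass a byte length to substr: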

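    <?php
    
    $string = "⋅⋅⋅ This string starts with middle dots";
    // mb_substr counts characters, so a length of 2 means the first two dots.
    // The encoding is passed explicitly instead of relying on the default.
    $result = strcmp(mb_substr($string, 0, 2, 'UTF-8'), "⋅⋅");
    
    var_dump($result); // int(0)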
    

    Or, if the mbstring extension (the mb_* functions) is not available:

    <?php
    
    $string = "⋅⋅⋅ This string starts with middle dots";
    // substr counts bytes: 2 dots x 3 bytes each = 6 bytes.
    $result = strcmp(substr($string, 0, 6), "⋅⋅");
    
    var_dump($result); // int(0)
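
To check the byte count yourself and avoid hard-coding the 6, here is a minimal sketch rather than a definitive implementation. It assumes PHP 7 or later for the "\u{22C5}" escape, and the variable names ($dot, $prefix, $startsWith) are only illustrative. It compares just the first strlen($prefix) bytes with strncmp, so the length always matches the prefix:

```php
<?php

// "\u{22C5}" is the dot from the question, written as an escape so the
// example does not depend on how this file is saved.
$dot    = "\u{22C5}";
$string = str_repeat($dot, 3) . " This string starts with middle dots";
$prefix = str_repeat($dot, 2);

// strlen() counts bytes, and bin2hex() shows which bytes they are.
var_dump(bin2hex($dot));   // string(6) "e28b85"
var_dump(strlen($prefix)); // int(6)

// Compare only the first strlen($prefix) bytes of $string with $prefix.
$startsWith = strncmp($string, $prefix, strlen($prefix)) === 0;
var_dump($startsWith);     // bool(true)
```

On PHP 8.0 and later, str_starts_with($string, $prefix) performs the same byte-wise prefix check in a single call.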