Why does PHP substr() change ASCII carriage return byte?


I was going to use a long string to manipulate a large number of bit flags, keeping the resulting string in Redis. However, I stumbled upon what looks like a PHP bug (?). A byte containing the bits 00001101, read using substr(), returns an unexpected value:

$bin = 0b00001101;  // 13 - ASCII Carriage return
$c = substr($bin, 0, 1);    // read this character
printf("Expectation: 00001101, reality: %08b\n", $c); // 00000001


Is the assumption that substr() is binary-safe wrong? I also tried mb_substr() with the encoding set to 8bit, with exactly the same result.


Solution

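  • You're setting $bin to the integer 13, not to a one-byte string.

    Calling substr() on $bin casts the integer to the string "13".

    substr($bin, 0, 1) then reads the first character of that string, which is "1".

    Using printf() with %b, you're explicitly casting that string back to the integer 1. For %b, the argument is treated as an integer and presented as a binary number, which is why you see 00000001.

    So substr() is binary-safe; the byte you wanted was never in the string to begin with.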
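To make the chain of conversions visible, here is a small sketch that performs each cast explicitly. The variable names are just for illustration; the explicit (string) and (int) casts stand in for the implicit conversions that substr() and %b perform.

```php
<?php
$bin = 0b00001101;                     // the integer 13, not a byte string

$asString  = (string) $bin;            // "13": what substr() actually receives
$first     = substr($asString, 0, 1);  // "1": first character of "13"
$backToInt = (int) $first;             // 1: what %b works with
$formatted = sprintf('%08b', $first);  // binary representation of 1, zero-padded to 8 digits

var_dump($asString, $first, $backToInt, $formatted);
```

Nothing here touches a carriage return byte at any point, which is why the output has nothing to do with 00001101.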

    EDIT

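    This code should give the result that you're expecting. chr() turns the integer into a one-byte string, and ord() turns the byte read by substr() back into an integer for %b: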

    $bin = 0b00001101;  // 13 - ASCII Carriage return
    $c = substr(chr($bin), 0, 1);    // read this character
    printf("Expectation: 00001101, reality: %08b\n", ord($c)); // 00001101
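For the bit-flag use case from the question, the same chr()/ord() round trip can be wrapped in a pair of helpers. This is only a sketch under some assumptions: setFlag() and getFlag() are hypothetical names, flags are packed 8 per byte, and bit 0 is taken to be the least significant bit of each byte. Adjust the bit order if whatever else reads the string expects a different layout.

```php
<?php
// Hypothetical helpers (not from the original post): flags are stored
// in a binary string, 8 flags per byte, least significant bit first.

function setFlag(string $flags, int $index, bool $on): string
{
    $byteIndex = intdiv($index, 8);
    $mask = 1 << ($index % 8);

    // Grow the string with zero bytes if the flag lies past its current end.
    if (strlen($flags) <= $byteIndex) {
        $flags = str_pad($flags, $byteIndex + 1, "\0");
    }

    // ord() gives the byte as an integer; chr() writes it back as one byte.
    $byte = ord($flags[$byteIndex]);
    $byte = $on ? ($byte | $mask) : ($byte & ~$mask);
    $flags[$byteIndex] = chr($byte);

    return $flags;
}

function getFlag(string $flags, int $index): bool
{
    $byteIndex = intdiv($index, 8);

    // Flags beyond the end of the string are treated as unset.
    if ($byteIndex >= strlen($flags)) {
        return false;
    }

    return (ord($flags[$byteIndex]) & (1 << ($index % 8))) !== 0;
}

// Setting flags 0, 2 and 3 produces the byte 00001101 (13, carriage return).
$flags = "";
$flags = setFlag($flags, 0, true);
$flags = setFlag($flags, 2, true);
$flags = setFlag($flags, 3, true);

printf("%08b\n", ord(substr($flags, 0, 1))); // 00001101
```

The resulting $flags value is an ordinary PHP string whose bytes are the packed flags, so substr() on it returns the raw bytes unchanged.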