php text text-processing carriage-return file-read

Is PHP removing CR when reading a text file?

A certain txt file contains only CRLF line breaks. This has been confirmed by opening the file in Notepad++ with "Show All Characters" enabled.

When reading the file with PHP, using file_get_contents() or fopen(), the CR characters seem to be filtered out:

<?php
    ...
    $fh = fopen($path, 'r');
    $string = ''; // initialize the buffer before appending to it

    while (!feof($fh)) {
        $string .= fread($fh, 1024);
    }
    fclose($fh);

    preg_match_all('/\r/', $string, $matches);
    var_dump($matches);

    // 0 matches: array(1) { [0]=> array(0) { } }

    $string2 = file_get_contents($path);
    preg_match_all('/\r/', $string2, $matches2);
    var_dump($matches2);

    // 0 matches: array(1) { [0]=> array(0) { } }
?>

I am confused, because the documentation for each of these functions says nothing about this. Maybe there are other methods to open files exactly as they are stored.

I need confirmation on whether these functions filter out or "normalize" the CR characters. If so, what else might these functions be "normalizing"? Is there a way to avoid that behaviour?

To be more explicit, I need these CR characters and every other byte to remain intact when I load the file into my variable.

Thank you

Solution

Yes, fopen() can do that depending on the mode parameter you pass, and the documentation describes it: http://php.net/manual/en/function.fopen.php

Windows offers a text-mode translation flag ('t') which will transparently translate \n to \r\n when working with the file. In contrast, you can also use 'b' to force binary mode, which will not translate your data. To use these flags, specify either 'b' or 't' as the last character of the mode parameter.

In other words, you can avoid this "translation" by adding the 'b' flag to the mode parameter. For example:

fopen($path, 'rb'); // Read in binary mode
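
To check whether anything is being translated on your system, a minimal sketch along these lines may help. It is not taken from the manual: the temporary file, its sample contents, and the variable names are made up for the demonstration. It writes a small CRLF file, reads it back in binary mode, and then counts the CR bytes and compares the string length against the size on disk.

```php
<?php
// Hypothetical throwaway file with known CRLF content (assumption for this demo).
$path = tempnam(sys_get_temp_dir(), 'crlf');
file_put_contents($path, "first\r\nsecond\r\nthird\r\n");

// Read the file back in binary mode so no line-ending translation is applied.
$fh = fopen($path, 'rb');
$data = '';
while (!feof($fh)) {
    $data .= fread($fh, 1024);
}
fclose($fh);

// Count the CR bytes directly; no regex is needed for a single character.
$crCount = substr_count($data, "\r");

// If nothing was translated, the string is exactly as long as the file on disk.
$sameLength = strlen($data) === filesize($path);

// A hex dump of the first bytes makes 0d 0a (CR LF) visible.
$hexHead = bin2hex(substr($data, 0, 8));

var_dump($crCount, $sameLength, $hexHead);

unlink($path); // clean up the demo file
```

Running the same checks against your real file (replace the temp file with your own $path) should show whether the CR bytes are really absent from what PHP reads, or whether they are lost at some later step.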