PHP performance file_get_contents() vs readfile() and cat


I am doing some benchmarking with PHP file reading functions just for my overall knowledge. So I tested three different ways to read the whole content of a file that I thought would be very fast.

  • file_get_contents(), well known for its very high performance
  • readfile(), known to be a very good alternative to file_get_contents() when it comes to outputting the data directly to stdout
  • exec('cat filename'), a very handy and fast UNIX command

Here is my benchmarking code. Note that I enabled PHP output buffering (ob_start()) for readfile() to avoid writing directly to stdout, which would completely skew the results.

<?php
/* Using a PNG file (bla.png) as the test file */

/* file_get_contents() benchmark */
$start = microtime(true);
$foo = file_get_contents("bla.png");
$end = microtime(true) - $start;
echo "file_get_contents() time: " . $end . "s\n";

/* readfile() benchmark */
ob_start();
$start = microtime(true);
readfile('bla.png');
$end = microtime(true) - $start;
ob_end_clean();
echo "readfile() time: " . $end . "s\n";

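/* exec('cat') benchmark */
/* Note: exec() returns only the last line of the command's output, */
/* so $bar does not hold the whole file here. */
$start = microtime(true);
$bar = exec('cat bla.png');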
$end = microtime(true) - $start;
echo "exec('cat filename') time: " . $end . "s\n";
?>

I have run this code several times to confirm the results, and every time I got the same order. Here is one example run:

$ php test.php
file_get_contents() time: 0.0006861686706543s
readfile() time: 0.00085091590881348s
exec('cat filename') time: 0.0048539638519287s

As you can see, file_get_contents() comes first, then readfile(), and finally cat.

As for cat, even though it is a UNIX command (so fast and everything :)), I understand that calling a separate binary may explain the relatively high result. What I have difficulty understanding is why file_get_contents() is faster than readfile(). In the run above, readfile() is about 1.2 times slower, after all.

Both functions are built in and therefore pretty well optimized. Since I enabled output buffering, readfile() is not "trying" to output the data to stdout; just like file_get_contents(), it puts the data in RAM.
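To make the buffering point concrete, here is a minimal sketch of what I mean. The temporary file is a hypothetical stand-in for bla.png: with ob_start() active, readfile() writes into PHP's output buffer rather than stdout, and the data can be read back with ob_get_clean().

```php
<?php
// Hypothetical stand-in for bla.png: a 1 KiB temporary file.
$path = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($path, str_repeat('x', 1024));

ob_start();                  // start capturing output in memory
$bytes = readfile($path);    // writes the file to the buffer, returns the byte count
$captured = ob_get_clean();  // fetch the buffered data and stop buffering

// Both checks compare the buffered data with what was written to disk.
var_dump($bytes === strlen($captured));
var_dump($captured === file_get_contents($path));

unlink($path);               // remove the temporary file
```

So in both cases the file content ends up in memory; the difference is which layer it travels through to get there.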

I am looking for a technical, low-level explanation here to understand the pros and cons of file_get_contents() and readfile(), besides the fact that one is designed to write directly to stdout whereas the other allocates memory for the data and returns it.

Thanks in advance.


Solution

  • file_get_contents only loads the data from the file into memory, while readfile also writes it to the output (here, into the output buffer) and exec('cat') has to start a separate process and read its output back, so they simply perform more operations.

    If you want to compare file_get_contents to the others, add echo before it.

    Also, you are not freeing the memory allocated for $foo. There is a chance that if you move the file_get_contents test to the end, you will get a different result.

    Additionally, you are using output buffering for only one of the tests, which also causes some difference. Try wrapping the other functions in output buffering as well to remove that difference.

    When comparing different functions, the rest of the code should be the same, otherwise you are open to all kinds of influences.
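As a rough illustration of the points above, here is a minimal sketch of a more even comparison, assuming PHP 7+ and a UNIX system with cat available. The bench() helper, the temporary test file, and the run count are hypothetical choices, not part of the original code. passthru() is swapped in for exec() because exec() returns only the last line of output, whereas passthru() sends the whole output through PHP's output layer like the other two tests.

```php
<?php
// Hypothetical helper: run $fn $runs times, each time inside the same
// output-buffering setup, and return the average seconds per run.
function bench(callable $fn, int $runs = 50): float
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        ob_start();                          // identical buffering for every test
        $start = microtime(true);
        $fn();
        $total += microtime(true) - $start;
        ob_end_clean();                      // discard the captured output
    }
    return $total / $runs;
}

// Stand-in test file (1 MiB of random bytes); use the real file instead.
$file = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($file, random_bytes(1 << 20));

// Every test both reads the file and emits it, so the work is comparable.
$tests = [
    'echo file_get_contents()' => function () use ($file) {
        echo file_get_contents($file);
    },
    'readfile()' => function () use ($file) {
        readfile($file);
    },
    "passthru('cat')" => function () use ($file) {
        passthru('cat ' . escapeshellarg($file));
    },
];

$results = [];
foreach ($tests as $name => $fn) {
    $results[$name] = bench($fn);
}
unlink($file);

foreach ($results as $name => $seconds) {
    printf("%-26s %.6fs\n", $name, $seconds);
}
```

Averaging over many runs also reduces the effect of test order and file-system caching, though it does not remove it; reordering the entries in $tests is a quick way to check whether order still matters.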