This is a general question of sorts, but to explain it I will use a specific example.
I have a function that loads a document. If that document does not exist it will create it, if it does exist it will convert it to a JSON array. I always want this function to return an array of some sort, whether or not there is an issue with json_decode()
or if the file does not exist. Currently I am doing it like so...
function load($file) {
if( ! file_exists($file)) {
$handle = fopen($file, 'w');
fclose($handle);
}
$raw = file_get_contents($file);
$contents = json_decode($raw, TRUE);
return( ! $contents ? array() : $contents);
//cant use ternary shorthand "?:" in PHP 5.2, otherwise this would be shorter
}
Now, there is nothing wrong with the above code (at least I don't think there is and it works fine). However I'm always looking for ways to improve my code and condense it while keeping it perfectly legible. And that return statement has always bothered me because of how inefficient it seems. So today I got to thinking and something occurred to me. I remember seeing mysql tutorials that do something to the effect of connect() or die();
so I thought, why not json_decode() or array();
? Would this even work? So I rewrote my function to find out...
function load($file) {
if( ! file_exists($file)) {
$handle = fopen($file, 'w');
fclose($handle);
}
$raw = file_get_contents($file);
return json_decode($raw, TRUE) or array();
}
It seems to, and it even reads pleasantly enough. So on to my next bout of questions. Is this good practice? I understand it, but would anyone else? Does it really work or is this some bug with a happy ending? I got to looking around and found out that what I'm asking about is called short-circuit evaluation and not a bug. That was good to know. I used that new term to refine my search and came up with some more material.
There wasn't much and most everything I found that talked about using short-circuiting in the way I'm inquiring about always referred to MySQL connections. Now, I know most people are against using the or die()
terminology, but only because it is an inelegant way to deal with errors. This isn't a problem for the method I'm asking about because I'm not seeking to use or die()
. Is there any other reason not to use this? Wikipedia seems to think so, but only in reference to C. I know PHP is written in C, so that is definitely pertinent information. But has this issue been wheedled out in the PHP compilation? If not, is it as bad as Wikipedia makes it out to be?
Here's the snippet from Wikipedia.
Wikipedia - "Short-circuiting can lead to errors in branch prediction on modern processors, and dramatically reduce performance (a notable example is highly optimized ray with axis aligned box intersection code in ray tracing)[clarification needed]. Some compilers can detect such cases and emit faster code, but it is not always possible due to possible violations of the C standard. Highly optimized code should use other ways for doing this (like manual usage of assembly code)"
What do you all think?
EDIT
I've polled another forum and gotten some good results there. General consensus appears to be that this form of variable assignment, while valid, is not preferred, and may even be considered bad form in the real world. I'll continue to keep an ear to the ground and will update this if anything new comes around. Thank you Corbin and Matt for your input, especially Corbin for clearing up a few things. Here's a link to the forum post should you be interested.
There's a few different questions you ask, so I'll try to address them all.
Missed branch predictions: Unless you're coding in C or assembly, don't worry about this. In PHP, you're so far from the hardware that thinking about branch predictions isn't going to help you. Either way, this would be a very-micro optimization, especially in a function that does extensive string parsing to begin with.
Is there any other reason not to use this? Wikipedia seems to think so, but only in reference to C. I know PHP is written in C, so that is definitely pertinent information.
PHP likely parses it to a different execution structure. Unless you're planning on running this function millions of times, or you know it's a bottleneck, I wouldn't worry about it. In 2012, I find it very unlikely that using an or
to short circuit would cause even a billionth of a second difference.
As for the formatting, I find $a or $b
rather ugly. My mind doesn't comprehend the short circuiting the same it sees it in an if clause.
if (a() || b())
Is perfectly clear to my mind that b() will execute only if a() does not evaluate to true.
However:
return a() or b();
Doesn't have the same clarity to me.
That's obviously just an opinion, but I'll offer two alternatives as to how I might write it (which are, in my opinion, a very tiny bit clearer):
function load($file) {
if (!file_exists($file)) {
touch($file);
return array();
}
$raw = file_get_contents($file);
$contents = json_decode($raw, true);
if (is_array($contents)) {
return $contents;
} else {
return array();
}
}
If you don't care if the file actually gets created, you could take it a step farther:
function load($file) {
$raw = file_get_contents($file);
if ($raw !== false) {
$contents = json_decode($raw, true);
if ($contents !== null) {
return $contents;
}
}
return array();
}
I guess really these code snippets come down to personal preference. The second snippet is likely the one I would go with. The critical paths could be a bit clearer in it, but I feel like it maintains brevity without sacrificing comprehensibility.
Edit: If you're a 1-return-per-function type person, the following might be a bit more preferable:
function load($file) {
$contents = array();
$raw = file_get_contents($file);
if ($raw !== false) {
$contents = json_decode($raw, true);
if ($contents === null) {
$contents = array();
}
}
return $contents;
}