Search code examples
c#javac++design-by-contract

How much null checking is enough?


What are some guidelines for when it is not necessary to check for a null?

A lot of the inherited code I've been working on as of late has null-checks ad nauseam. Null checks on trivial functions, null checks on API calls that state non-null returns, etc. In some cases, the null-checks are reasonable, but in many places a null is not a reasonable expectation.

I've heard a number of arguments ranging from "You can't trust other code" to "ALWAYS program defensively" to "Until the language guarantees me a non-null value, I'm always gonna check." I certainly agree with many of those principles up to a point, but I've found excessive null-checking causes other problems that usually violate those tenets. Is the tenacious null checking really worth it?

Frequently, I've observed codes with excess null checking to actually be of poorer quality, not of higher quality. Much of the code seems to be so focused on null-checks that the developer has lost sight of other important qualities, such as readability, correctness, or exception handling. In particular, I see a lot of code ignore the std::bad_alloc exception, but do a null-check on a new.

In C++, I understand this to some extent due to the unpredictable behavior of dereferencing a null pointer; null dereference is handled more gracefully in Java, C#, Python, etc. Have I just seen poor-examples of vigilant null-checking or is there really something to this?

This question is intended to be language agnostic, though I am mainly interested in C++, Java, and C#.


Some examples of null-checking that I've seen that seem to be excessive include the following:


This example seems to be accounting for non-standard compilers as C++ spec says a failed new throws an exception. Unless you are explicitly supporting non-compliant compilers, does this make sense? Does this make any sense in a managed language like Java or C# (or even C++/CLR)?

try {
   MyObject* obj = new MyObject(); 
   if(obj!=NULL) {
      //do something
   } else {
      //??? most code I see has log-it and move on
      //or it repeats what's in the exception handler
   }
} catch(std::bad_alloc) {
   //Do something? normally--this code is wrong as it allocates
   //more memory and will likely fail, such as writing to a log file.
}

Another example is when working on internal code. Particularly, if it's a small team who can define their own development practices, this seems unnecessary. On some projects or legacy code, trusting documentation may not be reasonable... but for new code that you or your team controls, is this really necessary?

If a method, which you can see and can update (or can yell at the developer who is responsible) has a contract, is it still necessary to check for nulls?

//X is non-negative.
//Returns an object or throws exception.
MyObject* create(int x) {
   if(x<0) throw;
   return new MyObject();
}

try {
   MyObject* x = create(unknownVar);
   if(x!=null) {
      //is this null check really necessary?
   }
} catch {
   //do something
}

When developing a private or otherwise internal function, is it really necessary to explicitly handle a null when the contract calls for non-null values only? Why would a null-check be preferable to an assert?

(obviously, on your public API, null-checks are vital as it's considered impolite to yell at your users for incorrectly using the API)

//Internal use only--non-public, not part of public API
//input must be non-null.
//returns non-negative value, or -1 if failed
int ParseType(String input) {
   if(input==null) return -1;
   //do something magic
   return value;
}

Compared to:

//Internal use only--non-public, not part of public API
//input must be non-null.
//returns non-negative value
int ParseType(String input) {
   assert(input!=null : "Input must be non-null.");
   //do something magic
   return value;
}

Solution

  • First note that this a special case of contract-checking: you're writing code that does nothing other than validate at runtime that a documented contract is met. Failure means that some code somewhere is faulty.

    I'm always slightly dubious about implementing special cases of a more generally useful concept. Contract checking is useful because it catches programming errors the first time they cross an API boundary. What's so special about nulls that means they're the only part of the contract you care to check? Still,

    On the subject of input validation:

    null is special in Java: a lot of Java APIs are written such that null is the only invalid value that it's even possible to pass into a given method call. In such cases a null check "fully validates" the input, so the full argument in favour of contract checking applies.

    In C++, on the other hand, NULL is only one of nearly 2^32 (2^64 on newer architectures) invalid values that a pointer parameter could take, since almost all addresses are not of objects of the correct type. You can't "fully validate" your input unless you have a list somewhere of all objects of that type.

    The question then becomes, is NULL a sufficiently common invalid input to get special treatment that (foo *)(-1) doesn't get?

    Unlike Java, fields don't get auto-initialized to NULL, so a garbage uninitialized value is just as plausible as NULL. But sometimes C++ objects have pointer members which are explicitly NULL-inited, meaning "I don't have one yet". If your caller does this, then there is a significant class of programming errors which can be diagnosed by a NULL check. An exception may be easier for them to debug than a page fault in a library they don't have the source for. So if you don't mind the code bloat, it might be helpful. But it's your caller you should be thinking of, not yourself - this isn't defensive coding, because it only 'defends' against NULL, not against (foo *)(-1).

    If NULL isn't a valid input, you could consider taking the parameter by reference rather than pointer, but a lot of coding styles disapprove of non-const reference parameters. And if the caller passes you *fooptr, where fooptr is NULL, then it has done nobody any good anyway. What you're trying to do is squeeze a bit more documentation into the function signature, in the hope that your caller is more likely to think "hmm, might fooptr be null here?" when they have to explicitly dereference it, than if they just pass it to you as a pointer. It only goes so far, but as far as it goes it might help.

    I don't know C#, but I understand that it's like Java in that references are guaranteed to have valid values (in safe code, at least), but unlike Java in that not all types have a NULL value. So I'd guess that null checks there are rarely worth it: if you're in safe code then don't use a nullable type unless null is a valid input, and if you're in unsafe code then the same reasoning applies as in C++.

    On the subject of output validation:

    A similar issue arises: in Java you can "fully validate" the output by knowing its type, and that the value isn't null. In C++, you can't "fully validate" the output with a NULL check - for all you know the function returned a pointer to an object on its own stack which has just been unwound. But if NULL is a common invalid return due to the constructs typically used by the author of the callee code, then checking it will help.

    In all cases:

    Use assertions rather than "real code" to check contracts where possible - once your app is working, you probably don't want the code bloat of every callee checking all its inputs, and every caller checking its return values.

    In the case of writing code which is portable to non-standard C++ implementations, then instead of the code in the question which checks for null and also catches the exception, I'd probably have a function like this:

    template<typename T>
    static inline void nullcheck(T *ptr) { 
        #if PLATFORM_TRAITS_NEW_RETURNS_NULL
            if (ptr == NULL) throw std::bad_alloc();
        #endif
    }
    

    Then as one of the list of things you do when porting to a new system, you define PLATFORM_TRAITS_NEW_RETURNS_NULL (and maybe some other PLATFORM_TRAITS) correctly. Obviously you can write a header which does this for all the compilers you know about. If someone takes your code and compiles it on a non-standard C++ implementation that you know nothing about, they're fundamentally on their own for bigger reasons than this, so they'll have to do it themselves.