I've read that the best way to optimise the rendering order of non-transparent objects in OpenGL 2 (especially ES) is to prioritise avoiding state changes (binding different buffers, shader programs, etc.) over depth sorting.
If you call glBindBuffer with a buffer that's already bound, or glUseProgram with a shader program that's already the current program, and so on, will that still cause an inefficient pipeline flush, or are the libraries clever enough to recognise these calls as no-ops? It would make my code simpler if I could just bind everything at the moment it's needed, without having to keep track of what's already bound and check against it.
Maybe. This really can't be answered in general. It's completely implementation dependent.
Whether drivers should check for redundant state changes is a somewhat philosophical discussion, and you won't find consensus on it. Therefore, you should expect different vendors to handle it differently, and I wouldn't even necessarily assume that it's handled consistently for all the state in the same driver.
If you're targeting specific platforms, you should measure it; fortunately, this is fairly easy to benchmark. If you want to cover a wide range of platforms and vendors, I would minimize redundant state changes, at least if you have the option to do that relatively cheaply. If the tracking itself adds a lot of overhead, you may do more harm than good.
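As a rough illustration of what "relatively cheaply" could look like on the application side, here is a minimal C++17 sketch of a cache for a single binding point. The class name CachedBinding and its methods are hypothetical and not part of OpenGL; the real GL call is passed in as a callback, so the sketch makes no assumptions about your loader or context setup.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical helper (not an OpenGL API): remembers the last object name
// passed to one binding call and skips the call when it has not changed.
class CachedBinding {
public:
    using BindFn = std::function<void(std::uint32_t)>;

    // `bind` is whatever actually changes the GL state, e.g. a lambda
    // that calls glUseProgram(id).
    explicit CachedBinding(BindFn bind) : bind_(std::move(bind)) {}

    // Issues the underlying call only if `id` differs from the cached
    // value. Returns true if the call was actually made.
    bool bind(std::uint32_t id) {
        if (current_ && *current_ == id) {
            return false;  // redundant: filtered out on the application side
        }
        bind_(id);
        current_ = id;
        return true;
    }

    // Forget the cached value. Needed whenever the binding may have changed
    // behind the cache's back (third-party code, a deleted object, a lost
    // context).
    void invalidate() { current_.reset(); }

private:
    BindFn bind_;
    std::optional<std::uint32_t> current_;  // empty means "unknown"
};
```

You would create one instance per piece of state, for example CachedBinding program([](std::uint32_t id) { glUseProgram(id); }); and a separate one for each glBindBuffer target, then call program.bind(id) wherever you would otherwise call glUseProgram directly. The cost is one comparison per call, but the cache is only correct if every change to that binding goes through it, so call invalidate() after anything that can alter the binding indirectly, such as deleting the currently bound object.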
The main reason for the mixed opinions is that checking for redundant state changes is not entirely free. If the driver does it, the overhead applies to everybody, so well-written applications (which make no unnecessary state changes) pay a price for an optimization that only benefits poorly written ones. You could argue that this is unfair.
In reality, these checks are often done, particularly if the state change itself is fairly expensive; it's not worth adding a check if the state change is very cheap. The checks are often driven by performance optimizations for important app and game benchmarks. Unfortunately, many apps and games use OpenGL very inefficiently, and the driver has to produce the best possible result for those benchmarks. Filtering out redundant state changes is a common optimization in these cases.