Disadvantages of top-down node splitting on insertion into B+ tree

For a B+ tree insertion why would you traverse down the tree then back upwards splitting the parents?

Wikipedia suggests this method of insertion:

Perform a search to determine what bucket the new record should go into.

If the bucket is not full (at most b - 1 entries after the insertion), add the record.

Otherwise, split the bucket.

Allocate new leaf and move half the bucket's elements to the new bucket.

Insert the new leaf's smallest key and address into the parent.

If the parent is full, split it too.

Add the middle key to the parent node.

Repeat until a parent is found that need not split.

If the root splits, create a new root which has one key and two pointers.

Why would you traverse down then tree and then go back up performing the splits? Why not split the nodes as you encounter them on the way down?

To me, the proposed method performs twice the work and requires more bookkeeping as well.

Can anyone explain why this is the preferred method for insertion as opposed to splitting on the way down and what the disadvantages are for inserting during the traversal?

Solution

You have to backtrack up the tree because you don't actually know whether a split is required at the lowest level until you get there.

It's all there in the phrase "If the bucket is not full, ...".

You should also be aware that it's nowhere near twice the work. Since you're remembering all sorts of stuff on the way down (node pointers, indexes within the node, and so on), there's not as much calculation or searching on the way back up.