c++performance algorithm data-structures tree-structure

Duplicate / multiplicate tree efficiently

I need (for g++) a (computation time) optimized algorithm tree structure to duplicate/multiplicate a tree.

My tree will be a k-ary tree, but not necessarily filled. The main operation is to multiplicate (up to k times) the existing tree and add the trees as subtrees to a new node. Then the leaf node level will be erased to hold the fixed-level rule.

Does anybody know of a data structure offering this?

An example for the multiplication: Suppose we have a binary tree

      A
      |
     / \
    /   \
   B     C
   |    / \
   |   /   \
   D   E    F

and we want to add a new node / multiply like

    R
   / \
  /   \
 ..   ..

So the result will look like

            R
           / \
          /   \
         /     \
        /       \
       /         \
      A           A
      |           |            
     / \         / \        
    /   \       /   \        
   B     C      B     C        
   |    / \     |    / \        
   |   /   \    |   /   \    
   D   E    F   D   E    F

I tried to organize this on a std::vector in a heap-like structure, but multiplying the tree is still kind of slow, because I have to copy each tree level by itself rather than just copying the whole tree at once.

Solution

When you add R, it is trivial to give it 2 pointers to A, rather than copying the entire subtree starting at A.

   R
  / \
  | |
  \ /
   A
   |           
  / \
 /   \
B     C
|    / \
|   /   \
D   E    F

This is both very fast and very easy to code.

Now, the hitch in this comes in if you later want to update one side of the tree, but not the other. For example, perhaps you want to change the "right" F to a G. At that point you can use a copy-on-write strategy on only certain of the nodes, in this case leading to

      R
     / \
    /   \
   A    A <-- copied, left side points to B
   |   / \        
  / \  *  \  
 /   \     \
B     C     C <-- copied, left side points to E
|    / \   / \
|   /   \  *  \
D   E    F     G

Basically, you only need to copy the path from the point of the change (F/G) up to either the root (easiest to implement) or up to the highest node that is shared (A in this example).