I implemented in Red-Black trees in Python according to the pseudo code in Cormen's Introduction to Algorithms
.
I wanted to see in my own eyes that my insert
is really O(logn)
so I plotted the time it takes to insert n=1, 10, 20, ..., 5000
nodes into the tree.
This is the result:
the x
-axis is n
and the y
-axis is the time it took in milliseconds.
To me the graph looks more linear than logarithmic. What can explain that?
Ok, so the graph displays a measurement of the cost of inserting n
elements into your tree, where the x axis is how many elements we've inserted, and the y axis is the total time.
Let's call the function that totals the time it takes to insert n elements into the tree f(n)
.
Then we can get a rough idea of what f
might look like:
f(1) < k*log(1) for some constant k.
f(2) < k*log(1) + k*log(2) for some constant k
...
f(n) < k * [log(1) + log(2) + ... + log(n)] for some constant k.
Due to how logs work, we can collapse log(1) + ... + log(n)
:
f(n) < k * [log(1*2*3*...*n)] for some constant k
f(n) < k * log(n!) for some constant k
We can take a look at Wikipedia to see a graph of what log(n!)
looks like. Take a look at the graph in the article. Should look pretty familiar to you. :)
That is, I think you've done this by accident:
for n in (5000, 50000, 500000):
startTime = ...
## .. make a fresh tree
## insert n elements into the tree
stopTime = ...
## record the tuple (n, stopTime - startTime) for plotting
and plotted total time to construct the tree of size n, rather than the individual cost of inserting one element into a tree of size n:
for n in range(50000):
startTime = ...
## insert an element into the tree
stopTime = ...
## record the tuple (n, stopTime - startTime) for plotting
Chris Taylor notes in the comments that if you plot f(n)/n
, you'll see a log graph. That's because a fairly tight approximation to log(n!)
is n*log(n)
(see the Wikipedia page). So we can go back to our bound:
f(n) < k * log(n!) for some constant k
and get:
f(n) < k * n * log(n) for some constant k
And now it's should be easier to see that if you divide f(n)
by n
, your graph will be bounded above by the shape of a logarithm.