Tags: python, python-3.x, performance, time-complexity, space-complexity

Time and Space Analysis in Python


Can someone provide an example of O(log(n)) and O(nlog(n)) problems for both time and space?

I am quite new to this type of analysis and cannot see past polynomial time/space.

What I don't get is how you can have O(1) < O(log(n)) < O(n). Is O(log(n)) something like "semi-constant"?

Additionally, I would appreciate any great examples which cover these cases (both time and space):

[Image: Big O notation chart]

I find space analysis a bit more ambiguous, so it would be nice to see it compared side by side with the corresponding time analysis cases in the same place - something I couldn't find reliably online.

Can you provide examples for each case in both space and time analysis?


Solution

  • Before examples, a little clarification on big O notation

    Perhaps I'm mistaken, but seeing

    What I don't get is how you can have O(1) < O(log(n)) < O(n). Is O(log(n)) something like "semi-constant"?

    makes me think that you have been introduced to the idea of big-O notation as the number of operations to be carried out (or the number of bytes to be stored, etc.), e.g. if you have a loop for(int i=0;i<n;++i) then there are n operations, so the time complexity is O(n). While this is a nice first intuition, I think that it can be misleading, because big-O notation defines an asymptotic upper bound.

    Let's say that you have chosen an algorithm to sort an array of numbers. Let x denote the number of elements in that array, and f(x) the time complexity of that algorithm. Assume now that we say that the algorithm is O(g(x)). What this means is that as x grows, we will eventually reach a threshold x_t such that if x_i > x_t, then abs(f(x_i)) will always be less than or equal to alpha*g(x_i), where alpha is a positive real number. For example, f(x) = 3x + 5 is O(x): choosing alpha = 4, we have 3x + 5 <= 4x for every x above the threshold x_t = 5.

    As a result, a function that is O(1) doesn't necessarily take the same constant time on every input; rather, you can be sure that no matter how much data it is given, the time it takes to complete its task will stay below some constant amount of time, e.g. 5 seconds. Similarly, O(log(n)) doesn't involve any notion of a "semi-constant". It just means that 1) the time the algorithm takes depends on the size of the dataset you feed it, and 2) if the dataset is large enough (i.e. n is sufficiently large), then the time it takes to complete will always be less than or equal to a constant times log(n).

    Some examples regarding time complexity

    • O(1): Accessing an element from an array.
    • O(log(n)): binary search in a sorted array. Say you have an array of n elements sorted in increasing order and you want to find the index where the value is equal to x. You start at the middle of the array; if the value v you read there is greater than x, you repeat the same process on the left half, and if it's smaller you look at the right half. You continue this process until the value you're looking for is found. As you can see, if you're lucky you can find the value in the middle of the array on the first try, and in the worst case you find it after about log(n) steps, since each step halves the search range. So there is no semi-constancy; big-O notation gives you a bound on the worst case (see the sketch after this list).
    • O(nlog(n)): sorting an array using heap sort. The full algorithm is a bit too long to explain here, but a sketch follows after this list as well.
    • O(n^2): computing the sum of all pixels in an n*n square gray-scale image (which you can consider as a 2d matrix of numbers).
    • O(n^3): naively multiplying two matrices of size n*n.
    • O(n^{2+epsilon}): multiplying matrices in smarter ways (see Wikipedia).
    • O(n!): generating all permutations of a list of n elements, e.g. naively solving the traveling salesman problem by trying every possible ordering of the cities.
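
    As an illustration of the O(log(n)) entry above, here is a minimal binary search sketch (my own illustrative code, not part of the original question):

      def binary_search(arr, x):
          """Return the index of x in the sorted list arr, or -1 if absent."""
          lo, hi = 0, len(arr) - 1
          while lo <= hi:
              mid = (lo + hi) // 2       # middle of the current search range
              if arr[mid] == x:
                  return mid
              elif arr[mid] > x:
                  hi = mid - 1           # x can only be in the left half
              else:
                  lo = mid + 1           # x can only be in the right half
          return -1

      Each iteration halves the search range, so at most about log2(n) iterations run, which is where the O(log(n)) bound comes from.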
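
    For the O(nlog(n)) heap sort entry, here is a hedged in-place sketch (again my own code); it also previews the O(1) space claim made in the next section, since it only swaps elements inside the input list:

      def heapsort(arr):
          """Sort arr in place in O(n*log(n)) time using O(1) extra space."""
          n = len(arr)

          def sift_down(root, end):
              # Push arr[root] down until the max-heap property holds up to index end.
              while 2 * root + 1 <= end:
                  child = 2 * root + 1
                  if child + 1 <= end and arr[child] < arr[child + 1]:
                      child += 1         # pick the larger of the two children
                  if arr[root] < arr[child]:
                      arr[root], arr[child] = arr[child], arr[root]
                      root = child
                  else:
                      return

          # Build a max-heap in O(n).
          for i in range(n // 2 - 1, -1, -1):
              sift_down(i, n - 1)

          # Repeatedly move the current maximum to the end of the unsorted region.
          for end in range(n - 1, 0, -1):
              arr[0], arr[end] = arr[end], arr[0]
              sift_down(0, end - 1)

      Building the heap takes O(n), and each of the n extractions costs O(log(n)), giving O(nlog(n)) in total.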

    Some examples regarding space complexity

    • O(1): heap sort. One might think that since you need to remove elements from the root of the heap, you will need extra space. However, since a heap can just be implemented as an array, you can store the removed values at the end of said array instead of allocating new space (this is exactly what the heapsort sketch above does).
    • An interesting example would be, I think, to compare two solutions to a classical problem: assume you have an array X of integers and a target value T, and that you are given the guarantee that there exist two values x,y in X such that x+y==T. Your goal is to find those two values. One solution (known as two-pointers) would be to sort the array in place using heapsort (O(1) space) and then define two indexes i,j that respectively point to the start and end of the sorted array. Then, if X[i]+X[j]<T, we increment i, and if X[i]+X[j]>T, we decrement j. We stop when X[i]+X[j]==T. Since this requires no extra allocations, the solution has O(1) space complexity (a sketch is given after the code below). A second solution would be this:

      def find_pair(X, T):
          D = {}                     # maps each value seen so far to its index
          for i, x in enumerate(X):
              y = T - x
              if y in D:             # the needed complement was seen earlier
                  return X[D[y]], x
              D[x] = i               # insert after checking, so x can't pair with itself
      

      which has space complexity O(n) because of the dictionary.
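
      For comparison, here is a minimal sketch of the two-pointer solution described above (the function name is mine; it reuses the heapsort sketch from the time-complexity section, though any in-place O(1)-space sort would do):

      def find_pair_two_pointers(X, T):
          heapsort(X)                # in-place sort with O(1) extra space (see sketch above)
          i, j = 0, len(X) - 1
          while i < j:
              s = X[i] + X[j]
              if s == T:
                  return X[i], X[j]
              elif s < T:
                  i += 1             # sum too small: move the left pointer right
              else:
                  j -= 1             # sum too large: move the right pointer left

      Apart from the in-place sort, this allocates only the two indexes i and j, hence the O(1) space complexity.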

    The examples given for time complexity above are (except the one regarding efficient matrix multiplication) pretty straightforward to derive. As others have said, I think that reading a book on the subject is your best bet at understanding this topic in depth. I highly recommend Cormen's book (Introduction to Algorithms).