Search code examples
arraysalgorithmsortingamortized-analysis

Amortized time of insertion in sorted array is O(n) and deletion is O(1)?


I am learning how to analyze algorithms and I found the notation of "Amortized time". I found some predefined estimations like:

-Amortized time of insertion in a sorted array is: O(n)

And Amortized time of Deletion from a sorted array is: O(1)

Can anyone explain it to me in detail, please!


Solution

  • The idea is to associate with each entry in the array a Boolean called deleted. Deleting an item consists of setting deleted to true. When there are too many deleted items, compact them out. If you make your compaction threshold a fraction of the total size, you can pay for the compaction from all the deletions required to reach the compaction point.

    Here's a sketch. It's incomplete but demonstrates the insertion and deletion algorithms.

    class sorted_array
    {
    public:
        typedef std::vector<std::pair<int, bool>>::iterator iterator;
    
        iterator insert(int value)
        {
            auto item = std::make_pair(value, false);
            return vec.insert(std::lower_bound(vec.begin(), vec.end(), item), item);
        }
    
        void erase(iterator pos)
        {
            pos->second = true; // deleted = true
            deleted_count++;
            if (deleted_count * 2 > vec.size())
            {
               vec.erase(std::remove_if(vec.begin(), vec.end(),
                                        std::get<1, int, bool>), vec.end());
               deleted_count = 0;
            }
        }
    
    private:
        size_t deleted_count = 0;
        std::vector<std::pair<int, bool>> vec;
    }
    

    Insertion is O(n) as usual. When we insert the element, we also mark it as not deleted.

    To delete a element, we merely mark it as deleted and bank two credits.

    When more than half of the elements in the vector are deleted, then that means that we have at least as many credits as there are elements in the vector. That means we can afford to run the O(n) compaction.

    To find an element, you run a traditional binary search, and then skip over deleted elements. Since at most half of the elements are deleted, the binary search operates on at most 2n elements, which means that it runs in O(log 2n) = O(log n) steps. There's a little bit of extra cost skipping forward past deleted items after the binary search completes, but some more cleverness in the data structure can reduce that to a constant. (Left as an exercise.)

    Similarly, iterating over the collection will take at most 2n steps (because at most half of the elements are deleted), which is still O(n).