Search code examples
algorithmhashheap

How to find the most frequent element in a stream of numbers with additions and deletions allowed


I am new to the forum but I have read the guidelines and checked for duplicates( This is the closest I found :(How to find the most frequent word in a word stream?) but this searches for numbers that occur more than 51 % of the time. Please point me to a duplicate if it already exists.

So my questions is given a stream of numbers, find the number that occurs most frequently. For example : 2,3,4,2,5 : Ans = 2. This is easy but what happens if I can delete and add new numbers. Example : 2,3,5,3,4,2,2 : Max = 2 Delete(2) : Max = 2 ; Delete(2) : Max = 3 ...

I have thought of a Max heap Along with a Hash Table that contains pointers to each node in the heap so that updation is O(log n) and finding the max is O(1). Is there a better solution ?


Solution

  • If you are interested mostly in fast updates, you can use any data structure that associates keys (the integers) with values (each integer's current appearance count). "Adding" and "removing" integers will be simply handled by incrementing and decrementing the appearance count.

    For containers based on binary trees the operations would be O(logN) while for a hashtable it would be O(1). In each case "find the maximum" would be O(N).

    If you are interested mostly in fast "find the maximum" then your proposed solution is as good as it gets.