Search code examples
data-structuresbrowser-history

data structure for sorted browsing history


Suppose i want to implement the browser history functionality. If i visit the the url for this first time it goes into the history , if i visit the same page again it comes up in the history list. lets say that i only display the top 20 sites, but i can choose to see history say for the last month , last week and so on .

what is the best approach for this ? i would use hash map for inserting / checking if it is visited earlier , but how do i sort efficiently for recently visited, i don't want to use tree map or tree set . also, how can i store history of weeks and months. Is it written on disk when browser closes ? and when i click clear history , how is the data structure deleted ?


Solution

  • This is in Java-ish code.

    You'll need two data structures: a hash map and a doubly linked list. The doubly linked list contains History objects (which contain a url string and a timestamp) in order sorted by timestamp; the hash map is a Map<String, History>, with urls as the key.

    class History {
      History prev
      History next
      String url
      Long timestamp
      void remove() {
        prev.next = next
        next.prev = prev
        next = null
        prev = null
      }
    }
    

    When you add a url to the history, check to see if it's in the hash map; if it is then update its timestamp, remove it from the linked list, and add it to the end of the linked list. If it's not in the hash map then add it to the hash map and also add it to the end of the linked list. Adding a url (whether or not it's already in the hash map) is a constant time operation.

    class Main {
      History first // first element of the linked list
      History last // last element of the linked list
      HashMap<String, History> map
    
      void add(String url) {
        History hist = map.get(url)
        if(hist != null) {
          hist.remove()
          hist.timestamp = System.currenttimemillis()
        } else {
          hist = new History(url, System.currenttimemillis())
          map.add(url, hist)
        }
        last.next = hist
        hist.prev = last
        last = hist
      }
    }
    

    To get the history from e.g. the last week, traverse the linked list backwards until you hit the correct timestamp.

    If thread-safety is a concern, then use a thread-safe queue for urls to be added to the history, and use a single thread to process this queue; this way your map and linked list don't need to be thread-safe i.e. you don't need to worry about locks etc.

    For persistence you can serialize / deserialize the linked list; when you deserialize the linked list, reconstruct the hash map by traversing it and adding its elements to the map. Then to clear the history you'd null the list and map in memory and delete the file you serialized the data to.

    A more efficient solution in terms of memory consumption and IO (i.e. (de)serialization cost) is to use a serverless database like SQLite; this way you won't need to keep the history in memory, and if you want to get the history from e.g. the last week you'd just need to query the database rather than traversing the linked list. However, SQLite is essentially a treemap (specifically a B-Tree, which is optimized for data stored on disk).