Tags: java, data-structures, binary-tree, treemap

How to use a binary tree with multiple duplicate key values, or another alternative


I am working on a Java project where I retrieve data about movies from a movie store website. I'm given a rating number x (an integer from 0 to 5) and a category y ("horror", for example). I have to return the movies from category y with a minimum rating of x. On top of that, I must use a data structure that does not iterate through every movie to retrieve this data.

At first I thought of implementing a binary tree for every category, where the key would be the rating (from 0 to 5). In this situation I would have some duplicate keys, but the values would be added to the tree anyway. It seemed to work at first, but I got stuck on something: if I want to retrieve movies with a minimum rating of x and there is just one movie with rating x (it is a leaf of the tree), I cannot find a way to retrieve the other movies rated >= x.

I also tried Java's TreeMap, but it does not allow duplicate keys, so it would not work for this project.

Can someone help me? I would love some suggestions on how to solve the issue with the binary tree, or on another alternative.


Solution

  • Binary trees in general may or may not support duplicate keys; it depends on the implementation. If you are using an implementation that does not support duplicate keys, such as Java's TreeMap, you still have several options.

    Use multimaps

    There is a structure called a multimap, which allows storing multiple values per key. You can store a tree of categories, as you do now, and then a multimap by rating for every category (instead of a map, as you are doing now). There is no multimap implementation in the standard Java collections, but there are many open-source implementations available, for example the Google Guava Multimap.
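    If adding a library is not an option, a multimap can be approximated with the standard collections by mapping each rating to a list of movies. The following is a minimal sketch under that assumption, not the Guava API: the class name MovieIndex, its methods, and the use of plain title strings are hypothetical, and a HashMap is used for the categories only because their order does not matter here. NavigableMap.tailMap(key, true) returns the entries whose key is greater than or equal to the given key, which is the "minimum rating x" query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index: category -> (rating -> titles with that rating).
// The inner TreeMap<Integer, List<String>> plays the role of a sorted multimap.
class MovieIndex {
    private final Map<String, TreeMap<Integer, List<String>>> byCategory = new HashMap<>();

    void add(String category, int rating, String title) {
        byCategory
            .computeIfAbsent(category, c -> new TreeMap<>())
            .computeIfAbsent(rating, r -> new ArrayList<>())
            .add(title);
    }

    // Returns the titles in the category rated minRating or higher,
    // ordered by ascending rating.
    List<String> atLeast(String category, int minRating) {
        List<String> result = new ArrayList<>();
        TreeMap<Integer, List<String>> ratings = byCategory.get(category);
        if (ratings == null) {
            return result; // unknown category
        }
        // tailMap(key, true) is a view of the entries with key >= minRating,
        // so lower-rated movies are never visited.
        for (List<String> titles : ratings.tailMap(minRating, true).values()) {
            result.addAll(titles);
        }
        return result;
    }
}
```

    With this layout, movies that share a rating simply sit in the same list, so duplicate ratings are no longer a problem for TreeMap.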

    Use a composite key

    Store all movies in a single TreeMap, using a composite key: a triple of (Category, Rating, GeneratedUniqueIntegerForDisambiguation). The comparator compares the category first; if the categories are equal, it compares the rating; and if the ratings are also equal, it compares the generated integer.

    TreeMap supports efficient range queries via subMap(fromKey, toKey). So the horror movies with a rating of at least 3 could be retrieved like this:

    map.subMap(new MyKey("horror", 3, Integer.MIN_VALUE), new MyKey("horror", 5, Integer.MAX_VALUE))
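    To make the composite-key idea concrete, here is one possible sketch. MyKey follows the name used in the call above, but its fields, the MovieStore wrapper, the id counter, and the use of title strings as values are all assumptions made for illustration. Note that the two-argument subMap(fromKey, toKey) excludes toKey, so this sketch uses the four-argument overload to make both bounds inclusive.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical composite key: (category, rating, unique id for disambiguation).
final class MyKey {
    final String category;
    final int rating;
    final int id;

    MyKey(String category, int rating, int id) {
        this.category = category;
        this.rating = rating;
        this.id = id;
    }

    // Compare by category first, then rating, then the generated id.
    static final Comparator<MyKey> ORDER =
        Comparator.comparing((MyKey k) -> k.category)
                  .thenComparingInt(k -> k.rating)
                  .thenComparingInt(k -> k.id);
}

class MovieStore {
    private static final int MAX_RATING = 5; // ratings run from 0 to 5
    private final TreeMap<MyKey, String> movies = new TreeMap<>(MyKey.ORDER);
    private int nextId = 0; // generated integer that keeps keys unique

    void add(String category, int rating, String title) {
        movies.put(new MyKey(category, rating, nextId++), title);
    }

    // Returns the titles in the category rated minRating or higher.
    List<String> atLeast(String category, int minRating) {
        if (minRating > MAX_RATING) {
            return new ArrayList<>(); // subMap would throw if fromKey > toKey
        }
        MyKey from = new MyKey(category, minRating, Integer.MIN_VALUE);
        MyKey to = new MyKey(category, MAX_RATING, Integer.MAX_VALUE);
        // Both bounds inclusive; only keys inside the range are visited.
        return new ArrayList<>(movies.subMap(from, true, to, true).values());
    }
}
```

    Because the category is the leading component of the key, all movies of one category are contiguous in the map, and the range never spills into a neighbouring category.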