Search code examples
intervalscomputational-geometrytheory

Find a start and an ending of an interval which intersects with the table of others the most


Here's a task:

Curators of the summer flowers exhibition were preparing for the event. They had to find a time during which the biggest amount of flowers were blooming. They've had a table which contained (0<n<30) flowers and their properties:

ID, the start and the end of blooming season (StartMonth, StartDay, EndMonth, EndDay)

Help the curators find the earliest time during which most flowers bloom(the start and the end of the interval) for the exhibition.

Note: the right time does not necessarily have to be an already existing interval

Since this is an assignment I'd like to do the most of it by myself, I'd just like to get an advice on how to theoretically solve this problem.

A sample of the blooming seasons:

  • In a "txt" file:

121 6 15 7 25

102 7 1 8 14

236 6 30 8 31

141 7 31 8 10

111 7 1 7 20

128 6 2 6 3

  • Visual representation (Not the same data as in "txt")

enter image description here

To start with I'll introduce my approach:

  1. The first thing I did is converted the interval's Month and Day to one number format (example in a MM-DD: 08-15 I've changed to 30 + 31 + 15 = 76)
  2. After that, for the first time I check through all of the intervals and save separately one which has the most intersections.
  3. Later I check all the intervals again but this time I compare them specifically to the saved one, if they do not intersect with it they get deleted, in this way I reduce the noise.
  4. At this point I thought that all that was left to do was to find the smallest interval that intersects with the saved one. However, I was wrong. I've not taken into consideration that the smallest interval could have only had one intersection.
  5. After a while I've come to a realization that I need to find an interval which has the most intersections and is smallest out of the given set. But how can I find both?
  6. No doubt there are even more holes in my theory. I'm pretty sure that even if I pull it off it'll still have tons of exceptions.

Now thinking about it an idea popped in my mind is that i could try to just make a struct of 92 days and then count which days contain most blooming seasons. This would be pretty easy but I wouldn't be happy with this solution since it would be too inefficient. (Chuckles) What about when compared to my current code which matches intervals again and again. Would it still be inefficient? :D

I've also seached the internet for the answer but managed to find only the half of it.

All in all I'd love to hear other opinions. What are the ways to properly deal with this? Is my approach reasonable or should I look for another one? Your help would be greatly appreciated. Again I'm not asking for a full soliution, just nudge towards the answer. Thanks in advance.


Solution

  • Sort all interval bounds by increasing date, weighting them +1 for a beginning and -1 for an ending.

    Now scan the list in chronological order, adding the point weights as you go. Every time, this gives the number of different blooming flowers. In your case, 012345456... 0.

    At the same time, keep the bound giving the highest count so far (in case of ties, keep the earliest value). The end of the highest interval is simply the next bound.


    If several intervals begin or end simultaneously, process the ones ending first, to avoid zero-duration solutions. (To achieve this, it suffices to craft a comparison function such that in case of tie, -1 has precedence.)