Search code examples
pythondata-structures

Representing a Range of values in Python


I'm interested in representing a range, similar to Guava's Range type, in Python. Specifically, it should have a start and end, and represent all values between the two (as a first pass, I'm fine with only representing the canonical open-closed range, i.e. [5,10), but proper representation of any open/closed range would be a reasonable feature).

I'm aware of the range() builtin, but my intent is to support arbitrary types (or specifically dates, for my use case).

Looking at Python's type hierarchy, it seems a range could be a Sequence or Set type fairly logically, but I'm unsure which makes more sense, of if it would be better to forgo shoehorning my class into that hierarchy and simply implement the behavior I want.

As a Sequence:

  • Fits the spec fairly well, it's a "finite ordered set".
  • A range can be counted, sliced, and iterated over.
  • However I potentially want to support unbounded ranges, e.g. [0,+∞), so maybe the above isn't true.

As a Set:

  • Slightly less to-spec, as a range is explicitly ordered
  • Conceptually more like a range, as set-theoretic operations like intersection and union make more sense
  • Properly represents that contains checks are efficient

As a separate structure:

  • We lose the benefits of following the patterns the above types (we'd have to define a separate range.slice() method, for instance)
  • But we're more explicit that this structure should not be confused with these types either. The fact that Guava's Range doesn't extend from the Collection API seems to back this argument up.

I'm curious what seems most Pythonic here, and if anyone's made any such data structures themselves.


Solution

  • Here's the implementation I've come up with so far. A Range object represents an arbitrary openClosed range, and is hash-able, contain-able, and iter-able, but is neither a sequence nor a set. The DateRange subclass represents ranges of dates, which primarily simply requires defining the increment argument as timedelta(days=1) rather than simply 1.

    class Range:  
      '''
      Represents a range, in the spirit of Guava's Range class.
      Endpoints can be absent, and (presently) all ranges are openClosed.
      There's little reason to use this class directly, as the range()
      builtin provides this behavior for integers.
      '''
      def __init__(self, start, end, increment=1):
        if start and end and end < start:
          raise ValueError("End date cannot be before start date, %s:%s" % (start,end))
        self.start = start
        self.end = end
        self.increment = increment
    
      def __repr__(self):
        return '[%s\u2025%s)' % (
          self.start or '-\u221E',
          self.end   or '+\u221E'
        )
    
      def __eq__(self, other):
        return self.start == other.start and self.end == other.end
    
      def __hash__(self):
        return 31*hash(self.start) + hash(self.end)
    
      def __iter__(self):
        cur = self.start
        while cur < self.end:
          yield cur
          cur = cur + self.increment
    
      def __contains__(self, elem):
        ret = True
        if self.start:
          ret = ret and self.start <= elem
        if self.end:
          ret = ret and elem < self.end
        return ret
    
    class DateRange(Range):
      '''A range of dates'''
      one_day = timedelta(days=1)
    
      @staticmethod
      def parse(daterange):
        '''Parses a string into a DateRange, useful for
        parsing command line arguments and similar user input.
        *Not* the inverse of str(range).'''
        start, colon, end = daterange.partition(':')
        if colon:
          start = strToDate(start) if start else None
          end = strToDate(end) if end else None
        else:
          start = strToDate(start)
          end = start + DateRange.one_day
        return DateRange(start, end)
    
      def __init__(self, start, end):
        Range.__init__(self, start, end, DateRange.one_day)
    
    def strToDate(date_str):
      '''Parses an ISO date string, such as 2014-2-20'''
      return datetime.datetime.strptime(date_str, '%Y-%m-%d').date()
    

    Some usage examples:

    >>> DateRange(datetime.date(2014,2,20), None)
    [2014-02-20‥+∞)
    >>> DateRange(datetime.date(2014,1,1), datetime.date(2014,4,1))
    [2014-01-01‥2014-04-01)
    >>> DateRange.parse(':2014-2-20')
    [-∞‥2014-02-20)
    >>> DateRange.parse('2014-2-20:2014-3-22')
    [2014-02-20‥2014-03-22)
    >>> daterange = DateRange.parse('2014-2-20:2014-3-2')
    >>> daterange
    [2014-02-20‥2014-03-02)
    >>> datetime.date(2014,1,25) in daterange
    False
    >>> datetime.date(2014,2,20) in daterange
    True
    >>> list(daterange)
    [datetime.date(2014, 2, 20), datetime.date(2014, 2, 21), datetime.date(2014, 2, 22),
     datetime.date(2014, 2, 23), datetime.date(2014, 2, 24), datetime.date(2014, 2, 25),
     datetime.date(2014, 2, 26), datetime.date(2014, 2, 27), datetime.date(2014, 2, 28),
     datetime.date(2014, 3, 1)]