Is the following a common data type (i.e. does it have a name)?
Its unique characteristic is, unlike a regular Set
, that it contains the "universe" on initialisation with O(C)
memory overhead, and a max memory overhead of O(N/2)
(which only occurs when you remove every-other element):
> s = new Structure(701)
s = Structure(0-700)
> s.remove(100)
s = Structure(0-99, 101-700)
> s.add(100)
s = Structure(0-700)
> s.remove(200)
s = Structure(0-199, 201-700)
> s.remove(202)
s = Structure(0-199, 201, 203-700)
> s.removeAll()
s = Structure()
Does something like this have a standard name?
It's a Compressed Bit Set
or Compressed Bitmap
.
A Bit Set
or Bitmap
is a set specifically designed for storing Integer
s. Most languages offer standard implementations of these. They typically work by assigning a 1
to the N
th bit in an internal array of Integers
where N
is the number you're adding to the set. 0
indicates the value is not present. The memory usage for these types of Bit Sets
is dictated by the largest number you store.
A Compressed Bit Set
is one that compacts ranges of 0
s and 1
s.
In this case, the question demonstrates a type of compression called "run-length-encoding" (thank you @Ralf Kleberhoff), so it is specifically a Run-length Encoded Bitmap
.
Common implementations of Compressed Bitmaps
(from newest-to-oldest) are: