
Generate Regex Digit Ranges from Sorted List of Digits


Suppose I have some sorted lists of integers and I want to convert them to their respective regex digit ranges, like so:

  1. [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] => [0-9]
  2. [0, 1, 2, 3, 4, 6, 7, 8, 9] => [0-46-9]
  3. [0, 1, 3, 4, 5, 8, 9] => [013-589]
  4. [0, 2, 4, 6, 8] => [02468]

I am not trying to regex match anything here. I am trying to generate a regex range from a set of digits.

I am really just looking to see if there is already some de facto algorithm for doing something like this.

Edit: Based on @Jerry_Coffin's answer, a Java-based algorithm:

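import java.util.Arrays;
import java.util.List;

public class DigitRange {
  public static void main(String[] args) {
    List<Integer> digits = Arrays.asList(0, 1, 3, 4, 5, 8, 9);
    StringBuilder digitRange = new StringBuilder().append('[');
    // Number of digits already seen in the current run, excluding the current one.
    int consecutive = 0;
    for (int i = 0; i < digits.size(); i++) {
      if (i == digits.size() - 1 || digits.get(i) + 1 != digits.get(i + 1)) {
        // The run ends here: use a dash only if the run has three or more digits.
        if (consecutive > 1) {
          digitRange.append('-');
        }
        digitRange.append(digits.get(i));
        consecutive = 0;
      } else {
        // The run continues: write only its first digit.
        if (consecutive == 0) {
          digitRange.append(digits.get(i));
        }
        consecutive++;
      }
    }
    digitRange.append(']');
    System.out.println(digitRange.toString());
  }
}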

Output: [013-589]

Feel free to find improvements or problems.


Solution

  • Presumably you're starting from sorted input (if not, you almost certainly want to start by sorting the input).

    From there, take the first unprocessed item and write it out. Walk forward through the numbers for as long as they are consecutive. If the run contains more than two numbers, write a dash followed by the last number in the run. If the run contains two or fewer, write them to the output as-is.

    Repeat until you reach the end of the input.
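
The steps above can be sketched roughly as follows. This is only one possible reading of the description, not a canonical implementation: the class name `DigitRanges` and the method `toCharClass` are made up for illustration, and the input is assumed to be a sorted list of distinct single digits.

```java
import java.util.Arrays;
import java.util.List;

class DigitRanges {
    // Hypothetical helper: turns a sorted list of distinct digits (0-9)
    // into a regex character class such as "[013-589]".
    // Assumption: the caller has already sorted and de-duplicated the input.
    static String toCharClass(List<Integer> sortedDigits) {
        StringBuilder out = new StringBuilder("[");
        int i = 0;
        while (i < sortedDigits.size()) {
            // Walk forward while the next digit is exactly one greater.
            int j = i;
            while (j + 1 < sortedDigits.size()
                    && sortedDigits.get(j + 1) == sortedDigits.get(j) + 1) {
                j++;
            }
            int runLength = j - i + 1;
            // Always write the first digit of the run.
            out.append(sortedDigits.get(i));
            if (runLength > 2) {
                // More than two consecutive: dash, then the last digit.
                out.append('-').append(sortedDigits.get(j));
            } else if (runLength == 2) {
                // Exactly two: write both as-is, no dash.
                out.append(sortedDigits.get(j));
            }
            // Continue with the first unprocessed item.
            i = j + 1;
        }
        return out.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(toCharClass(Arrays.asList(0, 1, 3, 4, 5, 8, 9)));
    }
}
```

Finding the end of each run first and then deciding how to write it keeps the "more than two" rule in one place. An empty input yields `[]`, which most regex engines do not accept as a character class, so a caller would probably want to handle that case separately.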