Search code examples
sqloraclealgorithmrangecellular-network

number range format conversion


Given a range of cellular phone numbers in a form of ['7810000000', '7819999999'] I need an algorithm that would generate multiple rows of smallest possible length which would entierely cover the given range if postfixed with '%'. For instance, the range from above would be represented as a single row '781'. In other words any number from the range can be represented as 781%. This representaton can be usefull for storing tarrifs, for instance. A single row in a database could be used to rate the entire range. There are many other tasks for which it's preferrable to have this format. An alghorithm for the range of ['526251630000','526251634999'] would yeild

52625160
52625161
52625162
52625163
52625164

For the range ['12300345','12367000'] we should get

12300345
12300346
12300347
12300348
12300349
1230035
1230036
1230037
1230038
1230039
123004
123005
123006
123007
123008
123009
12301
12302
12303
12304
12305
12306
12307
12308
12309
1231
1232
1233
1234
1235
12361
12362
12363
12364
12365
12366
12367000

We need this conversion to be done in Oracle(SQL/PLQSL). Any information or links would be highly appreciated. Thank you in advance.


Solution

  • Brute-force approach, with range start/end as bind variables:

    var start_val varchar2(14);
    var end_val varchar2(14);
    
    exec :start_val := '526251630000';
    exec :end_val := '526251634999';
    
    with dns (dn) as (
      select to_number(:start_val) + level - 1
      from dual
      connect by level <= to_number(:end_val) - to_number(:start_val) + 1
    ),
    exponents (ex) as (
      select level - 1 from dual
      connect by level <= length(:start_val)
    ),
    tmp (dn, ex, pow, root, cnt ) as (
      select d.dn,
        e.ex,
        power(10, e.ex),
        trunc(d.dn / power(10, e.ex)),
        count(d.dn) over (partition by trunc(d.dn / power(10, e.ex)))
      from dns d
      cross join exponents e
    )
    select distinct to_char(min(root) keep (dense_rank first order by ex desc)
        over (partition by dn)) as result
    from tmp
    where pow = cnt
    order by result;
    

    which gets

    RESULT                                  
    ----------------------------------------
    526251630
    526251631
    526251632
    526251633
    526251634
    

    The dns CTE expands the range into all the values it contains. The exponents CTE generates all possible powers of 10 you might want to use (i.e. how many digits to lop off the right of the range value). The tmp CTE then cross-joins those and counts how many individual numbers appear in each 'sub-range' calculated by dividing by a power of 10. The final query then filters on whole sub-ranges - where the number of values in that range is the same as the size of the sub-range, so there are no missed splits - and finds the smallest (i.e. shortest) sub-range root for each DN. And then finally gets the distinct values of those.

    For the second range:

    exec :start_val := '12300345';
    exec :end_val := '12367000';
    

    the same query gets:

    RESULT                                  
    ----------------------------------------
    12300345
    12300346
    12300347
    12300348
    12300349
    1230035
    1230036
    1230037
    1230038
    1230039
    123004
    123005
    123006
    123007
    123008
    123009
    12301
    12302
    12303
    12304
    12305
    12306
    12307
    12308
    12309
    1231
    1232
    1233
    1234
    1235
    12360
    12361
    12362
    12363
    12364
    12365
    12366
    12367000
    

    The CTEs and cross-join mean this is doing a lot of work, and it will slow down as the ranges get bigger - possibly to the point it fails.

    I'm sure there are more efficient ways to do this and hopefully someone will come up with one, but this might get you started.