
Removing consecutive numbers from a sequence in XQuery



Input: (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)

Output: (1,7,14,17,24,28)

I tried to remove the consecutive numbers from the input sequence using XQuery functions, but failed to do so:

    xquery version "1.0" encoding "utf-8";

    declare namespace ns1="http://www.somenamespace.org/types";

    declare variable $request as xs:integer* external;

    declare function local:func($reqSequence as xs:integer*) as xs:integer* {

    let $nonRepeatSeq := for $count in (1 to count($reqSequence)) return
                          if ($reqSequence[$count+1] - $reqSequence) then
                          remove($reqSequence,$count+1)
                          else ()
    return
    $nonRepeatSeq
    };

    local:func((1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28))

Please suggest how to do this in XQuery, in a functional style.


Solution

  • There are two simple ways to do this in XQuery. Both rely on being able to assign the sequence of values to a variable, so that we can look at pairs of individual members of it when we need to.

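    First, just iterate over the values and select (a) the first value, (b) any value which is not one greater than its predecessor, and (c) any value which is not one less than its successor. [OP points out that the last value also needs to be included. As written, the expression drops it: at the last position $vseq[$pos + 1] is the empty sequence, a value comparison against the empty sequence yields the empty sequence, and its effective boolean value is false. Fixing that is left as an exercise for the reader. Or see Michael Kay's answer, which provides a terser formulation of the filter; De Morgan's law strikes again!]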

    let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
    for $v at $pos in $vseq
    return if ($pos eq 1
               or $vseq[$pos - 1] ne $v - 1
               or $vseq[$pos + 1] ne $v + 1)
           then $v 
           else ()
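
    One way to do the exercise mentioned above is to test for the last position explicitly. This is only a sketch: the helper variable $n is introduced here for illustration and is not part of the original answer.

```xquery
xquery version "1.0";

(: Sketch: keep the first value, the last value, and every value
   that has a gap on at least one side. $n is a helper name
   introduced here; it is not in the original answer. :)
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
let $n := count($vseq)
for $v at $pos in $vseq
return if ($pos eq 1
           or $pos eq $n
           or $vseq[$pos - 1] ne $v - 1
           or $vseq[$pos + 1] ne $v + 1)
       then $v
       else ()
```

    For the sample input this should yield (1, 7, 14, 17, 24, 28), the output asked for in the question.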
    

    Or, second, do roughly the same thing in a filter expression:

    let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
    return $vseq[
        for $i in position() return 
            $i eq 1 
            or . ne $vseq[$i - 1] + 1 
            or . ne $vseq[$i + 1] - 1]
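
    The De Morgan rewriting alluded to earlier can be sketched in the filter form as well. What follows is one possible formulation, not necessarily the one given in the other answer: instead of listing the reasons to keep a value, it drops a value only when it continues a run on both sides.

```xquery
xquery version "1.0";

(: Sketch: drop a value only when it continues a run on BOTH sides.
   At either end of the sequence one comparison yields the empty
   sequence, so the "and" is false and the value is kept. :)
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
return $vseq[
    for $i in position() return
        not(. eq $vseq[$i - 1] + 1
            and . eq $vseq[$i + 1] - 1)]
```

    Because a missing neighbour makes the conjunction false, the first and last values need no special case here; for the sample input the result should be (1, 7, 14, 17, 24, 28).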
    

    The primary difference between these two ways of performing the calculation and your non-working attempt is that they don't say anything about changing or modifying the sequence; they simply specify a new sequence. By using a filter expression, the second formulation makes explicit that the result will be a subsequence of $vseq. The for expression makes no such guarantee in general, although because for each value it returns either the empty sequence or the value itself, we can see that here too the result will be a subsequence: a copy of $vseq from which some values have been omitted.
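
    A small illustration of the point, using the remove function that appears in the question: it does not alter its argument, it returns a new sequence. The variable name $shorter is made up for this sketch.

```xquery
xquery version "1.0";

(: fn:remove returns a new sequence; $vseq itself is unchanged.
   $shorter is a name introduced for this illustration only. :)
let $vseq := (1, 2, 3)
let $shorter := remove($vseq, 2)
return (count($vseq), count($shorter))
```

    The counts returned should be 3 and 2: $vseq still has all three members, and only the newly specified sequence is shorter. Calling remove inside a loop therefore never shrinks the sequence being iterated over.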

    Many programmers find it difficult to stop thinking in terms of assignment to variables or modification of data structures, but it's worth the effort.

    [Addendum] I may be overlooking something, but I don't see a way to express this calculation in pure XPath 2.0, since XPath 2.0 seems not to have any mechanism that can bind a variable like $vseq to a non-singleton sequence of values. (XPath 3.0 has let expressions, so it's not a challenge there. The second formulation above is itself pure XPath 3.0.)