Search code examples
xmlloopsgroup-byxquery

Xquery group by on key fields and concatenate elements


I have a large well formated XML payload. I need to look through the XML and based on a key SupporterID and CampaignNumber, group the records and concatenate the CampaignID field with * delimeters. Along with retaining the 000123 value for CampaignData35.

In the input data there are 4 rows of data, I need them to be 1 row along with doing a case statement to lookup the values and return another set. I have tried the case statements seperately but get a sequence error...

Any help would be much appreciated.

INPUT Data below:

let $payload:= <results
    xmlns:aetgt="http://"
    xmlns:bconn="http://">
    <row>
        <AccountID>"2404"</AccountID>
        <SupporterID>"254356368"</SupporterID>
        <DateCreated>"2023-02-11"</DateCreated>
        <DateModified/>
        <CampaignNumber>"275077"</CampaignNumber>
        <CampaignType>"QCB"</CampaignType>
        <CampaignID>"Phone"</CampaignID>
        <CampaignData35>""</CampaignData35>
    </row>
    <row>
        <AccountID>"2404"</AccountID>
        <SupporterID>"254356368"</SupporterID>
        <DateCreated>"2023-02-11"</DateCreated>
        <DateModified/>
        <CampaignNumber>"275077"</CampaignNumber>
        <CampaignType>"QCB"</CampaignType>
        <CampaignID>"Email"</CampaignID>
        <CampaignData35>""</CampaignData35>
    </row>
    <row>
        <AccountID>"2404"</AccountID>
        <SupporterID>"254356368"</SupporterID>
        <DateCreated>"2023-02-11"</DateCreated>
        <DateModified/>
        <CampaignNumber>"275077"</CampaignNumber>
        <CampaignType>"QCB"</CampaignType>
        <CampaignID>"Post"</CampaignID>
        <CampaignData35>""</CampaignData35>
    </row>
    <row>
        <AccountID>"2404"</AccountID>
        <SupporterID>"254356368"</SupporterID>
        <DateCreated>"2023-02-11"</DateCreated>
        <DateModified/>
        <CampaignNumber>"275077"</CampaignNumber>
        <CampaignType>"QCB"</CampaignType>
        <CampaignID>"HELLO"</CampaignID>
        <CampaignData35>"000123"</CampaignData35>
    </row>
</results>

I want it to look like this:

<results
    xmlns:aetgt="http://"
    xmlns:bconn="http://">
    <row>
        <AccountID>"2404"</AccountID>
        <SupporterID>"254356368"</SupporterID>
        <DateCreated>"2023-02-11"</DateCreated>
        <DateModified/>
        <CampaignNumber>"275077"</CampaignNumber>
        <CampaignType>"QCB"</CampaignType>
        <CampaignID>"123*456*789"</CampaignID>
        <CampaignData35>"000123"</CampaignData35>
    </row>
    </row>
</results>

Thanks

I tried the following which gave me the sequence error:

let $post :=
  for $i in $payload//row
        return
          if ($i/CampaignID/text() eq '"Post"') then '123'
          else ''

let $email :=
  for $i in $payload//row
        return
           if ($i/CampaignID/text() eq '"Email"') then '456'
            else ''

let $phone :=
  for $i in $payload//row
        return
           if ($i/CampaignID/text() eq '"Phone"') then '789'
            else ''

return 
  $post || '*' || $email || '*' || $phone
            

EDIT: I managed to make a little progress:

let $x:= $payload
  for $z in distinct-values($x//row/SupporterID)
let $c := $x//row[SupporterID=$z]/CampaignID

    return
      concat('', string-join($c, "*"))

Got this result:

"Phone"*"Email"*"Post"*"HELLO"

But I need to rebuilt the XML to bring back the entire row and all fields just replacing the above.


Solution

  • The key in achieving your goal is to use the XQuery FLWOR expression's group by clause.

    Here's one solution, which performs the following steps, derived from your description:

    1. Groups the <row> elements by a key derived from the <SupporterID> and <CampaignNumber> elements.
    2. Constructs a single new <row> element for each group, containing all child elements of one <row> element (taking the 1st from the group for example), except for the <CampaignID> child element - since you want to manipulate this element's contents.
    3. Constructs a new <CampaignID> child element, whose value consists of the original <CampaignID> elements mapped onto different values, sorted, and delimited by an * and wrapped in quotes.
    for $row in $payload/row
    group by $key := $row/SupporterID || "_" || $row/CampaignNumber
    return 
        <row>{
            $row[1]/* except $row[1]/CampaignID, 
            <CampaignID>"{
                let $new-values :=
                    for $value in $row/CampaignID
                    return
                        switch ($value)
                            case '"Post"' return "123"
                            case '"Email"' return "456"
                            case '"Phone"' return "789"
                            default return ()
                return
                    string-join(sort($new-values), "*")
            }"</CampaignID>
        }</row>
    

    For more on the group by clause, see https://www.w3.org/TR/xquery-31/#id-group-by.