Search code examples
xmlxqueryexist-db

XQuery: "order by" doesn't work in "for-each-pair"


I have a collection 1 of xml files. The typical xml files in this collection is as follows:

Typical XML files from Collection 1

<book xml:id="1">
    <body>
        <collection>Name of collection 1</collection>
        <title>Title of book</title>
        <author>Author of book</author>
        <date>Date of publication of book</date>
        <text>This is a text of book .....</text>
    </body>
</book>

I query these files like this (I have to use XQuery and cannot use XSL here, it is beyond my will. My whole query is much more complex, here I use a simple version to show, at best, my code and my problem):

XQuery for XML files from Collection 1

for $book in collection("path_to_collection_1")
    let $collection_name := data($book/collection)
    let $title := data($book/title)
    let $author := data($book/author)
    let $date := data($book/date)

    where $date >= '2000'
    order by $date
return
<book>
    <collection_name>{$collection_name}</collection_name>
    <title>{$title}</title>
    <author>{$author}</author>
    <date>{$date}</date>
</book>

Result of XQuery for XML files from Collection 1 (pay attention to date)

1
<book>
    <collection_name>Collection 1</collection_name>
    <title>This is a best novel in the word</title>
    <author>Lebowski, Big</author>
    <date>2001</date>
</book>

2
<book>
    <collection_name>Collection 1</collection_name>
    <title>Another greate book</title>
    <author>Lincoln, Abraham</author>
    <date>2005</date>
</book>

3
<book>
    <collection_name>Collection 1</collection_name>
    <title>The story of my life</title>
    <author>Mouse, Mickey</author>
    <date>2012</date>
</book>
    etc. etc. etc.

Ok, it works well.

Also, I have a collection 2. The XML files from this second collection is basically the same as files from collection 1, but there is some little difference in the structure or in the name of some nodes, so I have to use a different query.

Typical XML files from Collection 2

<collection>
    <this_is_book>
        <data_about_book>
            <name_of_collection>Name of collection 2</name_of_collection>
            <title>Title of book</title>
            <author>Author of book</author>
            <date_of_publication>Date of publication of book</date_of_publication>
        </data_about_book>
        <text_of_book>
            <text>This is a text of book.....</text>
        </text_of_book>
    </this_is_book>
</collection>

XQuery for XML files from Collection 2

for $book in collection("path_to_collection_2")
    let $collection_name := data($book//name_of_collection)
    let $title := data($book//title)
    let $author := data($book//author)
    let $date := data($book//date_of_publication)

    where $date >= '2000'
    order by $date
return
<book>
    <collection_name>{$collection_name}</collection_name>
    <title>{$title}</title>
    <author>{$author}</author>
    <date>{$date}</date>
</book>

Result of XQuery for XML files from Collection 2 (pay attention to date)

1
<book>
    <collection_name>Collection 2</collection_name>
    <title>Why I crossed the Rubicon river</title>
    <author>Caesar, Julius</author>
    <date>2003</date>
</book>

2
<book>
    <collection_name>Collection 2</collection_name>
    <title>Colonize it!</title>
    <author>Musk, Elon</author>
    <date>2007</date>
</book>

3
<book>
    <collection_name>Collection 2</collection_name>
    <title>I have no more imagination</title>
    <author>Lennon, John</author>
    <date>2011</date>
</book>
    etc. etc. etc.

Ok, it works well.

My first problem was to combine the results from two queries (I have to able to query two collections in the same XQuery). To achieve this I use function for-each-pair (this was a subject of another question of mine here, many thanks to Martin Honnen for his answer and suggestions).

So, the simple schema of my current code is as follows:

for-each-pair(
  collection("path_to_collection_1"),
  collection("path_to_collection_2"),
  function($collection1, $collection2) {
    for $book in $collection1
        ... etc. (see above the whole query for 'collection 1')
    ,
    for $book in $collection2
        ... etc. (see above the whole query for 'collection 2')
  }
)

My current result (pay attention to date)

1
<book>
    <collection_name>Collection 1</collection_name>
    <title>Another greate book</title>
    <author>Lincoln, Abraham</author>
    <date>2005</date>
</book>

2
<book>
    <collection_name>Collection 1</collection_name>
    <title>This is a best novel in the word</title>
    <author>Lebowski, Big</author>
    <date>2001</date>
</book>

3
<book>
    <collection_name>Collection 1</collection_name>
    <title>The story of my life</title>
    <author>Mouse, Mickey</author>
    <date>2012</date>
</book>

4
<book>
    <collection_name>Collection 2</collection_name>
    <title>Colonize it!</title>
    <author>Musk, Elon</author>
    <date>2007</date>
</book>

5
<book>
    <collection_name>Collection 2</collection_name>
    <title>Why I crossed the Rubicon river</title>
    <author>Caesar, Julius</author>
    <date>2003</date>
</book>

6
<book>
    <collection_name>Collection 2</collection_name>
    <title>I have no more imagination</title>
    <author>Lennon, John</author>
    <date>2011</date>
</book>
    etc. etc. etc.

So results from the second query is appended to the results from the first one. But there is a problem: the order by $date clause doesn't work anymore...

My questions and desired result:

  1. At least. How to apply order by clause inside of each result? At least, I would like to have result from collection 1 ordered by date + result from collection 2 ordered by date.

  2. At best. Is it possible to apply order by clause to whole result (i.e. to combined result from collections 1 and 2)? At best, I would like to have result from collection 1 + result from collection 2 and then order all this by date)? Or, for this, I have to completely change my approach and use other things than for-each-pair function? In this case, which one?

An example of desired (at best) result. (pay attention to date and to collection_name)

1
<book>
    <collection_name>Collection 1</collection_name>
    <title>This is a best novel in the word</title>
    <author>Lebowski, Big</author>
    <date>2001</date>
</book>

2
<book>
    <collection_name>Collection 2</collection_name>
    <title>Why I crossed the Rubicon river</title>
    <author>Caesar, Julius</author>
    <date>2003</date>
</book>

3
<book>
    <collection_name>Collection 1</collection_name>
    <title>Another greate book</title>
    <author>Lincoln, Abraham</author>
    <date>2005</date>
</book>

4
<book>
    <collection_name>Collection 2</collection_name>
    <title>Colonize it!</title>
    <author>Musk, Elon</author>
    <date>2007</date>
</book>

5
<book>
    <collection_name>Collection 2</collection_name>
    <title>I have no more imagination</title>
    <author>Lennon, John</author>
    <date>2011</date>
</book>

6
<book>
    <collection_name>Collection 1</collection_name>
    <title>The story of my life</title>
    <author>Mouse, Mickey</author>
    <date>2012</date>
</book>
    etc. etc. etc.

Many thanks in advance for any help!

EDIT 1: I use the order by date as an example. In real code, I must be able to order by different elements (chosen by user) as order by title, order by author, etc. (But I don't need to apply a many different orders by at the same time.) So, maybe the solution should be more "generic" and not be particularly attached only to order by date?


Solution

  • Actually it was a quite simple. Just need to put the combined result from for-each-pair as sequence in a variable. And after loop through all items of this sequence and sort them by whatever you want. And voilà!

    There are the simple schema of this approach:

    let $results_not_sorted := (
        for-each-pair (
          collection("path_to_collection_1"),
          collection("path_to_collection_2"),
          function($collection1, $collection2) {
            for $book in $collection1
                ... etc. (see above the whole query for 'collection 1')
            ,
            for $book in $collection2
                ... etc. (see above the whole query for 'collection 2')
          }
        )
    ) 
    for $result in $results_not_sorted
        order by $result//date (:sort here by whatever you want)
    return
        $result
    

    And the detailed code:

    let $results_not_sorted := (
        for-each-pair (
          collection("path_to_collection_1"),
          collection("path_to_collection_2"),
          function($collection1, $collection2) {
            for $book in $collection1
                let $collection_name := data($book/collection)
                let $title := data($book/title)
                let $author := data($book/author)
                let $date := data($book/date)
                where $date >= '2000'
            return
                <book>
                    <collection_name>{$collection_name}</collection_name>
                    <title>{$title}</title>
                    <author>{$author}</author>
                    <date>{$date}</date>
                </book>
            ,
            for $book in $collection2
                let $collection_name := data($book//name_of_collection)
                let $title := data($book//title)
                let $author := data($book//author)
                let $date := data($book//date_of_publication)
                where $date >= '2000'
            return
                <book>
                    <collection_name>{$collection_name}</collection_name>
                    <title>{$title}</title>
                    <author>{$author}</author>
                    <date>{$date}</date>
                </book>
          }
        )
    ) 
    for $result in $results_not_sorted
        order by $result//date (:sort here by whatever you want)
    return
        $result