
How to get the difference in URIs between two collections in MarkLogic


I have two collections.

I need to get the difference in URIs between the two collections, based on the file name.

Example scenario:

Collection 1:

/data/1.xml
/data/2.xml
/data/3.xml


Collection 2:

/test/1.xml
/test/2.xml
/test/3.xml
/test/4.xml
/test/5.xml

Expected output:
/data/1.xml
/data/2.xml
/data/3.xml
/test/4.xml
/test/5.xml

Solution

  • Two alternative solutions:

    First approach: put every URI into a map under a consistent key (the substring after the last slash), and then select the first item stored in the map for each key:

    let $x := (
      "/data/1.xml",
      "/data/2.xml",
      "/data/3.xml")
    let $y := (
      "/test/1.xml",
      "/test/2.xml",
      "/test/3.xml",
      "/test/4.xml",
      "/test/5.xml")
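    (: Map from file name to every URI that ends in that file name. :)
    let $intersection := map:map()
    let $_ := ($x, $y) ! (
      (: Key on the segment after the last slash, e.g. "1.xml". :)
      let $key := tokenize(., "/")[last()]
      return
        map:put($intersection, $key, (map:get($intersection, $key), .))
    )
    return
      for $key in map:keys($intersection)
      (: $x comes first in ($x, $y), so [1] prefers its URI. :)
      for $uri in map:get($intersection, $key)[1]
      order by number(replace($uri, ".*/(\d+)\.xml", '$1'))
      return $uri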
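
    To make the keying step concrete, here is a minimal sketch of the key derivation on its own, assuming the same sample URIs as in the question. It only illustrates that URIs from different directories collapse to the same key when their file names match.

    ```xquery
    xquery version "1.0-ml";

    (: The key is the last "/"-separated segment, i.e. the file name. :)
    for $uri in ("/data/1.xml", "/test/1.xml", "/test/4.xml")
    return tokenize($uri, "/")[last()]
    ```

    Both /data/1.xml and /test/1.xml yield the key 1.xml, which is why only one of them survives in the result.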
    

    Second approach: store only the first item seen for a given key:

    let $x := (
      "/data/1.xml",
      "/data/2.xml",
      "/data/3.xml")
    let $y := (
      "/test/1.xml",
      "/test/2.xml",
      "/test/3.xml",
      "/test/4.xml",
      "/test/5.xml")
    
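    (: Map from file name to the first URI seen with that file name. :)
    let $intersection := map:map()
    let $_ := ($x, $y) ! (
      let $key := tokenize(., "/")[last()]
      return
        (: Skip this URI if an earlier one already claimed the key. :)
        if (fn:exists(map:get($intersection, $key))) then ()
        else map:put($intersection, $key, .)
    )
    return
      (: Look up each key explicitly rather than passing a sequence of keys. :)
      for $uri in map:keys($intersection) ! map:get($intersection, .)
      order by number(replace($uri, ".*/(\d+)\.xml", '$1'))
      return $uri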
    

    The order by is optional, but map keys are not returned in a guaranteed order. Customize it for what you need (e.g. /data/ URIs first, and then /test/ URIs), or remove it if you don't care about the order of the URIs.
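
    As a rough sketch of that customization, the query below reads the URIs from the collections instead of hard-coding them, then sorts /data/ URIs ahead of /test/ URIs. The collection names "collection-1" and "collection-2" and the variable $first-by-name are placeholders that do not come from the question, and cts:uris assumes the URI lexicon is enabled on the database.

    ```xquery
    xquery version "1.0-ml";

    (: Assumptions: "collection-1" and "collection-2" are placeholder
       collection names, and the URI lexicon is enabled so that
       cts:uris can be used. :)
    let $x := cts:uris((), (), cts:collection-query("collection-1"))
    let $y := cts:uris((), (), cts:collection-query("collection-2"))

    (: Keep only the first URI seen for each file name. :)
    let $first-by-name := map:map()
    let $_ := ($x, $y) ! (
      let $key := tokenize(., "/")[last()]
      return
        if (fn:exists(map:get($first-by-name, $key))) then ()
        else map:put($first-by-name, $key, .)
    )
    return
      for $key in map:keys($first-by-name)
      let $uri := map:get($first-by-name, $key)
      order by
        (: /data/ URIs first, then everything else. :)
        (if (fn:starts-with($uri, "/data/")) then 0 else 1),
        (: Within each group, sort by the numeric file name. :)
        number(replace($uri, ".*/(\d+)\.xml", "$1"))
      return $uri
    ```

    The numeric sort assumes file names of the form N.xml, as in the example; for other names number() returns NaN, so a plain string sort on $uri may be a better fit.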