Azure Search, mapping, merge collections


I have the following data, produced by this query:

SELECT c.addresses[0] address, [ c.name ] filenames FROM c

[
  {
    "address": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "filenames": [
      "File 01.docx"
    ]
  },
  {
    "address": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "filenames": [
      "File 02.docx"
    ]
  },
  {
    "address": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "filenames": [
      "File 03.docx"
    ]
  }, ....

The address field is the key. I have an index with a field defined as follows:

new Field()
{
    Name = "filenames",
    Type = DataType.Collection(DataType.String),
    IsSearchable = true,
    IsFilterable = true,
    IsSortable = false,
    IsFacetable = false
},

As you can see, I create an array for the filenames with [ c.name ] filenames.

When I index the data shown above, the filenames collection in the index ends up with a single entry: the one from the last document indexed. Can I make indexing add to the collection (merge) rather than replace it?

I am also looking at solving this in the query, but Cosmos DB does not support subselects (yet), and a UDF can only see the data that is passed into it.


Solution

  • Fundamentally, the way you have structured your Cosmos DB collection makes this scenario unworkable, because Azure Search does not support merging into a collection: each document indexed under the same key replaces the collection's previous value.

    Consider changing your design so that address is a key (that is, unique) in the Cosmos DB collection, and all filenames are gathered in a single document per address:

      {
        "address": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "filenames": [ "File 01.docx", "File 02.docx", "File 03.docx", ... ]
      }
    

    Also, please add a suggestion on the Azure Search UserVoice site to add support for merging collections, which would make your scenario easier to achieve.
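
If restructuring the Cosmos DB collection itself is not practical, the same "one document per address" shape can be built client-side before the documents are sent to the index. The following is only a minimal sketch of that grouping step, written in Python for brevity. The function name `merge_by_address` is hypothetical, the `rows` list simply mirrors the query output shown in the question, and the upload to Azure Search is deliberately left out.

```python
from collections import defaultdict

def merge_by_address(rows):
    """Collapse per-file rows into one document per address (hypothetical helper)."""
    merged = defaultdict(list)
    for row in rows:
        names = merged[row["address"]]
        for name in row["filenames"]:
            # Skip duplicates so re-running the merge does not repeat a filename.
            if name not in names:
                names.append(name)
    # One output document per distinct address, in first-seen order.
    return [{"address": address, "filenames": names}
            for address, names in merged.items()]

ADDRESS = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# Sample rows shaped like the query result in the question.
rows = [
    {"address": ADDRESS, "filenames": ["File 01.docx"]},
    {"address": ADDRESS, "filenames": ["File 02.docx"]},
    {"address": ADDRESS, "filenames": ["File 03.docx"]},
]

documents = merge_by_address(rows)
```

Each element of `documents` then has a unique address and carries the full filenames collection, so indexing it replaces the collection with the complete list rather than with a single entry.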