How to create a field mapping in Azure Search with a complex targetField


I use the Azure Search indexer to index documents from a Cosmos DB (MongoDB API) database whose objects contain fields named _id. Because Azure Search does not allow a field name in the index to begin with an underscore, I want to create a field mapping.

JSON structure in Cosmos --> structure in index

{
  "id": "test",
  "name": "test",
  "productLine": {
     "_id": "123",       --> "id": "123"
     "name": "test"
  }
}

The documentation covers exactly this scenario, but only for a top-level field:

"fieldMappings" : [ { "sourceFieldName" : "_id", "targetFieldName" : "id" } ]

I tried the following:

"fieldMappings" : [ { "sourceFieldName" : "productLine/_id", "targetFieldName" : "productLine/id" } ]

This results in the following error:

Value is not accepted. Valid values: "doc_id", "name", "productName".

What is the correct way to create a mapping for a target field that is a subfield?


Solution

  • Field mappings cannot target subfields directly. You can work around this by adding a skillset with a Shaper cognitive skill to the indexer, together with an output field mapping.

    You will also want to attach a Cognitive Services resource to the skillset. The Shaper skill itself is not billed, but attaching a Cognitive Services resource lets you process more than 20 documents per day.

    Shaper skill

    {
      "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
      "context": "/document",
      "inputs": [
        {
          "name": "id",
          "source": "/document/productLine/_id"
        },
        {
          "name": "name",
          "source": "/document/productLine/name"
        }
      ],
      "outputs": [
        {
          "name": "output",
          "targetName": "renamedProductLine"
        }
      ]
    }
    

    Indexer skillset and output field mapping

    "skillsetName": "<skillsetName>",
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/renamedProductLine",
            "targetFieldName": "productLine"
        }
    ]
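
To make the data flow concrete, here is a minimal Python sketch that mimics locally what the Shaper skill and the output field mapping above are expected to do to the sample document, and then wraps the skill in a skillset body. It does not call Azure. The helper names (`resolve`, `simulate`, `build_skillset`) are hypothetical, and the `cognitiveServices` block reflects my understanding of the skillset definition in the REST API, so check it against the current documentation before relying on it.

```python
import json

# The Shaper skill from the answer, as a Python dict.
SHAPER_SKILL = {
    "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
    "context": "/document",
    "inputs": [
        {"name": "id", "source": "/document/productLine/_id"},
        {"name": "name", "source": "/document/productLine/name"},
    ],
    "outputs": [{"name": "output", "targetName": "renamedProductLine"}],
}

# The output field mapping from the indexer definition in the answer.
OUTPUT_FIELD_MAPPINGS = [
    {"sourceFieldName": "/document/renamedProductLine", "targetFieldName": "productLine"}
]


def resolve(document, path):
    """Follow a '/document/a/b' style path through nested dicts."""
    parts = path.strip("/").split("/")
    if parts[0] != "document":
        raise ValueError(f"unsupported path: {path}")
    node = document
    for key in parts[1:]:
        node = node[key]
    return node


def simulate(document):
    """Local stand-in for the enrichment step (assumption: not the real service)."""
    # 1. Shaper: build one object from the named inputs.
    shaped = {i["name"]: resolve(document, i["source"]) for i in SHAPER_SKILL["inputs"]}
    enriched = {**document, SHAPER_SKILL["outputs"][0]["targetName"]: shaped}
    # 2. Output field mappings: copy enriched nodes into index fields.
    return {
        m["targetFieldName"]: resolve(enriched, m["sourceFieldName"])
        for m in OUTPUT_FIELD_MAPPINGS
    }


def build_skillset(name, cognitive_services_key):
    """Wrap the skill in a skillset body with a Cognitive Services key attached."""
    return {
        "name": name,
        "skills": [SHAPER_SKILL],
        "cognitiveServices": {
            "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
            "key": cognitive_services_key,
        },
    }


if __name__ == "__main__":
    source = {"id": "test", "name": "test", "productLine": {"_id": "123", "name": "test"}}
    print(json.dumps(simulate(source)))
    print(json.dumps(build_skillset("rename-productline", "<key>"), indent=2))
```

The simulation returns only the explicitly mapped `productLine` field. Top-level fields such as `id` and `name` are matched by name by the indexer and need no mapping. For the mapping to succeed, the index needs a `productLine` complex field with `id` and `name` subfields.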