How to index complex types into Edm.ComplexType with Azure Cognitive Search


I am indexing data produced by a custom skill into an Azure Search index. The skill produces complex data that I want to preserve in the index.

Source data comes from blob storage, and I am constrained to the REST API unless there is a very solid argument for using the .NET SDK.

Current code

The following is a brief rundown of what I currently have. I cannot change the index's field or the format of data produced by the endpoint used by the custom skill.

Complex data

The following is an example of complex data produced by the custom skill (in the correct value/recordId/etc. format):

{
  "field1": 0.135412,
  "field2": 0.123513,
  "field3": 0.243655
}
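
For context, the skill's full response wraps that data in the standard custom skill envelope. A rough sketch of what it looks like, where the recordId value "1" is a placeholder rather than real output:

```json
{
  "values": [
    {
      "recordId": "1",
      "data": {
        "field1": 0.135412,
        "field2": 0.123513,
        "field3": 0.243655
      },
      "errors": [],
      "warnings": []
    }
  ]
}
```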

Custom skill

Here is the custom skill which creates said data:

{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "uri": "https://myfunction.azurewebsites.com/api",
  "httpHeaders": {},
  "httpMethod": "POST",
  "timeout": "PT3M50S",
  "batchSize": 1,
  "degreeOfParallelism": 5,
  "name": "MySkill",
  "context": "/document/mycomplex",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "field1",
      "targetName": "field1"
    },
    {
      "name": "field2",
      "targetName": "field2"
    },
    {
      "name": "field3",
      "targetName": "field3"
    }
  ]
}

I have attempted several variations, notably using the ShaperSkill with each field as an input and the output "targetName" set to "mycomplex" (with the appropriate context).

Indexer

Here is the indexer's output field mapping for the skill:

{
  "sourceFieldName": "/document/mycomplex",
  "targetFieldName": "mycomplex"
}

I have tried several variations, such as "sourceFieldName": "/document/mycomplex/*".

Search index

And this is the targeted index field:

{
  "name": "mycomplex",
  "type": "Edm.ComplexType",
  "fields": [
    {
      "name": "field1",
      "type": "Edm.Double",
      "retrievable": true,
      "filterable": true,
      "sortable": true,
      "facetable": false,
      "searchable": false
    },
    {
      "name": "field2",
      "type": "Edm.Double",
      "retrievable": true,
      "filterable": true,
      "sortable": true,
      "facetable": false,
      "searchable": false
    },
    {
      "name": "field3",
      "type": "Edm.Double",
      "retrievable": true,
      "filterable": true,
      "sortable": true,
      "facetable": false,
      "searchable": false
    }
  ]
}

Result

My result is usually an error similar to: "Could not map output field 'mycomplex' to search index. Check your indexer's 'outputFieldMappings' property."


Solution

  • This may be a mistake with the context of your skill. Instead of setting the context to /document/mycomplex, try setting it to /document. You can then add a ShaperSkill, also with its context set to /document and its output "targetName" set to mycomplex, to generate the expected complex type shape.

    Example skills:

    "skills":
    [
    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "uri": "https://myfunction.azurewebsites.com/api",
      "httpHeaders": {},
      "httpMethod": "POST",
      "timeout": "PT3M50S",
      "batchSize": 1,
      "degreeOfParallelism": 5,
      "name": "MySkill",
      "context": "/document",
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "field1",
          "targetName": "field1"
        },
        {
          "name": "field2",
          "targetName": "field2"
        },
        {
          "name": "field3",
          "targetName": "field3"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
      "context": "/document",
      "inputs": [
        {
          "name": "field1",
          "source": "/document/field1"
        },
        {
          "name": "field2",
          "source": "/document/field2"
        },
        {
          "name": "field3",
          "source": "/document/field3"
        }
      ],
      "outputs": [
        {
          "name": "output",
          "targetName": "mycomplex"
        }
      ]
    }
    ]
    

    Please refer to the ShaperSkill documentation for specifics.
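
    With that skillset, the ShaperSkill output lands at /document/mycomplex, so the output field mapping from the question should presumably work unchanged. A sketch of the indexer fragment, assuming no other output field mappings are defined:

    ```json
    "outputFieldMappings": [
      {
        "sourceFieldName": "/document/mycomplex",
        "targetFieldName": "mycomplex"
      }
    ]
    ```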