azure-cognitive-search

Set Index Key to Output Field Mapping


My index has a field called id. During my enrichment pipeline I compute a value at /document/documentId, which I'm trying to map to the id field. However, the mapping does not seem to take effect: the id always ends up as a long value that looks like a hash. All my other output field mappings work as expected.

Portion of the Index:

{
    "name": "id",
    "type": "Edm.String",
    "facetable": false,
    "filterable": true,
    "key": true,
    "retrievable": true,
    "searchable": true,
    "sortable": true,
    "analyzer": null,
    "indexAnalyzer": null,
    "searchAnalyzer": null,
    "synonymMaps": [],
    "fields": []
}

Portion of the Indexer:

"outputFieldMappings": [
    {
        "sourceFieldName": "/document/documentId",
        "targetFieldName": "id"
    }
]

Expected Value: 4b160942-050f-42b3-bbbb-f4531eb4ad7c

Actual Value: aHR0cHM6Ly9zdGRvY3VtZW50c2Rldi5ibG9iLmNvcmUud2luZG93cy5uZXQvMDNiZTBmMzEtNGMyZC00NDRjLTkzOTQtODJkZDY2YTc4MjNmL29yaWdpbmFscy80YjE2MDk0Mi0wNTBmLTQyYjMtYmJiYi1mNDUzMWViNGFkN2MucGRm0

Any thoughts on how to fix this would be much appreciated!


Solution

  • TL;DR: You can't use output field mappings for the document key. The key can only be mapped from a source field, via fieldMappings.

    According to Microsoft, it's not possible to set the document key using an output field mapping. Apparently this causes problems when documents are deleted, so the key has to be available directly from the source document rather than from the enrichment output.
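As a side note, the "hash" in the question is not really a hash. It looks like a URL-safe base64 encoding of the blob's storage path, which is presumably what the indexer falls back to when the key is not mapped from a source field. The sketch below decodes it under one assumption that is mine, not from the answer: the trailing digit is the number of stripped `=` padding characters. The function name `decode_indexer_key` is made up for illustration.

```python
import base64


def decode_indexer_key(key: str) -> str:
    """Decode a URL-token style base64 key back to the original string.

    Assumption: the last character is the count of '=' padding characters
    that were stripped, and everything before it is URL-safe base64.
    """
    padding_count = int(key[-1])
    body = key[:-1]
    return base64.urlsafe_b64decode(body + "=" * padding_count).decode("utf-8")


# The "Actual Value" from the question, split only for readability.
actual_id = (
    "aHR0cHM6Ly9zdGRvY3VtZW50c2Rldi5ibG9iLmNvcmUud2luZG93cy5uZXQv"
    "MDNiZTBmMzEtNGMyZC00NDRjLTkzOTQtODJkZDY2YTc4MjNm"
    "L29yaWdpbmFscy80YjE2MDk0Mi0wNTBmLTQyYjMtYmJiYi1mNDUzMWViNGFkN2MucGRm0"
)

# Prints the blob URL, whose last path segment is the expected GUID plus ".pdf".
print(decode_indexer_key(actual_id))
```

So the expected GUID is already inside the generated key, as the blob's file name. That is why mapping from metadata_storage_name works.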

    I ended up using a mapping function in the fieldMappings.

    "fieldMappings": [
        {
            "sourceFieldName": "metadata_storage_name",
            "targetFieldName": "filename"
        },
        {
            "sourceFieldName": "metadata_storage_name",
            "targetFieldName": "id",
            "mappingFunction": {
                "name": "extractTokenAtPosition",
                "parameters": {
                    "delimiter": ".",
                    "position": 0
                }
            }
        }
    ]
    

    Since my file name is something like 4b160942-050f-42b3-bbbb-f4531eb4ad7c.pdf, this ends up mapping correctly to my id field.
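To make the mapping function's effect concrete, here is a rough Python equivalent of what extractTokenAtPosition does with these parameters: split the value on the delimiter and take the token at the given zero-based position. It is only an illustration of the behaviour, not the service's implementation, and the Python function name is mine.

```python
def extract_token_at_position(value: str, delimiter: str, position: int) -> str:
    """Split value on delimiter and return the token at the zero-based position."""
    tokens = value.split(delimiter)
    return tokens[position]


# File name as in the answer: "<guid>.pdf"
file_name = "4b160942-050f-42b3-bbbb-f4531eb4ad7c.pdf"
document_id = extract_token_at_position(file_name, ".", 0)
print(document_id)  # the GUID without the ".pdf" extension
```

One caveat with this approach: position 0 keeps only the text before the first dot, so a file name such as report.v2.pdf would produce the key "report". It works here because the file names are GUIDs with a single extension.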