In my index, I have a field called id. During my enrichment pipeline I compute a value called /document/documentId, which I'm attempting to map to the id field. However, this mapping does not seem to work: the id always ends up as some long value that looks like a hash. All my other output field mappings work as expected.
Portion of the Index:
{
    'name': 'id',
    'type': 'Edm.String',
    'facetable': false,
    'filterable': true,
    'key': true,
    'retrievable': true,
    'searchable': true,
    'sortable': true,
    'analyzer': null,
    'indexAnalyzer': null,
    'searchAnalyzer': null,
    'synonymMaps': [],
    'fields': []
}
Portion of the Indexer:
'outputFieldMappings': [
    {
        'sourceFieldName': '/document/documentId',
        'targetFieldName': 'id'
    }
]
Expected Value: 4b160942-050f-42b3-bbbb-f4531eb4ad7c
Actual Value: aHR0cHM6Ly9zdGRvY3VtZW50c2Rldi5ibG9iLmNvcmUud2luZG93cy5uZXQvMDNiZTBmMzEtNGMyZC00NDRjLTkzOTQtODJkZDY2YTc4MjNmL29yaWdpbmFscy80YjE2MDk0Mi0wNTBmLTQyYjMtYmJiYi1mNDUzMWViNGFkN2MucGRm0
Any thoughts on how to fix this would be much appreciated!
TL;DR - You can't use output field mappings for the key field. The key can only be mapped from a source field.
According to Microsoft, it's not possible to set the document key using an output field mapping. Apparently there is an issue when deleting documents, so the key has to be available straight from the source document.
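As an aside, the "hash" in the question is not really a hash. It looks like a URL-safe Base64 encoding of the blob's storage path, which appears to be what the indexer falls back to when the key is not mapped from a source field. Here is a minimal Python sketch to check that, assuming the trailing digit is the number of stripped "=" padding characters (the convention used by .NET's UrlTokenEncode). The helper name decode_key is only illustrative.

```python
import base64


def decode_key(encoded_key: str) -> str:
    """Decode a key assumed to be URL-safe Base64 plus a trailing padding-count digit."""
    # Assumption: the last character is a digit giving the number of '=' removed.
    payload, pad_count = encoded_key[:-1], int(encoded_key[-1])
    padded = payload + "=" * pad_count
    return base64.urlsafe_b64decode(padded).decode("utf-8")


# The "Actual Value" from the question.
actual = (
    "aHR0cHM6Ly9zdGRvY3VtZW50c2Rldi5ibG9iLmNvcmUud2luZG93cy5uZXQv"
    "MDNiZTBmMzEtNGMyZC00NDRjLTkzOTQtODJkZDY2YTc4MjNmL29yaWdpbmFs"
    "cy80YjE2MDk0Mi0wNTBmLTQyYjMtYmJiYi1mNDUzMWViNGFkN2MucGRm0"
)

print(decode_key(actual))
```

The decoded value is a blob URL whose last path segment is 4b160942-050f-42b3-bbbb-f4531eb4ad7c.pdf, so the expected id is sitting inside the file name.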
I ended up using a mapping function in the fieldMappings:
"fieldMappings": [
    {
        "sourceFieldName": "metadata_storage_name",
        "targetFieldName": "filename"
    },
    {
        "sourceFieldName": "metadata_storage_name",
        "targetFieldName": "id",
        "mappingFunction": {
            "name": "extractTokenAtPosition",
            "parameters": {
                "delimiter": ".",
                "position": 0
            }
        }
    }
]
Since my file names look like 4b160942-050f-42b3-bbbb-f4531eb4ad7c.pdf, this ends up mapping correctly to my id.
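To make the effect of that mapping function concrete, here is a rough Python stand-in for what extractTokenAtPosition does with the parameters above: split on the delimiter and take the token at the given zero-based position. This is only a sketch of the behavior as I understand it, not the service's actual implementation, and the function name is my own.

```python
def extract_token_at_position(value: str, delimiter: str, position: int) -> str:
    """Illustrative stand-in: split on the delimiter and return the token at position."""
    tokens = value.split(delimiter)
    return tokens[position]


file_name = "4b160942-050f-42b3-bbbb-f4531eb4ad7c.pdf"
doc_id = extract_token_at_position(file_name, ".", 0)
print(doc_id)  # 4b160942-050f-42b3-bbbb-f4531eb4ad7c
```

Note that this only works because the file name has a single dot before the extension. A name like report.v2.pdf would yield report, so the keys would need to be unique up to the first dot.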