
Is it possible to use metadata as input for a custom skill in Azure Search?


I started working with Azure Search a couple of months ago, but I have an issue with the metadata of blob files.

I need the metadata of a file (coming from Azure Blob storage) so that I can use it in my custom skill. More specifically, I need the URL where the blob file is stored.

To do this I need it in my skillset, and I would do something like in this image. But that is not possible, because the source has to start with /document. And if I use "/document/metadata_storage_path/" as the "Source", I get a null value in the end.

Is there a way to get the metadata of a file as an input, so that I can use it further on?

Thanks in advance!


Solution

  • I discovered why it (and the suggested solutions) was not working for me. I hope this helps others who encounter the same issue.

    The syntax mentioned above by Sophiac was correct. So in my case I used "metadata_storage_path" as the input in the skillset:

    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "description": "Our new substring custom skill",
      "uri": "https://customskillsubstring.azurewebsites.net/api/Translate?code=OkzL7G3wX----jCqQylUyJJPaggSaFQCaQ==",
      "batchSize": 1,
      "context": "/document",
      "inputs": [
        {
          "name": "text",
          "source": "/document/metadata_storage_path"
        }
      ],
      "outputs": [
        {
          "name": "text",
          "targetName": "metadata_storage_path_wathever"
        }
      ]
    }
    

    The issue was in the indexer. In its field mappings I had mapped "metadata_storage_path" to another field (in my case "blob_uri"). The problem is that this does not really behave like a mapping but more like a replacement, so "metadata_storage_path" was empty in the skillset because it had already been replaced.

    If I use "blob_uri" instead, it works. The solution is that you can map the same source field to more than one target field in the indexer:

    "fieldMappings": [
      {
        "sourceFieldName": "metadata_storage_name",
        "targetFieldName": "id",
        "mappingFunction": { "name": "base64Encode" }
      },
      {
        "sourceFieldName": "content",
        "targetFieldName": "content"
      },
      {
        "sourceFieldName": "metadata_storage_path",
        "targetFieldName": "blob_uri"
      },
      {
        "sourceFieldName": "metadata_storage_path",
        "targetFieldName": "metadata_storage_path"
      }
    ],
    

    Now I can use both "blob_uri" and "metadata_storage_path" as inputs for my custom skill.
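
For anyone debugging the same null value, it may help to see what the custom skill receives. The sketch below is not the code behind the endpoint in the skill definition above. It is a minimal, hypothetical handler (the name `handle_skill_request` and the example blob URL are my own assumptions) that follows the request and response shape a custom Web API skill exchanges with the indexer: a `values` array in which each record carries a `recordId` and a `data` object keyed by the input names from the skillset, here `text`. It echoes the path back and adds a warning when the input arrives as null, which is what a replaced or wrongly mapped source looks like from inside the skill.

```python
import json


def handle_skill_request(body: dict) -> dict:
    """Build a custom skill response that echoes the 'text' input of each record."""
    results = []
    for record in body.get("values", []):
        # "text" matches the input name declared in the skill definition.
        blob_path = (record.get("data") or {}).get("text")
        result = {
            "recordId": record["recordId"],  # echoed back so results can be matched
            "data": {},
            "errors": [],
            "warnings": [],
        }
        if blob_path is None:
            # A null here usually means the "source" path did not resolve.
            result["warnings"].append({"message": "Input 'text' is null or missing."})
        else:
            result["data"]["text"] = blob_path
        results.append(result)
    return {"values": results}


if __name__ == "__main__":
    # Hypothetical request body; the URL is a placeholder, not from the original post.
    request_body = {
        "values": [
            {"recordId": "0", "data": {"text": "https://example.blob.core.windows.net/docs/a.pdf"}},
            {"recordId": "1", "data": {"text": None}},
        ]
    }
    print(json.dumps(handle_skill_request(request_body), indent=2))
```

Logging the incoming body like this is a quick way to check whether "/document/metadata_storage_path" or "/document/blob_uri" is actually populated before looking for bugs in the skill itself.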