Azure Search: Is the blob metadata_storage_path, when used as a key, being base64 encoded twice?


If I upload some PDF documents to a blob storage container and configure Azure Search to index them with metadata_storage_path as the key (the default), what comes out of the REST API appears to be base64 encoded twice.

For example, the path I get from the REST API for this file: https://videoblobstorage.blob.core.windows.net/yatesfiles/Books/ANGULAR_2_COOKBOOK.pdf

Is this: YQBIAFIAMABjAEgATQA2AEwAeQA5ADIAYQBXAFIAbABiADIASgBzAGIAMgBKAHoAZABHADkAeQBZAFcAZABsAEwAbQBKAHMAYgAyAEkAdQBZADIAOQB5AFoAUwA1ADMAYQBXADUAawBiADMAZAB6AEwAbQA1AGwAZABDADkANQBZAFgAUgBsAGMAMgBaAHAAYgBHAFYAegBMADAASgB2AGIAMgB0AHoATAAwAEYATwBSADEAVgBNAFEAVgBKAGYATQBsADkARABUADAAOQBMAFEAawA5AFAAUwB5ADUAdwBaAEcAWQAxAA2

If I base64 decode it, I get yet another base64 string, interleaved with \0 characters that I have to remove: aHR0cHM6Ly92aWRlb2Jsb2JzdG9yYWdlLmJsb2IuY29yZS53aW5kb3dzLm5ldC95YXRlc2ZpbGVzL0Jvb2tzL0FOR1VMQVJfMl9DT09LQk9PSy5wZGY1

Then if I base64 decode AGAIN, I get the path I expect: https://videoblobstorage.blob.core.windows.net/yatesfiles/Books/ANGULAR_2_COOKBOOK.pdf
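
For reference, the two decode steps described above can be sketched in Python. This is only a sketch under two assumptions that the post does not state outright: that the trailing digit on each string is a count of stripped "=" padding characters (the URL-token style of base64 used by .NET's HttpServerUtility.UrlTokenEncode), and that the \0 characters appear because the outer layer encoded the inner string as UTF-16LE text. The function names are made up for illustration.

```python
import base64


def url_token_decode(token: str) -> bytes:
    """Decode URL-safe base64 whose last character is a padding count.

    Assumption: the final digit says how many '=' characters were
    stripped from the end of the base64 text.
    """
    if not token:
        return b""
    pad_count = int(token[-1])          # trailing digit, e.g. "1" or "2"
    body = token[:-1] + "=" * pad_count  # restore the stripped padding
    return base64.urlsafe_b64decode(body)


def decode_double_encoded_key(key: str) -> str:
    """Undo both layers of encoding and return the original blob path."""
    # Outer layer: assumed to be UTF-16LE text, which explains the \0 bytes.
    inner = url_token_decode(key).decode("utf-16-le")
    # Inner layer: the path itself, assumed to be UTF-8.
    return url_token_decode(inner).decode("utf-8")
```

Passing the long key from the REST API to decode_double_encoded_key should then return the blob URL directly, with no manual removal of \0 characters.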

I've also tried changing the key to metadata_storage_name and it also gets base64 encoded twice. So, it seems to be associated with the key itself.

What's going on here? Is this a bug?


Solution

  • Yes, this is a bug in the UI and we have a fix that should be deployed no later than 2019 Nov 14:00 PDT.

    Unfortunately, you'll need to recreate your index and indexer if the double encoding is a problem. You can wait until the UI update is out and use it to recreate them, or you can use a tool such as Postman to recreate the indexer manually, using the REST documentation as a guide.
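
If you would rather script the manual recreation than click through Postman, the request might look roughly like the sketch below. It only builds the PUT request and does not send it. Everything passed in (service name, indexer name, data source name, index name, key field, API key) is a placeholder, the api-version value is an assumption to check against the REST documentation, and the single base64Encode field mapping is one plausible shape for the indexer, not a confirmed description of what the fixed UI generates.

```python
import json
import urllib.request


def build_indexer_request(service, indexer, data_source, index,
                          key_field, api_key, api_version="2019-05-06"):
    """Build (but do not send) a PUT request that creates or replaces an indexer.

    All argument values are supplied by the caller; nothing here is
    taken from a real service. The api_version default is an assumption.
    """
    url = (f"https://{service}.search.windows.net"
           f"/indexers/{indexer}?api-version={api_version}")
    body = {
        "name": indexer,
        "dataSourceName": data_source,
        "targetIndexName": index,
        # Exactly one base64Encode mapping, so the key is encoded once.
        "fieldMappings": [
            {
                "sourceFieldName": "metadata_storage_path",
                "targetFieldName": key_field,
                "mappingFunction": {"name": "base64Encode"},
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
```

Sending the request is then a matter of passing the returned object to urllib.request.urlopen once the placeholders have been replaced with real values.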