I am successfully using Azure AI Search pointing at a storage container inside an Azure Storage Account. Everything works as expected: data source, index, indexer and skillset.
The only issue I cannot solve (I have spent a lot of time searching for a solution and trying various fixes recommended by others, but nothing resolves it) concerns the Base64-encoded metadata_storage_path values in the search results.
My REST API search endpoint successfully returns results, and when I decode the Base64 strings manually using a Base64 decoding site, they are correctly converted to valid URLs that point to my files in Azure Storage. Here is one of the Base64 strings:
aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTAucG5n0
And here it is decoded manually:
https://rdmc01devazuresearchsa.blob.core.windows.net/rdmc01-dev-docs/10.png
Here are the full REST API search results:
{
  "@odata.context": "https://rdmc01-dev-azure-search-service.search.windows.net/indexes('azureblob-index')/$metadata#docs(*)",
  "@odata.count": 4,
  "value": [
    {
      "@search.score": 8.4224205,
      "language": "English",
      "organizations": [
        "Microsoft",
        "Open source",
        "FEDORA",
        "Centos",
        "Linux Foundation"
      ],
      "metadata_storage_path": "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTYuZG9jeA2",
      "metadata_storage_name": "16.docx"
    },
    {
      "@search.score": 6.806098,
      "language": "English",
      "organizations": [],
      "metadata_storage_path": "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTAucG5n0",
      "metadata_storage_name": "10.png"
    },
    {
      "@search.score": 6.806098,
      "language": "English",
      "organizations": [],
      "metadata_storage_path": "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvbW9sbGllLnBuZw2",
      "metadata_storage_name": "mollie.png"
    },
    {
      "@search.score": 6.7477694,
      "language": "English",
      "organizations": [],
      "metadata_storage_path": "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTQuanBn0",
      "metadata_storage_name": "14.jpg"
    }
  ]
}
However, when I decode them in C# (.NET), I get the following error:
FormatException: The input is not a valid Base-64 string as it contains a
non-base 64 character, more than two padding characters,
or an illegal character among the padding characters.
Any help would be great as I have run out of ideas.
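The error is due to padding. Convert.FromBase64String expects standard Base64, whose length must be a multiple of 4. The keys returned by Azure AI Search are not standard Base64: by default, the indexer's base64Encode mapping function produces a URL-safe token in the style of HttpServerUtility.UrlTokenEncode. In that format '+' and '/' are replaced by '-' and '_', the '=' padding is removed, and a single digit giving the number of removed padding characters is appended. So the trailing 0 or 2 in your strings is a padding count, not part of the data.
Use the sample code below:
    using System;
    using System.Text;

    public class Program
    {
        public static void Main()
        {
            string token = "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvbW9sbGllLnBuZw2";
            // The last character is the number of '=' characters that were stripped.
            int padCount = token[token.Length - 1] - '0';
            // Drop the digit, undo the URL-safe substitutions, and restore the padding.
            string base64String = token.Substring(0, token.Length - 1)
                .Replace('-', '+')
                .Replace('_', '/')
                + new string('=', padCount);
            Console.WriteLine(base64String);
            Console.WriteLine(Encoding.UTF8.GetString(Convert.FromBase64String(base64String)));
        }
    }
In this code, I remove the trailing digit and add back that many padding characters.
Output:
    aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvbW9sbGllLnBuZw==
    https://rdmc01devazuresearchsa.blob.core.windows.net/rdmc01-dev-docs/mollie.png
It works for all the file paths provided, including 10.png and 14.jpg. Those two are not corrupted: their trailing 0 means that no padding was removed, which is why simply removing the last character is enough to decode them.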
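To decode every metadata_storage_path in a result set, the logic can be wrapped in a small helper. This is a minimal sketch, not an official API: the names SearchKeyDecoder and DecodeUrlToken are hypothetical. It assumes the keys were produced by the indexer's default base64Encode mapping, where the last character is a digit (0, 1 or 2) recording how many '=' padding characters were stripped, and '-' and '_' stand in for '+' and '/'.

```csharp
using System;
using System.Text;

public static class SearchKeyDecoder
{
    // Hypothetical helper: decodes a UrlTokenEncode-style key back to the original string.
    public static string DecodeUrlToken(string token)
    {
        if (string.IsNullOrEmpty(token))
            throw new ArgumentException("Token must not be empty.", nameof(token));

        // The final character is the count of '=' padding characters that were removed.
        int padCount = token[token.Length - 1] - '0';
        if (padCount < 0 || padCount > 2)
            throw new FormatException("Token does not end with a padding count of 0, 1 or 2.");

        string base64 = token.Substring(0, token.Length - 1)
            .Replace('-', '+')   // undo the URL-safe substitutions
            .Replace('_', '/')
            + new string('=', padCount);

        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}

public class Program
{
    public static void Main()
    {
        // The four metadata_storage_path values from the search results in the question.
        string[] keys =
        {
            "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTYuZG9jeA2",
            "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTAucG5n0",
            "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvbW9sbGllLnBuZw2",
            "aHR0cHM6Ly9yZG1jMDFkZXZhenVyZXNlYXJjaHNhLmJsb2IuY29yZS53aW5kb3dzLm5ldC9yZG1jMDEtZGV2LWRvY3MvMTQuanBn0"
        };

        foreach (string key in keys)
            Console.WriteLine(SearchKeyDecoder.DecodeUrlToken(key));
    }
}
```

The helper validates the trailing digit so that a key in some other format fails with a clear message instead of decoding to garbage. On .NET Framework, HttpServerUtility.UrlTokenDecode in System.Web decodes this token format directly and returns the bytes, so a hand-written helper like this is mainly useful where System.Web is not available.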