Tags: large-language-model, huggingface, semantic-kernel

Making an inference call to HuggingFace in Semantic Kernel causes a 404 Not Found error


I can make serverless Inference API calls to models hosted on HuggingFace using plain HTTP requests in Python.

I want to achieve the same task using Semantic Kernel in C#.

For this purpose, I reference the Microsoft.SemanticKernel.Connectors.HuggingFace package and write the following code:

using Microsoft.SemanticKernel;

IKernelBuilder builder = Kernel.CreateBuilder();

// Register the HuggingFace text-generation service against the serverless Inference API URL
builder.Services.AddHuggingFaceTextGeneration(
        model: "meta-llama/Meta-Llama-3-70B-Instruct",
        apiKey: "<<my huggingface API key goes here>>",
        endpoint: new Uri("https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-70B-Instruct")
);

Kernel kernel = builder.Build();
Task<string?> result = kernel.InvokePromptAsync<string>("What is the capital of Turkey");
Console.WriteLine(result.Result);

However, I receive the following error.

HttpRequestException: Response status code does not indicate success: 404 (Not Found).

Can someone help me solve this issue?


Solution

  • When using the HuggingFace public (serverless) Inference API you don't need to specify the endpoint; it should work with the code below.

    The endpoint is only needed if you deploy the model yourself with HuggingFace's TGI (Text Generation Inference) server; a complete end-to-end sketch follows the snippet below.

    IKernelBuilder builder = Kernel.CreateBuilder();

    builder.Services.AddHuggingFaceTextGeneration(
            model: "meta-llama/Meta-Llama-3-70B-Instruct",
            apiKey: "<<my huggingface API key goes here>>"
    );
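
    For completeness, here is a minimal end-to-end sketch of the corrected setup. It assumes the same gated model (your HuggingFace token must have been granted access to it) and a recent Semantic Kernel release, where the HuggingFace connector is marked experimental and may require suppressing the SKEXP0070 diagnostic.

    #pragma warning disable SKEXP0070 // HuggingFace connector is experimental in current SK releases
    using Microsoft.SemanticKernel;

    IKernelBuilder builder = Kernel.CreateBuilder();

    // No endpoint argument: the connector targets the public serverless Inference API
    builder.Services.AddHuggingFaceTextGeneration(
            model: "meta-llama/Meta-Llama-3-70B-Instruct",
            apiKey: "<<my huggingface API key goes here>>"
    );

    Kernel kernel = builder.Build();

    // Await the prompt instead of blocking on .Result
    string? answer = await kernel.InvokePromptAsync<string>("What is the capital of Turkey?");
    Console.WriteLine(answer);

    If you later self-host the model with TGI, pass your deployment URL through the endpoint parameter instead of relying on the public API.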