c#, large-language-model, semantic-kernel, lm-studio

Running LLMs locally causes this error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing


I have the following code that sends a prompt request to a local LLM, Phi-3. Although LM Studio shows the correct response, in Visual Studio I receive a timeout error. Any help?

var phi3 = new CustomChatCompletionService();
phi3.ModelUrl = "http://localhost:1234/v1/chat/completions";

// semantic kernel builder
var builder = Kernel.CreateBuilder();
builder.Services.AddKeyedSingleton<IChatCompletionService>("microsoft/Phi-3-mini-4k-instruct-gguf", phi3);
var kernel = builder.Build();

// init chat
var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddSystemMessage("You are a useful assistant that replies using a funny style and emojis. Your name is Goku.");
history.AddUserMessage("hi, who are you?");

// print response
var result = await chat.GetChatMessageContentsAsync(history);
Console.WriteLine(result[^1].Content);



Solution

  • You don't necessarily need a custom chat completion service to consume your local Phi-3 model.

    You can use the OpenAI connector with its experimental endpoint override parameter, pointing it at LM Studio's HTTP API:

    var kernel = Kernel.CreateBuilder()
        .AddOpenAIChatCompletion(
            modelId: "phi3",
            endpoint: new Uri("http://localhost:1234"),
            apiKey: null)
        .Build();