I have the following code to send a prompt to a local LLM, Phi-3, served by LM Studio. Although LM Studio shows the correct response (see the attached image), in Visual Studio I receive a timeout error. Any help?
var phi3 = new CustomChatCompletionService();
phi3.ModelUrl = "http://localhost:1234/v1/chat/completions";
// semantic kernel builder
var builder = Kernel.CreateBuilder();
builder.Services.AddKeyedSingleton<IChatCompletionService>("microsoft/Phi-3-mini-4k-instruct-gguf", phi3);
var kernel = builder.Build();
// init chat
var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddSystemMessage("You are a useful assistant that replies using a funny style and emojis. Your name is Goku.");
history.AddUserMessage("hi, who are you?");
// send the chat history and print the last response
var result = await chat.GetChatMessageContentsAsync(history);
Console.WriteLine(result[^1].Content);
You don't necessarily need a custom chat completion service to consume your local Phi-3 model. You can use the OpenAI connector with the experimental endpoint override parameter pointed at the LM Studio HTTP API:
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "phi3",
        endpoint: new Uri("http://localhost:1234"),
        apiKey: null) // LM Studio does not require an API key
    .Build();