Tags: azure, network-programming, artificial-intelligence, ollama

Trying to get an API response from Ollama set up on an Azure virtual machine (Ubuntu)


I installed and configured Ollama on my Azure virtual machine running Ubuntu, and I am trying to make API calls to it from another machine. Essentially, I am trying to set up my own Ollama server, but I am running into an API connection issue.

I first tried calling the API on localhost:

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'

This was successful.

I then added an inbound rule for port 11434 to my VM, but when I tried the same API call using the VM's public IP it failed with "connection refused".

Should I be using a password or some other form of authentication? What am I missing?

curl http://<public ip>:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'
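For reference, a quick way to check on the VM which address the port is actually bound to is `ss` from iproute2 (assumed available, as it is on most Ubuntu images). A loopback-only binding would explain "connection refused" from outside while localhost works:

```shell
# Show which address the Ollama port is bound to.
# 127.0.0.1:11434 means loopback only (unreachable from other machines);
# 0.0.0.0:11434 means all interfaces.
ss -tlnp 2>/dev/null | grep 11434 || echo "nothing listening on 11434"
```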

Solution

  • Going through the docs, the Ollama server first needs to be started with the host/port it should listen on and the origins that are allowed to communicate with it.

    Run

    export OLLAMA_HOST="0.0.0.0:8888"
    export OLLAMA_ORIGINS="*"
    

    "*" allows all origins; if you want to allow only a particular origin, use http:// or https:// followed by the IP you want to allow.

    Then start the Ollama server:

    ollama serve
    
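    Note (not in the original answer): if Ollama was installed with the official install script on Ubuntu, it typically runs as a systemd service, and variables exported in your shell will not reach it. A sketch of setting the same variables via a systemd override instead, assuming the service is named "ollama":

```shell
# Put the environment variables in a systemd override file, then restart.
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:8888"
Environment="OLLAMA_ORIGINS=*"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
```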

    Then call the API, for example:

    curl http://<pub-ip>:8888/api/pull -d '{
      "name": "llama2"
    }'
    

    Pulling a model only needs to be done once (you can pull whichever model you want).
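    To confirm which models are present on the server, the tags endpoint can be queried (same host/port as above; the URL here is a placeholder you would replace with your VM's address, and the fallback echo just keeps the command from aborting when the server is unreachable):

```shell
# Set this to your server, e.g. http://<pub-ip>:8888 (placeholder from above).
OLLAMA_URL="http://localhost:8888"
# List the models currently available on the Ollama server.
curl -s "$OLLAMA_URL/api/tags" || echo "server unreachable"
```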

    and

    curl http://<pub-ip>:8888/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?"
    }'
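One extra note beyond the original answer: by default /api/generate streams its answer as one JSON object per line. Setting the documented "stream" request field to false returns the whole response as a single JSON object, which is often easier to handle with curl (localhost and port 8888 are assumed here; the fallback echo keeps the command from failing when no server is running):

```shell
# Ask for a single, non-streamed JSON response instead of line-by-line chunks.
curl -s http://localhost:8888/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}' || echo "server unreachable"
```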