python artificial-intelligence llama llamacpp

How to run llama.cpp with cuBLAS on Windows?


I have been using llama.cpp to run a model on my Mac (CPU only), and now I want to switch to Windows and run it on a GPU. However, after a cuBLAS build I cannot execute ./main or ./server at all. Any idea what might be wrong, or what can be done? Here is what I get when I try to run it after building with cuBLAS:

./main.exe -m ./models/7B/llama-2-7b-chat.Q4_K_M.gguf -n 128 -ngl 40

./main.exe : The term './main.exe' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was 
included, verify that the path is correct and try again.
At line:1 char:1
+ ./main.exe -m ./models/7B/llama-2-7b-chat.Q4_K_M.gguf -n 128 -ngl 40
+ ~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (./main.exe:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

Solution

  • I found the solution to this question. The way I had built it was wrong: I used the method intended for Mac, and Windows needs a completely different build procedure. The "not recognized" error simply meant that main.exe had never been produced. The llama.cpp GitHub repo has steps for building on Windows.

    The cuBLAS build also has to follow the Windows procedure, not the one you would use on Mac. Once you build it correctly, you will see the main.exe and server.exe files in the main directory.
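
    To check whether a build actually produced the executables, and where they ended up, a small Python sketch like the one below may help. It is only an illustration: `find_binaries` is a hypothetical helper, not part of llama.cpp, and it simply searches the repository folder recursively. It is written under the assumption that some Windows builds (for example a CMake build with a Visual Studio generator) place the binaries in a subfolder such as build\bin\Release rather than the main directory, so check the repo's current build instructions for the exact layout and option names.

```python
from pathlib import Path


def find_binaries(repo_root, names=("main.exe", "server.exe")):
    """Search repo_root recursively and return {name: [matching paths]}.

    Hypothetical helper for illustration only; not part of llama.cpp.
    An empty list for a name means the build did not produce that file.
    """
    root = Path(repo_root)
    # rglob walks every subdirectory, so binaries placed in a nested
    # build folder are found as well as ones in the main directory.
    return {name: sorted(root.rglob(name)) for name in names}
```

    Calling `find_binaries(".")` from the llama.cpp folder returns an empty list for a name when that executable was never built, which is the situation behind the CommandNotFoundException above. If a path is returned, run the executable from that location in PowerShell instead of ./main.exe.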