Docker image tagged :ollama does not use GPU/CUDA #3325

Comments
Do you have the NVIDIA Container Toolkit installed? https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

Yes, I followed those instructions. I also followed https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/sample-workload.html to test my setup, so I ran this command:

Ahh, try updating your NVIDIA driver and CUDA runtime on the host. Ollama needs CUDA 12.x and a 5xx series driver, afaik. Which GPU is it?

NVIDIA GeForce RTX 3060 Laptop GPU

We've seen some driver issues with 55x; try a 54x driver and CUDA 12.4. I think this should resolve it.

You were right, it was an NVIDIA driver issue! Thanks 👍!
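The driver-branch check suggested above can be done from a shell. A minimal sketch: the `nvidia-smi` query flags are real, but the `driver_branch` helper is our own illustration, not part of any NVIDIA tool:

```shell
# Print the installed driver version for each GPU; plain `nvidia-smi`
# also shows the highest supported CUDA version in its banner line.
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Hypothetical helper: extract the driver branch (e.g. 545.29.06 -> 545)
# to tell a 54x driver from a problematic 55x one.
driver_branch() { printf '%s\n' "${1%%.*}"; }
driver_branch "545.29.06"   # -> 545
```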
Bug Report
Description
Bug Summary:
Documentation says the following command will run a Docker container bundled with `ollama` and use CUDA, but this is not the case: despite `nvidia-smi` being available inside the container (I followed the NVIDIA instructions), my GPU is not used and the model runs on my CPU.

Steps to Reproduce:
Follow your documentation here and use the "With GPU Support" command line.

Expected Behavior:

When I use the chat in the Open WebUI interface, I expect my GPU to be used while the model is processing.

Actual Behavior:

My CPUs are fully used for some time, but my GPU usage stays near 0%.
Environment
Open WebUI Version: v0.3.5
Ollama (if applicable): bundled version inside open webui provided container
Operating System: Linux Mint 20.1
Browser (if applicable): Chrome 125
Reproduction Details
Confirmation:
Logs and Screenshots
Browser Console Logs:
[Include relevant browser console logs, if applicable]
Docker Container Logs:
Screenshots (if applicable):
[Attach any relevant screenshots to help illustrate the issue]
Installation Method
Docker image bundled with Ollama:
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
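Before debugging Ollama itself, it can help to confirm the container sees the GPU at all. A minimal sketch, assuming the container name `open-webui` from the command above:

```shell
# If the NVIDIA Container Toolkit is wired up correctly, this should
# print the same GPU table as on the host (the RTX 3060 here); if it
# fails, the problem is in the Docker/driver layer, not in Open WebUI.
docker exec open-webui nvidia-smi
```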
Additional Information
When running the `env` command inside the container, I see some variables having unexpected values (imo). I tried updating those values by setting my CUDA version and setting `USE_CUDA_DOCKER` to `true`, but doing so makes the `start.sh` script fail. I suspect your Docker image tagged `:ollama` cannot use CUDA/GPU 🤔

How can I use the Ollama bundled version AND CUDA?