Need help #1385
Comments
Do you have a dedicated GPU? If you do, it will be much faster. Otherwise, try a smaller model, like this: https://huggingface.co/bartowski/L3-8B-Stheno-v3.2-GGUF/resolve/main/L3-8B-Stheno-v3.2-Q4_K_S.gguf?download=true
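For reference, a minimal sketch of fetching that quantized model with the huggingface_hub library; the repo ID and filename come from the link above, everything else (local caching behavior, printing the path) is just illustrative:

```python
# Sketch: download the suggested smaller GGUF with huggingface_hub.
# Repo ID and filename are taken from the link above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="bartowski/L3-8B-Stheno-v3.2-GGUF",
    filename="L3-8B-Stheno-v3.2-Q4_K_S.gguf",
)
print(model_path)  # local path to the downloaded .gguf file
```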
You might be able to run a 3B model at most. https://huggingface.co/models?search=3b%20gguf Try Qwen 2.5 3B or Llama 3.2 3B?
OK, you should definitely select Use CuBLAS; that card should support it. That should provide much faster speeds compared to CPU-only inference. Try running a 7B model partially offloaded (see the sketch below).
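A minimal sketch of what that launch could look like from the command line, assuming KoboldCpp's `--usecublas` and `--gpulayers` flags; the model filename and the layer count are placeholders to adjust for your available VRAM:

```python
# Sketch: launch KoboldCpp with CuBLAS and partial GPU offload.
# The model path and layer count (20) are placeholders, not recommendations.
import subprocess

subprocess.run([
    "python", "koboldcpp.py",
    "--model", "L3-8B-Stheno-v3.2-Q4_K_S.gguf",
    "--usecublas",            # enable CUDA/CuBLAS acceleration
    "--gpulayers", "20",      # offload this many layers to the GPU
])
```

Offloading more layers generally speeds up generation, as long as they still fit in the GPU's VRAM.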
Dell Precision 5540, 32 GB RAM.
Running an AI model is very laggy.
The model I use:
https://huggingface.co/allenai/OLMo-2-1124-13B-GGUF/blob/main/OLMo-2-1124-13B-Q5_K_M.gguf