
Need help #1385

Open
RottenLeader opened this issue Feb 22, 2025 · 4 comments

Comments

@RottenLeader

Dell Precision 5540, 32 GB RAM

Running an AI model is very laggy

Image

and this is the model I'm using:

https://huggingface.co/allenai/OLMo-2-1124-13B-GGUF/blob/main/OLMo-2-1124-13B-Q5_K_M.gguf
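Some back-of-envelope arithmetic (an estimate, not a measurement) shows why this particular file is heavy for CPU-only inference: Q5_K_M quantization averages roughly 5.5 bits per weight, so a 13B-parameter model's weights alone come to about 9 GB, and on CPU most of those bytes must stream through memory for every generated token.

```python
# Rough RAM estimate for a quantized GGUF model.
# The ~5.5 bits/weight figure for Q5_K_M is an approximation.

def model_weight_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in gigabytes."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

olmo_13b_q5 = model_weight_gb(13, 5.5)
print(f"OLMo-2 13B Q5_K_M weights: ~{olmo_13b_q5:.1f} GB")  # ~8.9 GB
```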

@LostRuins
Owner

Do you have a dedicated GPU? If you do, it will be much faster.

Otherwise, try a smaller model, like this: https://huggingface.co/bartowski/L3-8B-Stheno-v3.2-GGUF/resolve/main/L3-8B-Stheno-v3.2-Q4_K_S.gguf?download=true
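A minimal sketch of why a smaller model helps on CPU: token generation is roughly memory-bandwidth bound, so throughput scales inversely with model size. The bandwidth figure below is an assumed typical laptop value, not a measurement of this machine.

```python
# Crude upper bound on CPU generation speed: each generated token
# reads every weight once, so tokens/sec ~ bandwidth / model size.
# 40 GB/s is an assumed laptop DDR4 bandwidth, for illustration only.

def est_tokens_per_sec(model_gb: float, mem_bandwidth_gbps: float = 40.0) -> float:
    """Very rough ceiling on tokens per second for CPU inference."""
    return mem_bandwidth_gbps / model_gb

print(f"13B Q5_K_M (~8.9 GB): ~{est_tokens_per_sec(8.9):.1f} tok/s")
print(f"8B Q4_K_S  (~4.7 GB): ~{est_tokens_per_sec(4.7):.1f} tok/s")
```

The smaller quant roughly doubles the throughput ceiling, which is the point of the suggestion above.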

@RottenLeader
Author

These are my display adapters:

Image

@MoeMonsuta

> These are my display adapters:
>
> Image

You might be able to run a 3B model at most. https://huggingface.co/models?search=3b%20gguf Try Qwen 2.5 3B or Llama 3.2 3B?

@LostRuins
Owner

OK, you should definitely select Use CuBLAS; that card should support it. That should provide much faster speeds compared to CPU. Try running a 7B model partially offloaded.
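For picking how many layers to offload, a rough sketch like the following can help. The layer count, per-layer size, and VRAM budget below are illustrative assumptions for a 7B Q4-class model on a small laptop GPU, not values taken from this thread.

```python
# Sketch: estimate how many transformer layers fit in a VRAM budget
# for partial GPU offload. All sizes are illustrative assumptions
# (a ~4.1 GB 7B Q4 model with 32 layers); leave headroom for the
# KV cache and CUDA runtime overhead.

def layers_that_fit(vram_gb: float, n_layers: int = 32,
                    model_gb: float = 4.1, overhead_gb: float = 1.0) -> int:
    """How many layers of the model fit in usable VRAM."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(usable / per_layer_gb))

# e.g. an assumed 4 GB laptop GPU:
print(layers_that_fit(4.0))
```

The resulting number is what you would hand to the GPU-layers setting; if generation crashes with out-of-memory errors, lower it a few layers at a time.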


3 participants