Flash Attention is not installed? #595
Comments
Try using this image; since they have updated to CUDA 12.4, there is some driver issue.
When using mistralai/Mistral-7B-Instruct-v0.1, I am getting the error below even after changing the Docker image to ghcr.io/predibase/lorax:07addea: ImportError: Mistral model requires flash attn v2
Any solutions?
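A minimal way to confirm whether flash-attn v2 is actually importable inside the container is an import check like the sketch below (a generic check, not LoRAX's own detection logic; running it through the image's Python, e.g. `docker run --gpus all --entrypoint python3 <image> ...`, assumes `python3` is on the image's PATH):

```python
# Check whether flash-attn v2 is importable (generic sketch, not LoRAX's
# own detection logic).
try:
    import flash_attn
    from flash_attn import flash_attn_func  # flash-attn v2 API entry point
    major = int(flash_attn.__version__.split(".")[0])
    print(f"flash-attn {flash_attn.__version__} found (v2 or later: {major >= 2})")
except ImportError as exc:
    print(f"flash-attn not importable: {exc}")
```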
I tried this with Llama 3.1, and the issue was that the Nvidia driver didn't support the CUDA version in the LoRAX Docker image. In my case, when I executed commands in the Docker image, CUDA was not available, so I tried different Docker images provided by them. Luckily, ghcr.io/predibase/lorax:07addea worked. Apparently, CUDA 12.4 had a mismatch with Nvidia driver version 550.9.
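For reference, the kind of in-container check described above can look like this (a sketch; it assumes the image ships `torch`):

```python
# CUDA availability check to run inside the LoRAX image
# (sketch; assumes torch is installed in the image).
import torch

print("CUDA available:", torch.cuda.is_available())
print("Torch built for CUDA:", torch.version.cuda)
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))
```

If this prints `False`, the host driver most likely doesn't support the CUDA version baked into the image.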
Thanks for the help. But after switching to the Docker image you gave, I'm facing new errors when running unsloth/Meta-Llama-3.1-8B-bnb-4bit. Driver version: 535.183.01, CUDA version: 12.2. Did you succeed in running this one?
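To see whether the host driver is compatible with the CUDA version the container's PyTorch was built against, something like the following can help (a sketch; the minimum-driver numbers are assumptions and should be verified against NVIDIA's CUDA/driver compatibility table):

```python
# Compare the host driver (via nvidia-smi) with the CUDA version the
# container's PyTorch was built against. The minimum-driver numbers are
# approximate; verify against NVIDIA's compatibility table.
import subprocess
import torch

def ver_tuple(v: str):
    # "535.183.01" -> (535, 183, 1); helper for numeric version comparison
    return tuple(int(p) for p in v.split(".") if p.isdigit())

driver = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    text=True,
).strip().splitlines()[0]

built_for = torch.version.cuda  # e.g. "12.2" or "12.4"
print(f"Host driver {driver}, torch built for CUDA {built_for}")

# Approximate Linux minimum driver per CUDA release (assumption; double-check).
min_driver = {"12.2": "535.54.03", "12.4": "550.54.14"}.get(built_for)
if min_driver and ver_tuple(driver) < ver_tuple(min_driver):
    print(f"Driver {driver} is likely too old for CUDA {built_for} "
          f"(needs roughly >= {min_driver}).")
```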
I too had this issue with that particular model. |
Tried the image you suggested; the base model is unsloth 70B Llama 3.1, and I'm getting some dimension mismatches.
Alright, looks like there's a problem with that specific model.
What might be causing the issue? I'm on an RTX 3060.