Cuda 11 support? #790
We have 4,000 NVIDIA A100 GPUs and would like to use DeepSpeed on them. Thing is, during setup.py:

By the way, the llvm line is wrong, too.

Comments
Sparse attention currently only works on V100. However, we will be updating it soon to be compatible with A100 as well.
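For anyone unsure which configuration they have, here is a quick check (a minimal sketch; it assumes PyTorch is installed with CUDA support):

```bash
# Print the GPU's compute capability and the CUDA version PyTorch was built with.
# V100 reports (7, 0); A100 reports (8, 0).
python -c "import torch; print(torch.cuda.get_device_capability(0), torch.version.cuda)"
```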
Excuse me, I'm having the same problem, as I want to use DeepSpeed on A100 GPUs. How can I opt out of sparse attention? Will this mean a dramatic loss of efficiency?
@arashashari Hello Arash! Thanks for the super quick reply. It turns out we have another supercomputer with about 160 of those V100 GPUs, but there, too, we only provide CUDA 11.0 to our users. So for us it's really about CUDA 11, not about the GPU type.
@alexvaca0 By default you will be opted out of installing/using sparse attention in your DeepSpeed install. You should be able to use all the other features of DeepSpeed just fine, though. These are just warnings saying it won't be able to compile the sparse attention ops on your system.
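For what it's worth, a minimal sketch of both install paths, assuming the DS_BUILD_* environment flags from DeepSpeed's advanced-install docs apply to your version:

```bash
# Default install: ops such as sparse attention are compiled lazily at runtime,
# so an unsupported op only produces warnings and is skipped.
pip install deepspeed

# When prebuilding ops, the DS_BUILD_* flags can exclude sparse attention explicitly.
DS_BUILD_OPS=1 DS_BUILD_SPARSE_ATTN=0 pip install deepspeed
```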
@surak: We're actively working on adding support for A100 + CUDA 11 for sparse attention, and will hopefully update soon on this thread. Regarding V100 + CUDA 11, we suspect this will work as-is, but we have not had a chance, or access to a machine with this config, to test it out fully. Would you like to give it a try? If so, here's a branch that allows this config:
I had done EXACTLY the same patch myself, in order to at least let the thing install (I hadn't had time to test it before; I was just making sure everything installed). Thanks!
Other patches I have are:
- Pip tries to install a newer triton, and the version does not really matter (a hypothetical sketch of this pin change follows after this list).
- The "or" kinda fails when llvm 10 is present.
- Tensorboard already changed.
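A hypothetical sketch of the triton pin change described in the first item; the file path and the exact pin are assumptions, not the actual patch from this thread:

```bash
# Hypothetical: relax the hard triton pin so pip accepts the version already installed.
# Adjust the path to wherever your checkout pins triton.
sed -i 's/^triton==.*$/triton/' requirements/requirements-sparse_attn.txt
pip install .
```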
@jeffra Hi, hugely excited about this upcoming support! Would this update be compatible with A6000 + CUDA 11 to utilize sparse attention? Also, would it have any issue with CUDA 11.2?
What's the recommended way to install this? I tried:

but that seems to have issues resolving internal dependencies.
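One sketch of a source install that sidesteps pip's branch dependency resolution, using the branch name mentioned later in this thread (treat the exact branch as an assumption for your setup):

```bash
# Clone the branch and install from the source tree so its own requirements are used.
git clone -b sparse-attn/support-latest-triton https://github.com/microsoft/DeepSpeed.git
cd DeepSpeed
pip install .
```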
Hey @jeffra, would be keen indeed if the SparseAttention kernel is compatible with A100!
@jeffra I tried to run the DeepSpeed branch https://github.com/microsoft/DeepSpeed/tree/sparse-attn/support-latest-triton on A100 and V100, and both pipelines failed with an error.
@jeffra I have A100 and V100 servers and I'm ready to help you test different updates promptly.
Any clues on the exact configuration to run DeepSpeed on CUDA 11 and A100s? @surak
As you saw, I have some patches (mentioned above), and we run it directly on the system, no container involved.
Hi @surak and @ddomingof, @RezaYazdaniAminabadi and @arashashari recently merged PR #902, which should help along this line. Have you tried this again with our latest v0.4 release?
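A quick way to pick up that fix and see which ops are compatible on a given machine (ds_report ships with DeepSpeed; this assumes a v0.4-era install):

```bash
# Upgrade to the latest release, which includes PR #902.
pip install --upgrade deepspeed

# Summarize the environment: CUDA/torch versions and which DeepSpeed ops can be built.
ds_report
```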
@denti Move the layer and the tensor to CUDA; they are located on the CPU.
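In case it helps, a minimal sketch of that fix with hypothetical module and tensor names (not the actual pipeline code):

```bash
# Hypothetical repro: move both the module and its input onto the GPU.
python - <<'PY'
import torch
layer = torch.nn.Linear(8, 8).to('cuda')   # module parameters now live on the GPU
x = torch.randn(2, 8, device='cuda')       # allocate the input on the GPU as well
print(layer(x).device)                     # runs without a CPU/CUDA device mismatch
PY
```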
Closing this issue as it is stale. If anyone is hitting this issue, please re-open or link a new one. Thanks!