
Change the sparse attention API to be compatible with latest changes of triton #902

Merged: 4 commits into master on Jun 2, 2021

Conversation

RezaYazdaniAminabadi
Contributor

This PR changes the parameters passed to the Triton kernels so that DeepSpeed's sparse attention is compatible with the latest Triton.

This addresses #900 and #838.

@sdtblck
Contributor

sdtblck commented Apr 5, 2021

Hi @RezaYazdaniAminabadi, will this update also make the sparse kernels compatible with different GPU architectures (e.g. A100s)?

@RezaYazdaniAminabadi
Contributor Author

> Hi @RezaYazdaniAminabadi, will this update also make the sparse kernels compatible with different GPU architectures (e.g. A100s)?

Hi @sdtblck

Yes, I think the main intention of these changes to Triton is to support the A100. I have already tested on V100, and I will run more tests on A100 as well. Please feel free to try the changes out and let me know if there are any issues.
Thanks,
Reza

@ShivanshuPurohit

Hi @RezaYazdaniAminabadi, I updated our fork to the sparse-attn/support-latest-triton branch, but even then ds_report shows that the A100 isn't compatible with sparsity:

--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
 [WARNING]  sparse_attn requires CUDA version 10.1+, does not currently support >=11 or <10.1
sparse_attn ............ [NO] ....... [NO]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/usr/local/lib/python3.8/dist-packages/torch']
torch version .................... 1.8.0+cu111
torch cuda version ............... 11.1
nvcc version ..................... 11.1
deepspeed install path ........... ['/home/mchorse/gpt-neox/src/deepspeed/deepspeed']
deepspeed info ................... 0.3.13+feb288a, feb288a, HEAD
deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1

But I did notice that triton version in requirements-sparse-attn.txt is 1.0.0.dev20210329 and not 1.1. Could that be the issue here?

@RezaYazdaniAminabadi
Contributor Author

> Hi @RezaYazdaniAminabadi I updated our fork to the sparse-attn/support-latest-triton branch, but even then ds_report shows that A100 isn't compatible with sparsity
> [...]
> But I did notice that triton version in requirements-sparse-attn.txt is 1.0.0.dev20210329 and not 1.1. Could that be the issue here?

The issue is that there are several places, in both the installer and the JIT-compile path, where we guard against using a higher version of Triton. I am going to resolve them soon!
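For illustration, the kind of version guard described here can be sketched as a small helper. This is a hypothetical sketch, not DeepSpeed's actual code; the function name, the comparison logic, and the pinned minimum version are all assumptions made for the example.

```python
# Hypothetical sketch of a Triton version guard of the kind described above.
# Name and minimum version are illustrative, not DeepSpeed's actual code.
def triton_version_ok(installed: str, minimum: str = "1.0.0") -> bool:
    """Compare dotted versions, ignoring dev suffixes like '.dev20210329'."""
    def key(v: str):
        # strip a '.devYYYYMMDD' suffix before comparing numeric components
        return tuple(int(p) for p in v.split(".dev")[0].split("."))
    return key(installed) >= key(minimum)

assert triton_version_ok("1.0.0.dev20210329")  # the pinned dev build passes
assert not triton_version_ok("0.4.0")          # an older Triton is rejected
```

A real guard would need to follow PEP 440 ordering (where dev releases sort before the final release), so this naive tuple comparison is only a sketch of the idea.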

@sdtblck
Contributor

sdtblck commented Apr 7, 2021

@RezaYazdaniAminabadi can confirm this runs on the A100s 🚀. I had to make some changes to the op builder to pass the compatibility checks: https://github.com/EleutherAI/DeeperSpeed/blob/eb9a6a8201215307ba071357a06a9b03c03af3be/op_builder/sparse_attn.py
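The ds_report warning above ("sparse_attn requires CUDA version 10.1+, does not currently support >=11 or <10.1") suggests the op builder's CUDA gate rejected the 11.x line outright. A minimal sketch of relaxing such a gate, assuming a simple major.minor comparison (the function name and thresholds are illustrative, not the actual op_builder code):

```python
# Illustrative sketch (not DeepSpeed's actual op_builder code) of the kind of
# CUDA-version gate that had to be relaxed: the old guard rejected >= 11,
# while the relaxed guard only requires >= 10.1, so CUDA 11.x also passes.
def sparse_attn_cuda_compatible(cuda_version: str) -> bool:
    """Return True when the CUDA toolkit version can build sparse_attn."""
    major, minor = (int(x) for x in cuda_version.split(".")[:2])
    return (major, minor) >= (10, 1)

assert sparse_attn_cuda_compatible("11.1")      # the A100 setup reported above
assert not sparse_attn_cuda_compatible("10.0")  # genuinely too old
```

With a gate like this, the environment shown earlier (torch 1.8.0+cu111, nvcc 11.1) would report sparse_attn as compatible instead of [NO].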

@mgrankin

mgrankin commented Apr 9, 2021

@RezaYazdaniAminabadi sparse-attn/support-latest-triton works for me on an Nvidia 3090, thank you 👍

@sdtblck
Contributor

sdtblck commented May 6, 2021

Hi @RezaYazdaniAminabadi - it seems Triton has completely changed its API and removed triton==1.0.0.dev20210329 from pip. Any plans to update to the latest version?

@exelents

exelents commented May 12, 2021

> Hi @RezaYazdaniAminabadi - it seems Triton has completely changed its API and removed triton==1.0.0.dev20210329 from pip. Any plans to update to the latest version?

Same problem here. I cannot run ruGPT3 on my 3090 because I have to use the newest CUDA, and therefore the newest Triton. But triton==1.0.0.dev20210329 seems to have been removed from pip, and I don't know what to do. The newer triton==1.0.0.dev20210510 does not seem to work for me.

UPD: Here is the error I get when I try to run the ruGPT model:
ImportError: /export/DeepSpeed-triton/deepspeed/ops/sparse_attention/sparse_attn_op.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIN3c107complexIfEEEEPKNS_6detail12TypeMetaDataEv

@exelents

UPD2: You can also get this error if you install the older triton==0.4.0, set up DeepSpeed, and then run this import:

# This cell should run without errors
import deepspeed.ops.sparse_attention.sparse_attn_op

@exelents

UPD3: the problem has been solved.

@arashashari arashashari merged commit 26e3841 into master Jun 2, 2021
@jeffra jeffra mentioned this pull request Jun 14, 2021
elnardu added a commit to socrash-ai/DeepSpeed that referenced this pull request Feb 18, 2022