
[Feature] Enable GPU on M-series Macs #152

Closed
kingzeus opened this issue Mar 19, 2023 · 8 comments

Comments

@kingzeus

Is your feature request related to a problem? Please describe.

Running on CPU under macOS is very slow.
PyTorch supports GPU acceleration on M-series (Apple Silicon) Macs.

Solutions

Availability can be checked with the following function:
torch.backends.mps.is_available()

Change .half().cuda() to .float().to("mps")
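As a hedged sketch of how that check could drive device selection (the helper name `pick_device` is mine, not the repo's; in a real script the flags come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`):

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

# In a real script:
#   import torch
#   device = pick_device(torch.cuda.is_available(), torch.backends.mps.is_available())
print(pick_device(False, True))  # → mps
```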

Running it returns an error (screenshot omitted).

Is there a way to fix this?

Additional context

No response

@chaucerling

#6 (comment)

int64 is supported on macOS 13.3 Beta, and you should also use the nightly build of pytorch.

I tried to use mps backend to run on gpu, but it seems to have a bug when calling the generate function.

@kingzeus
Author

kingzeus commented Mar 19, 2023

#6 (comment)

int64 is supported on macOS 13.3 Beta, and you should also use the nightly build of pytorch.

I tried to use mps backend to run on gpu, but it seems to have a bug when calling the generate function.

It seems to work!

Steps:

  1. Change .half().cuda() to .float().to("mps")
  2. Comment out modeling_chatglm.py lines 33-37:
# flags required to enable jit fusion kernels
# torch._C._jit_set_profiling_mode(False)
# torch._C._jit_set_profiling_executor(False)
# torch._C._jit_override_can_fuse_on_cpu(True)
# torch._C._jit_override_can_fuse_on_gpu(True)
  3. Change modeling_chatglm.py line 268 to:
 dtype = attention_scores.dtype
  4. Run

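The file edit in step 2 can be scripted. The helper below is my sketch, not part of the repo: it comments out the jit fusion flag lines in the text of a local modeling_chatglm.py; verify the line content against your downloaded copy before patching.

```python
# The flag lines exactly as quoted in step 2 above.
JIT_FLAGS = (
    "torch._C._jit_set_profiling_mode(False)",
    "torch._C._jit_set_profiling_executor(False)",
    "torch._C._jit_override_can_fuse_on_cpu(True)",
    "torch._C._jit_override_can_fuse_on_gpu(True)",
)

def comment_out_jit_flags(source: str) -> str:
    """Prefix each jit fusion flag line with '# ' so the MPS run skips them."""
    patched = []
    for line in source.splitlines():
        if line.strip() in JIT_FLAGS:
            patched.append("# " + line)
        else:
            patched.append(line)
    return "\n".join(patched)

print(comment_out_jit_flags("torch._C._jit_set_profiling_mode(False)"))
# → # torch._C._jit_set_profiling_mode(False)
```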

There are some warnings at the moment, but they don't seem to affect usage.
In CPU mode a response takes roughly 300 s; after the changes above it takes only about 5-8 s. Each additional round of conversation adds roughly 2 GB of memory.

Room for improvement:
The model has to be downloaded locally before modeling_chatglm.py can be edited. Under the current code structure there doesn't seem to be a good way around this.

@imClumsyPanda
Contributor

@kingzeus Hi, could you share the PyTorch and macOS versions of your test environment?

@kingzeus
Author

@kingzeus Hi, could you share the PyTorch and macOS versions of your test environment?

macOS 13.2.1
torch 2.0.0
torchaudio 2.0.0.dev20230313
torchvision 0.15.1

@LeeeSe

LeeeSe commented Mar 21, 2023

@kingzeus How do you get around cpm_kernels' RuntimeError: Unknown platform: darwin?

@kingzeus
Author

@kingzeus How do you get around cpm_kernels' RuntimeError: Unknown platform: darwin?

For now, the simplest workaround is to not call the quantization function, i.e. don't use the int4 model.
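A minimal sketch of that workaround: gate the quantize call on the platform. The helper name is mine; `model.quantize(bits)` follows the usual ChatGLM-6B usage, so treat the exact call as an assumption and check your model's README.

```python
import sys

def maybe_quantize(model, bits: int, platform: str = sys.platform):
    """Quantize the model except on macOS, where cpm_kernels raises
    RuntimeError: Unknown platform: darwin."""
    if platform == "darwin":
        return model  # run the unquantized model instead of int4/int8
    return model.quantize(bits)
```

On a Mac this simply returns the full-precision model, trading memory for compatibility.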

@duzx16
Member

duzx16 commented Mar 23, 2023

(quoting kingzeus's steps and timing results above)

@kingzeus Thanks for the method. We have updated modeling_chatglm.py on the HF hub, so it now runs out of the box. Also, changing .float() to .half() saves memory.
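The rough arithmetic behind that memory saving, assuming ~6.2B parameters for ChatGLM-6B (my illustrative figure, not from this thread): float32 weights take 4 bytes per parameter, float16 only 2.

```python
PARAMS = 6_200_000_000  # approximate ChatGLM-6B parameter count (assumption)

def weight_gib(params: int, bytes_per_param: int) -> float:
    """Memory taken by the weights alone, in GiB."""
    return params * bytes_per_param / 2**30

print(f"float32: {weight_gib(PARAMS, 4):.1f} GiB")  # → float32: 23.1 GiB
print(f"float16: {weight_gib(PARAMS, 2):.1f} GiB")  # → float16: 11.5 GiB
```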

@tedyyu

tedyyu commented Mar 27, 2023

But running python web_demo.py directly hits this error. How can it be avoided?

@kingzeus How do you get around cpm_kernels' RuntimeError: Unknown platform: darwin?

For now, the simplest workaround is to not call the quantization function, i.e. don't use the int4 model.

6 participants