UNet results wrong with TensorRT 10.x when running on GPU L40S #4351
Comments
I used Polygraphy to debug the layer precision of the TensorRT model, and I ran into another problem: with a second command, the results changed again. Why is there a result difference under the same tool and the same model?
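For reference, a typical Polygraphy comparison of TensorRT against ONNX-Runtime looks like the sketch below; the model path and tolerances are placeholders, not the exact commands used in this report:

```bash
# Build and run the model with both TensorRT and ONNX-Runtime, then compare
# the outputs within the given absolute/relative tolerances:
polygraphy run model.onnx --trt --onnxrt --atol 1e-3 --rtol 1e-3
```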
By debugging the model line by line against the original code, I finally found that all the weird results came from the code below.
I guess the q/k tensors get fused in the TensorRT engine, and the fused node causes the problem (I suspect there are bugs there). So I routed q/k to the model outputs through a small overhead operation (computing their mean values and returning those), so that the q/k tensors won't be fused by TensorRT. With that change the model outputs are consistent with ONNX/Torch, at the cost of a 3% performance drop. Now I am looking for another way to mark the q/k tensors so that they won't be fused in the TensorRT engine and won't introduce additional operations.
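For illustration, here is a minimal sketch of this workaround, assuming a standard scaled-dot-product attention block; all names (Attention, q_proj, ...) are hypothetical, since the actual UNet code was not included in the issue:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    # A stand-in for the UNet's real attention block.
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):
        q = self.q_proj(x)
        k = self.k_proj(x)
        v = self.v_proj(x)
        out = self.out_proj(F.scaled_dot_product_attention(q, k, v))
        # Workaround: return a cheap reduction of q/k as extra graph outputs
        # so TensorRT cannot fuse the q/k tensors away. These extra mean ops
        # are the small overhead mentioned above.
        return out, q.mean(), k.mean()

# The extra tensors must also be named as outputs when exporting to ONNX:
model = Attention(dim=320).eval()
x = torch.randn(1, 64, 320)
torch.onnx.export(
    model, (x,), "attn.onnx", opset_version=17,
    input_names=["x"], output_names=["out", "q_mean", "k_mean"],
)
```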
Description
I'm trying to convert a UNet (model size 1.9 GB, FP32) exported with opset=17 to TensorRT.
With TRT 8.6 the results were correct, but performance dropped by 15%.
With TRT 10.0.0/10.5/10.8 the results were NaN.
Is there some high-level optimization (e.g., op fusion) introduced in TRT 10.x that may cause the NaN results?
How can I debug the inference procedure so that I can fix my PyTorch model and run it correctly on TRT 10.x?
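One way to localize where the NaNs first appear is a per-layer comparison with Polygraphy; the commands below are a sketch using standard Polygraphy options, not commands from this report:

```bash
# First confirm the ONNX model itself produces no NaN/Inf under ONNX-Runtime:
polygraphy run model.onnx --onnxrt --validate

# Then compare TensorRT against ONNX-Runtime with every tensor marked as an
# output. Marking all outputs also disables many fusions, so if the NaNs
# disappear here, a fusion is likely involved (consistent with the q/k
# observation above).
polygraphy run model.onnx --trt --onnxrt \
    --trt-outputs mark all --onnx-outputs mark all
```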
Environment
TensorRT Version: 8.6, 10.0, 10.5, 10.8
NVIDIA GPU: L40S
NVIDIA Driver Version: 535.161.08
CUDA Version: 12.2
CUDNN Version: 8.4
Operating System: Ubuntu 20.04
Python Version (if applicable): 3.10
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):