Support broadcast index put #3421

chohk88 · 2025-02-27T21:13:17Z

Description

Support broadcasting in index_put (adds a test case).
Indexing with None is still a work in progress.
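
For context, here is a minimal sketch of the eager PyTorch behaviour that the broadcasting case refers to. It is not the converter code from this PR, and the tensor shapes and variable names are chosen only for illustration. The index tensors have different but broadcast-compatible shapes, and the values tensor is broadcast against the indexed region:

```python
import torch

# Target tensor of shape (2, 3, 4), initially all zeros.
x = torch.zeros(2, 3, 4)

# Index tensors with different shapes: (2, 1) and (2,) broadcast to (2, 2),
# selecting positions (0, 0), (0, 2), (1, 0) and (1, 2) in the first two dims.
idx0 = torch.tensor([[0], [1]])
idx1 = torch.tensor([0, 2])

# Values of shape (4,) are broadcast to the indexed shape (2, 2, 4).
values = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Out-of-place index_put; x itself is left unchanged.
out = torch.index_put(x, (idx0, idx1), values)

# Same result via ordinary advanced-indexing assignment on a copy.
expected = x.clone()
expected[idx0, idx1] = values
assert torch.equal(out, expected)
```

A converter test for this feature would presumably compare the TensorRT output against this eager result for the same inputs.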

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/torch_compile_pg.py	2025-02-27 21:13:32.356782+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/torch_compile_pg.py	2025-02-27 21:13:50.969706+00:00
@@ -9,17 +9,22 @@
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
image = load_image(url)


model = PaliGemmaForConditionalGeneration.from_pretrained(
-    model_id, torch_dtype=torch.float16).eval()
+    model_id, torch_dtype=torch.float16
+).eval()
model.to(DEVICE).to(torch.float16)
# model.forward = model.forward.to(torch.float16).eval()

processor = PaliGemmaProcessor.from_pretrained(model_id)
prompt = ""
-model_inputs = processor(text=prompt, images=image, return_tensors="pt").to(torch.float16).to(DEVICE) # to(DEVICE) # .to(torch.float16).to(DEVICE)
+model_inputs = (
+    processor(text=prompt, images=image, return_tensors="pt")
+    .to(torch.float16)
+    .to(DEVICE)
+)  # to(DEVICE) # .to(torch.float16).to(DEVICE)
input_len = model_inputs["input_ids"].shape[-1]

# model.config.token_healing = False

with torch.inference_mode():
@@ -49,13 +54,15 @@
            # "use_fp32_acc": True,
            "debug": True,
            # "use_aot_joint_export":False,
        },
    )
-    
+
    with torch.inference_mode():
-        trt_generation = model.generate(**model_inputs, max_new_tokens=100, do_sample=False) 
+        trt_generation = model.generate(
+            **model_inputs, max_new_tokens=100, do_sample=False
+        )
        trt_generation_out = trt_generation[0][input_len:]
        trt_decoded = processor.decode(trt_generation_out, skip_special_tokens=True)
        print(trt_generation)
        print("TensorRT generated text:")
-        print(trt_decoded)
\ No newline at end of file
+        print(trt_decoded)

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/torch_compile_pg.py	2025-02-28 01:35:45.736123+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/torch_compile_pg.py	2025-02-28 01:36:05.082683+00:00
@@ -9,17 +9,22 @@
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
image = load_image(url)


model = PaliGemmaForConditionalGeneration.from_pretrained(
-    model_id, torch_dtype=torch.float16).eval()
+    model_id, torch_dtype=torch.float16
+).eval()
model.to(DEVICE).to(torch.float16)
# model.forward = model.forward.to(torch.float16).eval()

processor = PaliGemmaProcessor.from_pretrained(model_id)
prompt = ""
-model_inputs = processor(text=prompt, images=image, return_tensors="pt").to(torch.float16).to(DEVICE) # to(DEVICE) # .to(torch.float16).to(DEVICE)
+model_inputs = (
+    processor(text=prompt, images=image, return_tensors="pt")
+    .to(torch.float16)
+    .to(DEVICE)
+)  # to(DEVICE) # .to(torch.float16).to(DEVICE)
input_len = model_inputs["input_ids"].shape[-1]

# model.config.token_healing = False

with torch.inference_mode():
@@ -49,13 +54,15 @@
            # "use_fp32_acc": True,
            "debug": True,
            # "use_aot_joint_export":False,
        },
    )
-    
+
    with torch.inference_mode():
-        trt_generation = model.generate(**model_inputs, max_new_tokens=100, do_sample=False) 
+        trt_generation = model.generate(
+            **model_inputs, max_new_tokens=100, do_sample=False
+        )
        trt_generation_out = trt_generation[0][input_len:]
        trt_decoded = processor.decode(trt_generation_out, skip_special_tokens=True)
        print(trt_generation)
        print("TensorRT generated text:")
-        print(trt_decoded)
\ No newline at end of file
+        print(trt_decoded)

facebook-github-bot added the cla signed label Feb 27, 2025

chohk88 self-assigned this Feb 27, 2025

github-actions bot requested a review from narendasan February 27, 2025 21:13

github-actions bot requested changes Feb 27, 2025

Chengzhe Xu added 2 commits February 28, 2025 01:35

feat: support broadcasting index_put

2bafe0e

chore: linting

175f191

chohk88 force-pushed the support_broadcast_index_put branch from 80bef6e to 175f191 Compare February 28, 2025 01:35

github-actions bot requested changes Feb 28, 2025

chohk88 marked this pull request as draft February 28, 2025 05:45
