Add new WarpReduce overloadings #3884

Open: wants to merge 12 commits into main


@fbusato (Contributor) commented Feb 20, 2025

Partial work toward #3853

Description

  • Add Min and Max overloads in addition to Sum
  • Add overloads that take multiple values per thread for Min, Max, Sum, and Reduce
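
As a rough illustration of what "multiple values per thread" means for a warp-wide reduction, the host-side C++ sketch below simulates one 32-lane warp. Each lane first folds its own items sequentially, then the per-lane partials are combined in a shuffle-down style tree. This is not the CUB implementation and does not call the cub::WarpReduce API; the names kWarpSize, thread_reduce, simulated_warp_reduce, and make_sample_warp are hypothetical and exist only for this example. It also assumes an associative and commutative operator, which holds for Sum, Min, and Max.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Assumption: one warp has 32 lanes, as on current NVIDIA GPUs.
constexpr std::size_t kWarpSize = 32;

// Step 1: a lane folds its own items sequentially (thread-local work).
template <typename T, std::size_t Items, typename Op>
T thread_reduce(const std::array<T, Items>& items, Op op) {
    static_assert(Items > 0, "each lane needs at least one item");
    T acc = items[0];
    for (std::size_t i = 1; i < Items; ++i) {
        acc = op(acc, items[i]);
    }
    return acc;
}

// Step 2: combine the per-lane partials with a tree that halves the
// active range each round, mimicking a shuffle-down reduction.
// The final result ends up in lane 0.
template <typename T, std::size_t Items, typename Op>
T simulated_warp_reduce(const std::array<std::array<T, Items>, kWarpSize>& warp,
                        Op op) {
    std::array<T, kWarpSize> partial{};
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        partial[lane] = thread_reduce(warp[lane], op);
    }
    for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2) {
        for (std::size_t lane = 0; lane < offset; ++lane) {
            partial[lane] = op(partial[lane], partial[lane + offset]);
        }
    }
    return partial[0];
}

// Hypothetical sample data: lane L holds the two values {2*L, 2*L + 1},
// so the warp as a whole holds 0, 1, ..., 63.
inline std::array<std::array<int, 2>, kWarpSize> make_sample_warp() {
    std::array<std::array<int, 2>, kWarpSize> warp{};
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        warp[lane] = {static_cast<int>(2 * lane), static_cast<int>(2 * lane + 1)};
    }
    return warp;
}
```

With the sample data, passing std::plus<int> yields the sum of 0 through 63, and passing a lambda that wraps std::max or std::min yields the largest or smallest value held by any lane. The same two-step structure covers Sum, Min, Max, and a generic Reduce, because only the operator changes.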

@fbusato fbusato added the 3.0 Targeted for 3.0 release label Feb 20, 2025
@fbusato fbusato self-assigned this Feb 20, 2025

copy-pr-bot bot commented Feb 20, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@fbusato fbusato changed the title Add missing WarpReduce overloadings Add new WarpReduce overloadings Feb 21, 2025
@fbusato fbusato marked this pull request as ready for review February 21, 2025 01:02
@fbusato fbusato requested a review from a team as a code owner February 21, 2025 01:02
@fbusato fbusato requested a review from gevtushenko February 21, 2025 01:02

🟨 CI finished in 1h 43m: Pass: 79%/93 | Total: 2d 13h | Avg: 39m 46s | Max: 1h 18m | Hits: 73%/111396
  • 🟨 cub: Pass: 57%/45 | Total: 1d 15h | Avg: 53m 05s | Max: 1h 18m | Hits: 61%/30964

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 04m | Hits:  52%/2092  
      🔍 nvcc               Pass:  55%/43  | Total:  1d 13h | Avg: 52m 37s | Max:  1h 18m | Hits:  61%/28872 
    🟨 ctk
      🟨 12.0               Pass:  40%/5   | Total:  5h 02m | Avg:  1h 00m | Max:  1h 04m | Hits:  49%/2424  
      🟩 12.5               Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 11m | Hits:  47%/2238  
      🟨 12.8               Pass:  57%/38  | Total:  1d 08h | Avg: 51m 13s | Max:  1h 18m | Hits:  63%/26302 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 04m | Hits:  52%/2092  
      🟨 nvcc12.0           Pass:  40%/5   | Total:  5h 02m | Avg:  1h 00m | Max:  1h 04m | Hits:  49%/2424  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 11m | Hits:  47%/2238  
      🟨 nvcc12.8           Pass:  55%/36  | Total:  1d 06h | Avg: 50m 33s | Max:  1h 18m | Hits:  64%/24210 
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  3h 57m | Avg: 59m 21s | Max:  1h 01m
      🟥 Clang15            Pass:   0%/2   | Total:  1h 56m | Avg: 58m 04s | Max: 58m 57s
      🟥 Clang16            Pass:   0%/2   | Total:  1h 59m | Avg: 59m 38s | Max:  1h 01m
      🟥 Clang17            Pass:   0%/2   | Total:  1h 52m | Avg: 56m 09s | Max: 57m 22s
      🟨 Clang18            Pass:  28%/7   | Total:  4h 57m | Avg: 42m 32s | Max:  1h 04m | Hits:  52%/2092  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 58m | Avg: 59m 09s | Max:  1h 00m | Hits:  49%/2424  
      🟩 GCC8               Pass: 100%/1   | Total: 56m 45s | Avg: 56m 45s | Max: 56m 45s | Hits:  49%/1212  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 56m | Avg: 58m 00s | Max: 58m 50s | Hits:  49%/2424  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 06m | Hits:  49%/2424  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 05m | Hits:  49%/2420  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 03m | Hits:  49%/2420  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 53m | Avg: 37m 32s | Max:  1h 18m | Hits:  76%/13310 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 11m
      🟥 MSVC14.42          Pass:   0%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 16m
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 11m | Hits:  47%/2238  
    🟨 cxx_family
      🟨 Clang              Pass:  11%/17  | Total: 14h 42m | Avg: 51m 56s | Max:  1h 04m | Hits:  52%/2092  
      🟩 GCC                Pass: 100%/22  | Total: 17h 57m | Avg: 48m 57s | Max:  1h 18m | Hits:  63%/26634 
      🟥 MSVC               Pass:   0%/4   | Total:  4h 48m | Avg:  1h 12m | Max:  1h 16m
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 11m | Hits:  47%/2238  
    🟨 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 14m | Avg: 24m 41s | Max: 27m 08s | Hits:  82%/3630  
      🟨 rtx2080            Pass:  52%/34  | Total:  1d 11h | Avg:  1h 02m | Max:  1h 18m | Hits:  49%/21284 
      🟨 rtxa6000           Pass:  62%/8   | Total:  3h 16m | Avg: 24m 36s | Max: 57m 03s | Hits:  89%/6050  
    🟨 jobs
      🟨 Build              Pass:  54%/37  | Total:  1d 13h | Avg:  1h 01m | Max:  1h 18m | Hits:  49%/23704 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 39s | Avg: 20m 39s | Max: 20m 39s | Hits:  99%/1210  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 32s | Avg: 16m 32s | Max: 16m 32s | Hits:  99%/1210  
      🟨 HostLaunch         Pass:  66%/3   | Total: 48m 07s | Avg: 16m 02s | Max: 24m 36s | Hits:  99%/2420  
      🟨 TestGPU            Pass:  66%/3   | Total: 44m 36s | Avg: 14m 52s | Max: 22m 21s | Hits:  99%/2420  
    🟨 cpu
      🟨 amd64              Pass:  58%/43  | Total:  1d 13h | Avg: 52m 44s | Max:  1h 18m | Hits:  61%/29754 
      🟨 arm64              Pass:  50%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  49%/1210  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 14m | Avg: 24m 41s | Max: 27m 08s | Hits:  82%/3630  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 18m | Avg:  1h 18m | Max:  1h 18m | Hits:  49%/1210  
    🟨 std
      🟨 17                 Pass:  55%/20  | Total: 20h 26m | Avg:  1h 01m | Max:  1h 16m | Hits:  49%/13067 
      🟨 20                 Pass:  60%/25  | Total: 19h 22m | Avg: 46m 28s | Max:  1h 18m | Hits:  69%/17897 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 21h 06m | Avg: 28m 08s | Max: 1h 00m | Hits: 78%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 35m 43s | Avg: 17m 51s | Max: 24m 38s | Hits:  89%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 20h 16m | Avg: 28m 16s | Max:  1h 00m | Hits:  78%/76573 
      🟩 arm64              Pass: 100%/2   | Total: 50m 16s | Avg: 25m 08s | Max: 26m 52s | Hits:  79%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 40m | Avg: 32m 01s | Max: 47m 05s | Hits:  74%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 34m | Avg: 47m 11s | Max: 49m 07s | Hits:  72%/3562  
      🟩 12.8               Pass: 100%/38  | Total: 16h 51m | Avg: 26m 37s | Max:  1h 00m | Hits:  79%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 45m 50s | Avg: 22m 55s | Max: 23m 06s | Hits:  79%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 40m | Avg: 32m 01s | Max: 47m 05s | Hits:  74%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 34m | Avg: 47m 11s | Max: 49m 07s | Hits:  72%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 06m | Avg: 26m 50s | Max:  1h 00m | Hits:  79%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 45m 50s | Avg: 22m 55s | Max: 23m 06s | Hits:  79%/3562  
      🟩 nvcc               Pass: 100%/43  | Total: 20h 20m | Avg: 28m 23s | Max:  1h 00m | Hits:  78%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 51m | Avg: 27m 59s | Max: 28m 52s | Hits:  79%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 52m 16s | Avg: 26m 08s | Max: 26m 34s | Hits:  79%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 53m 55s | Avg: 26m 57s | Max: 28m 46s | Hits:  79%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 55m 29s | Avg: 27m 44s | Max: 29m 35s | Hits:  79%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 17m | Avg: 19m 39s | Max: 25m 35s | Hits:  85%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 52m 34s | Avg: 26m 17s | Max: 26m 43s | Hits:  79%/3564  
      🟩 GCC8               Pass: 100%/1   | Total: 26m 54s | Avg: 26m 54s | Max: 26m 54s | Hits:  79%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 56m 50s | Avg: 28m 25s | Max: 29m 09s | Hits:  79%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 55m 10s | Avg: 27m 35s | Max: 28m 26s | Hits:  79%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 53m 41s | Avg: 26m 50s | Max: 27m 08s | Hits:  79%/3564  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 00m | Avg: 30m 18s | Max: 31m 39s | Hits:  79%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 36m | Avg: 21m 36s | Max: 32m 06s | Hits:  85%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 34m | Avg: 47m 26s | Max: 47m 47s | Hits:  55%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 24m | Avg: 48m 03s | Max:  1h 00m | Hits:  60%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 34m | Avg: 47m 11s | Max: 49m 07s | Hits:  72%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  6h 51m | Avg: 24m 11s | Max: 29m 35s | Hits:  81%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  8h 41m | Avg: 24m 50s | Max: 32m 06s | Hits:  82%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 59m | Avg: 47m 48s | Max:  1h 00m | Hits:  58%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 34m | Avg: 47m 11s | Max: 49m 07s | Hits:  72%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 42m 33s | Avg: 21m 16s | Max: 26m 01s | Hits:  79%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total: 16h 41m | Avg: 30m 21s | Max: 49m 58s | Hits:  76%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 42m | Avg: 22m 12s | Max:  1h 00m | Hits:  86%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 19h 18m | Avg: 30m 29s | Max:  1h 00m | Hits:  76%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 48m 47s | Avg: 16m 15s | Max: 33m 46s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 58m 38s | Avg: 14m 39s | Max: 26m 01s | Hits:  94%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 42m 33s | Avg: 21m 16s | Max: 26m 01s | Hits:  79%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 32m 06s | Avg: 32m 06s | Max: 32m 06s | Hits:  79%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 28m | Avg: 31m 25s | Max: 49m 58s | Hits:  75%/35611 
      🟩 20                 Pass: 100%/23  | Total: 10h 02m | Avg: 26m 11s | Max:  1h 00m | Hits:  81%/40961 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 13m 33s | Avg: 6m 46s | Max: 10m 56s | Hits: 97%/296

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max: 10m 56s | Hits:  97%/296   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 37s | Avg:  2m 37s | Max:  2m 37s | Hits:  97%/148   
      🟩 Test               Pass: 100%/1   | Total: 10m 56s | Avg: 10m 56s | Max: 10m 56s | Hits:  98%/148   
    
  • 🟩 python: Pass: 100%/1 | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    

👃 Inspect Changes

Modifications in project?

        Project
        CCCL Infrastructure
        libcu++
    +/- CUB
        Thrust
        CUDA Experimental
        python
        CCCL C Parallel Library
        Catch2Helper

Modifications in project or dependencies?

        Project
        CCCL Infrastructure
        libcu++
    +/- CUB
    +/- Thrust
        CUDA Experimental
    +/- python
    +/- CCCL C Parallel Library
    +/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1


🟨 CI finished in 1h 40m: Pass: 79%/93 | Total: 2d 12h | Avg: 39m 18s | Max: 1h 13m | Hits: 74%/111396
  • 🟨 cub: Pass: 57%/45 | Total: 1d 15h | Avg: 52m 27s | Max: 1h 13m | Hits: 61%/30964

    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m | Hits:  53%/2092  
      🔍 nvcc               Pass:  55%/43  | Total:  1d 13h | Avg: 52m 00s | Max:  1h 13m | Hits:  62%/28872 
    🟨 ctk
      🟨 12.0               Pass:  40%/5   | Total:  5h 01m | Avg:  1h 00m | Max:  1h 03m | Hits:  49%/2424  
      🟩 12.5               Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  48%/2238  
      🟨 12.8               Pass:  57%/38  | Total:  1d 08h | Avg: 50m 39s | Max:  1h 13m | Hits:  63%/26302 
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m | Hits:  53%/2092  
      🟨 nvcc12.0           Pass:  40%/5   | Total:  5h 01m | Avg:  1h 00m | Max:  1h 03m | Hits:  49%/2424  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  48%/2238  
      🟨 nvcc12.8           Pass:  55%/36  | Total:  1d 06h | Avg: 50m 01s | Max:  1h 13m | Hits:  64%/24210 
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total:  3h 49m | Avg: 57m 17s | Max:  1h 00m
      🟥 Clang15            Pass:   0%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 06m
      🟥 Clang16            Pass:   0%/2   | Total:  1h 51m | Avg: 55m 49s | Max: 56m 02s
      🟥 Clang17            Pass:   0%/2   | Total:  1h 57m | Avg: 58m 39s | Max:  1h 01m
      🟨 Clang18            Pass:  28%/7   | Total:  4h 59m | Avg: 42m 47s | Max:  1h 03m | Hits:  53%/2092  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 21s | Max: 58m 25s | Hits:  49%/2424  
      🟩 GCC8               Pass: 100%/1   | Total: 56m 08s | Avg: 56m 08s | Max: 56m 08s | Hits:  49%/1212  
      🟩 GCC9               Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m | Hits:  49%/2424  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 53m | Avg: 56m 35s | Max: 56m 44s | Hits:  49%/2424  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 04m | Hits:  49%/2420  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 45s | Max:  1h 00m | Hits:  49%/2420  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 58m | Avg: 38m 01s | Max:  1h 11m | Hits:  76%/13310 
      🟥 MSVC14.29          Pass:   0%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 11m
      🟥 MSVC14.42          Pass:   0%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 13m
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  48%/2238  
    🟨 cxx_family
      🟨 Clang              Pass:  11%/17  | Total: 14h 40m | Avg: 51m 48s | Max:  1h 06m | Hits:  53%/2092  
      🟩 GCC                Pass: 100%/22  | Total: 17h 45m | Avg: 48m 26s | Max:  1h 11m | Hits:  63%/26634 
      🟥 MSVC               Pass:   0%/4   | Total:  4h 40m | Avg:  1h 10m | Max:  1h 13m
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  48%/2238  
    🟨 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 12m | Avg: 24m 09s | Max: 26m 50s | Hits:  83%/3630  
      🟨 rtx2080            Pass:  52%/34  | Total:  1d 10h | Avg:  1h 01m | Max:  1h 13m | Hits:  49%/21284 
      🟨 rtxa6000           Pass:  62%/8   | Total:  3h 29m | Avg: 26m 11s | Max:  1h 00m | Hits:  89%/6050  
    🟨 jobs
      🟨 Build              Pass:  54%/37  | Total:  1d 13h | Avg:  1h 00m | Max:  1h 13m | Hits:  49%/23704 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 23m 38s | Avg: 23m 38s | Max: 23m 38s | Hits:  99%/1210  
      🟩 GraphCapture       Pass: 100%/1   | Total: 18m 29s | Avg: 18m 29s | Max: 18m 29s | Hits:  99%/1210  
      🟨 HostLaunch         Pass:  66%/3   | Total: 50m 55s | Avg: 16m 58s | Max: 26m 21s | Hits:  99%/2420  
      🟨 TestGPU            Pass:  66%/3   | Total: 46m 09s | Avg: 15m 23s | Max: 25m 06s | Hits:  99%/2420  
    🟨 cpu
      🟨 amd64              Pass:  58%/43  | Total:  1d 13h | Avg: 52m 05s | Max:  1h 13m | Hits:  62%/29754 
      🟨 arm64              Pass:  50%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m | Hits:  49%/1210  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 12m | Avg: 24m 09s | Max: 26m 50s | Hits:  83%/3630  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m | Hits:  49%/1210  
    🟨 std
      🟨 17                 Pass:  55%/20  | Total: 20h 07m | Avg:  1h 00m | Max:  1h 12m | Hits:  49%/13067 
      🟨 20                 Pass:  60%/25  | Total: 19h 12m | Avg: 46m 06s | Max:  1h 13m | Hits:  70%/17897 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 20h 49m | Avg: 27m 46s | Max: 57m 52s | Hits: 79%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 33m 58s | Avg: 16m 59s | Max: 22m 51s | Hits:  89%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 19h 59m | Avg: 27m 53s | Max: 57m 52s | Hits:  79%/76573 
      🟩 arm64              Pass: 100%/2   | Total: 50m 34s | Avg: 25m 17s | Max: 26m 36s | Hits:  79%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 42m | Avg: 32m 26s | Max: 46m 27s | Hits:  74%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 35m | Avg: 47m 39s | Max: 47m 45s | Hits:  72%/3562  
      🟩 12.8               Pass: 100%/38  | Total: 16h 32m | Avg: 26m 07s | Max: 57m 52s | Hits:  80%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 42m 52s | Avg: 21m 26s | Max: 22m 08s | Hits:  79%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 42m | Avg: 32m 26s | Max: 46m 27s | Hits:  74%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 35m | Avg: 47m 39s | Max: 47m 45s | Hits:  72%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 15h 49m | Avg: 26m 22s | Max: 57m 52s | Hits:  80%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 52s | Avg: 21m 26s | Max: 22m 08s | Hits:  79%/3562  
      🟩 nvcc               Pass: 100%/43  | Total: 20h 07m | Avg: 28m 04s | Max: 57m 52s | Hits:  79%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 54m | Avg: 28m 44s | Max: 31m 28s | Hits:  79%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 57m 04s | Avg: 28m 32s | Max: 29m 40s | Hits:  79%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 54m 03s | Avg: 27m 01s | Max: 27m 44s | Hits:  79%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 52m 30s | Avg: 26m 15s | Max: 26m 24s | Hits:  79%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 17m | Avg: 19m 41s | Max: 27m 53s | Hits:  85%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 53m 54s | Avg: 26m 57s | Max: 27m 39s | Hits:  79%/3564  
      🟩 GCC8               Pass: 100%/1   | Total: 29m 12s | Avg: 29m 12s | Max: 29m 12s | Hits:  79%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 57m 09s | Avg: 28m 34s | Max: 28m 37s | Hits:  79%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 55m 28s | Avg: 27m 44s | Max: 27m 50s | Hits:  79%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 55m 35s | Avg: 27m 47s | Max: 28m 51s | Hits:  79%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 57m 34s | Avg: 28m 47s | Max: 29m 20s | Hits:  79%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 17m | Avg: 19m 46s | Max: 32m 55s | Hits:  87%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 37m | Avg: 48m 47s | Max: 51m 08s | Hits:  55%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 14m | Avg: 44m 42s | Max: 57m 52s | Hits:  60%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 35m | Avg: 47m 39s | Max: 47m 45s | Hits:  72%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  6h 56m | Avg: 24m 29s | Max: 31m 28s | Hits:  81%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  8h 26m | Avg: 24m 07s | Max: 32m 55s | Hits:  83%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 51m | Avg: 46m 20s | Max: 57m 52s | Hits:  58%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 35m | Avg: 47m 39s | Max: 47m 45s | Hits:  72%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 27m 39s | Avg: 13m 49s | Max: 16m 13s | Hits:  89%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total: 16h 43m | Avg: 30m 25s | Max: 51m 08s | Hits:  76%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 38m | Avg: 21m 49s | Max: 57m 52s | Hits:  86%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 19h 21m | Avg: 30m 34s | Max: 57m 52s | Hits:  76%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 44m 14s | Avg: 14m 44s | Max: 29m 04s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 00s | Avg: 11m 00s | Max: 11m 26s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 27m 39s | Avg: 13m 49s | Max: 16m 13s | Hits:  89%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 29m 31s | Avg: 29m 31s | Max: 29m 31s | Hits:  79%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 33m | Avg: 31m 41s | Max: 51m 08s | Hits:  75%/35611 
      🟩 20                 Pass: 100%/23  | Total:  9h 42m | Avg: 25m 18s | Max: 57m 52s | Hits:  81%/40961 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 13m 25s | Avg: 6m 42s | Max: 10m 55s | Hits: 97%/296

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 13m 25s | Avg:  6m 42s | Max: 10m 55s | Hits:  97%/296   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 30s | Avg:  2m 30s | Max:  2m 30s | Hits:  97%/148   
      🟩 Test               Pass: 100%/1   | Total: 10m 55s | Avg: 10m 55s | Max: 10m 55s | Hits:  98%/148   
    
  • 🟩 python: Pass: 100%/1 | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    


Labels: 3.0 Targeted for 3.0 release
Project status: In Review