Regressions in Span.IndexerBench #80445
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
Run Information
Regressions in Span.IndexerBench
Repro
git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Span.IndexerBench*'
Payloads
Histogram
Span.IndexerBench.KnownSizeCtor2(length: 1024)
Description of detection logic
Docs
Profiling workflow for dotnet/runtime repository
Regressed via #79760. cc @MichalPetryka
Same for
Need to confirm if this is caused by loop alignment.
I can see both loops in the benchmark getting aligned.
G_M41548_IG02: ;; offset=0004H
test rcx, rcx
je SHORT G_M41548_IG07
;; NOP compensation instructions of 4 bytes.
mov eax, dword ptr [rcx+08H]
cmp eax, 512
jb SHORT G_M41548_IG07
lea rdx, bword ptr [rcx+10H]
cmp eax, 0x400
jb SHORT G_M41548_IG07
add rcx, 16
add rcx, 512
xor eax, eax
xor r8d, r8d
align [14 bytes for IG03]
;; size=60 bbWeight=1 PerfScore 7.50
G_M41548_IG03: ;; offset=0040H
mov r10d, r8d
movzx r10, byte ptr [rdx+r10]
xor eax, r10d
movzx rax, al
inc r8d
cmp r8d, 512
jl SHORT G_M41548_IG03
;; size=26 bbWeight=4 PerfScore 17.00
G_M41548_IG04: ;; offset=005AH
xor edx, edx
align [4 bytes for IG05]
;; size=6 bbWeight=1 PerfScore 0.50
G_M41548_IG05: ;; offset=0060H
mov r8d, edx
movzx r8, byte ptr [rcx+r8]
xor r8d, eax
movzx rax, r8b
inc edx
cmp edx, 512
jl SHORT G_M41548_IG05
I think the author of #79760 should investigate further.
I can still see the regression. I will let @EgorBo decide whether we should keep it in .NET 8 or push it to .NET 9. cc: @MichalPetryka
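The `align [N bytes for IGxx]` figures in the listing above can be sanity-checked with a little arithmetic. The following is only a sketch: `padding_to_boundary` is a hypothetical helper written for this illustration, not part of the JIT, and it assumes the JIT is padding to a 32-byte boundary, which is what the resulting offsets 0040H and 0060H suggest. The group offsets, sizes, and padding counts are taken from the listing.

```python
def padding_to_boundary(offset: int, boundary: int = 32) -> int:
    """Bytes of padding needed so the next byte lands on a multiple of `boundary`."""
    return (-offset) % boundary


# IG02 starts at 0x04 and reports size=60, of which 14 bytes are alignment padding,
# so the real instructions end at 0x04 + 60 - 14 = 0x32.
ig02_code_end = 0x04 + 60 - 14
print(hex(ig02_code_end), padding_to_boundary(ig02_code_end))  # 0x32 14

# IG04 starts at 0x5A and reports size=6, of which 4 bytes are padding,
# so the real instructions end at 0x5A + 6 - 4 = 0x5C.
ig04_code_end = 0x5A + 6 - 4
print(hex(ig04_code_end), padding_to_boundary(ig04_code_end))  # 0x5c 4
```

Both results match the `align` annotations, which is consistent with IG03 and IG05 starting exactly on 32-byte boundaries in this listing.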
Realized that this was a linux/x64 issue. Here is the disasm with alignment boundaries. One of the loops crosses the 32B boundary.
G_M41548_IG02: ;; offset=0004H
test rdi, rdi
je SHORT G_M41548_IG07
mov eax, dword ptr [rdi+08H]
cmp eax, 512
jb SHORT G_M41548_IG07
lea rcx, bword ptr [rdi+10H]
cmp eax, 0x400
jb SHORT G_M41548_IG07
add rdi, 16
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (add: 2) 32B boundary ...............................
add rdi, 512
xor eax, eax
xor edx, edx
align [3 bytes for IG03]
;; size=44 bbWeight=1 PerfScore 7.50
G_M41548_IG03: ;; offset=0030H
mov esi, edx
movzx rsi, byte ptr [rcx+rsi]
xor eax, esi
movzx rax, al
inc edx
cmp edx, 512
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (cmp: 4 ; jcc erratum) 32B boundary ...............................
jl SHORT G_M41548_IG03
;; size=22 bbWeight=4 PerfScore 17.00
G_M41548_IG04: ;; offset=0046H
xor ecx, ecx
align [0 bytes for IG05]
;; size=2 bbWeight=1 PerfScore 0.25
G_M41548_IG05: ;; offset=0048H
mov edx, ecx
movzx rdx, byte ptr [rdi+rdx]
xor edx, eax
movzx rax, dl
inc ecx
cmp ecx, 512
jl SHORT G_M41548_IG05
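To make the boundary reasoning concrete, here is a rough sketch that checks whether a byte range straddles a 32-byte boundary, using the offsets and sizes reported in the listings. Both helpers are hypothetical and written for this illustration. The 8-byte length of the fused `cmp edx, 512` / `jl SHORT` pair is an assumption (6 bytes for `cmp` with a 32-bit immediate plus 2 bytes for the short jump); it agrees with the `(cmp: 4 ; jcc erratum)` annotation but is not stated in the thread.

```python
def crosses_boundary(start: int, size: int, boundary: int = 32) -> bool:
    """True if the bytes [start, start + size) span more than one `boundary`-sized chunk."""
    last = start + size - 1
    return start // boundary != last // boundary


def jcc_erratum_hit(start: int, size: int, boundary: int = 32) -> bool:
    """True if a (fused) jump occupying [start, start + size) crosses or ends on a boundary."""
    end = start + size
    return crosses_boundary(start, size, boundary) or end % boundary == 0


# Loop body IG03 in the linux/x64 listing: offset 0x30, size 22.
print(crosses_boundary(0x30, 22))  # True

# Loop body IG03 in the earlier listing: offset 0x40, size 26.
print(crosses_boundary(0x40, 26))  # False

# Assumed 8-byte cmp+jl pair at the end of the linux/x64 IG03 loop.
pair_start = 0x30 + 22 - 8  # 0x3E
print(jcc_erratum_hit(pair_start, 8))  # True
```

Under these assumptions the linux/x64 loop both spills over the 0x40 boundary and has its compare-and-branch pair straddling it, while the loop in the earlier listing fits inside a single 32-byte chunk.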
So it does hit the JCC erratum and the loop is unaligned, and there are no other suspicious codegen pieces? I guess we can then close it or move it out of net8.0 as non-actionable at this point.
Is there an issue tracking the introduction of a workaround for the JCC erratum?