Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regressions in Span.IndexerBench #80445

Closed
performanceautofiler bot opened this issue Jan 10, 2023 · 10 comments
Closed

Regressions in Span.IndexerBench #80445

performanceautofiler bot opened this issue Jan 10, 2023 · 10 comments
Assignees
Labels
arch-x64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI os-linux Linux OS (any supported distro) Priority:2 Work that is important, but not critical for the release tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Milestone

Comments

@performanceautofiler
Copy link

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 1801b56a60576cdd156134ad80872dbf984e62e7
Compare 1808d1c812ba1d1c2873023685530c853eb8dce1
Diff Diff

Regressions in Span.IndexerBench

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
KnownSizeCtor2 - Duration of single invocation 392.43 ns 512.56 ns 1.31 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Span.IndexerBench*'

Payloads

Baseline
Compare

Histogram

Span.IndexerBench.KnownSizeCtor2(length: 1024)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 512.5618748353747 > 411.9805494965428.
IsChangePoint: Marked as a change because one of 1/6/2023 5:32:27 AM, 1/9/2023 6:56:09 PM falls between 1/1/2023 3:57:25 AM and 1/9/2023 6:56:09 PM.
IsRegressionStdDev: Marked as regression because -538.9636427367564 (T) = (0 -512.8962182135796) / Math.Sqrt((0.7929035108989139 / (31)) + (0.33774347131446564 / (14))) is less than -2.016692199226234 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (14) - 2, .025) and -0.30594382410365073 = (392.7398780461416 - 512.8962182135796) / 392.7398780461416 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@performanceautofiler performanceautofiler bot added the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo EgorBo changed the title [Perf] Linux/x64: 1 Regression on 1/6/2023 2:04:15 PM Regressions in Span.IndexerBench Jan 10, 2023
@EgorBo EgorBo transferred this issue from dotnet/perf-autofiling-issues Jan 10, 2023
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 10, 2023
@EgorBo EgorBo added os-linux Linux OS (any supported distro) tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark arch-x64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Jan 10, 2023
@ghost
Copy link

ghost commented Jan 10, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Run Information

Architecture x64
OS ubuntu 18.04
Baseline 1801b56a60576cdd156134ad80872dbf984e62e7
Compare 1808d1c812ba1d1c2873023685530c853eb8dce1
Diff Diff

Regressions in Span.IndexerBench

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
KnownSizeCtor2 - Duration of single invocation 392.43 ns 512.56 ns 1.31 0.00 False

graph
Test Report

Repro

git clone https://github.com/dotnet/performance.git
python3 .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'Span.IndexerBench*'

Payloads

Baseline
Compare

Histogram

Span.IndexerBench.KnownSizeCtor2(length: 1024)


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsRegressionWindowed: Marked as regression because 512.5618748353747 > 411.9805494965428.
IsChangePoint: Marked as a change because one of 1/6/2023 5:32:27 AM, 1/9/2023 6:56:09 PM falls between 1/1/2023 3:57:25 AM and 1/9/2023 6:56:09 PM.
IsRegressionStdDev: Marked as regression because -538.9636427367564 (T) = (0 -512.8962182135796) / Math.Sqrt((0.7929035108989139 / (31)) + (0.33774347131446564 / (14))) is less than -2.016692199226234 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (31) + (14) - 2, .025) and -0.30594382410365073 = (392.7398780461416 - 512.8962182135796) / 392.7398780461416 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Author: performanceautofiler[bot]
Assignees: dakersnar
Labels:

os-linux, tenet-performance, tenet-performance-benchmarks, arch-x64, area-CodeGen-coreclr, untriaged

Milestone: -

@EgorBo
Copy link
Member

EgorBo commented Jan 10, 2023

Regressed via #79760 cc @MichalPetryka
This benchmark can be found in SPMI diffs (regressions) https://dev.azure.com/dnceng-public/public/_build/results?buildId=127048&view=ms.vss-build-web.run-extensions-tab

@EgorBo
Copy link
Member

EgorBo commented Jan 10, 2023

Same for Regressions in System.Collections.IterateForEach<String> benchmark in dotnet/perf-autofiling-issues#11472

@kunalspathak
Copy link
Member

Need to confirm if this is caused by loop alignment.

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Jan 12, 2023
@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Jan 12, 2023
@kunalspathak kunalspathak added the Priority:2 Work that is important, but not critical for the release label Jun 5, 2023
@kunalspathak
Copy link
Member

I can see both the loops in the benchmark getting aligned.

G_M41548_IG02:  ;; offset=0004H
       test     rcx, rcx
       je       SHORT G_M41548_IG07
		  ;; NOP compensation instructions of 4 bytes.
       mov      eax, dword ptr [rcx+08H]
       cmp      eax, 512
       jb       SHORT G_M41548_IG07
       lea      rdx, bword ptr [rcx+10H]
       cmp      eax, 0x400
       jb       SHORT G_M41548_IG07
       add      rcx, 16
       add      rcx, 512
       xor      eax, eax
       xor      r8d, r8d
       align    [14 bytes for IG03]
						;; size=60 bbWeight=1 PerfScore 7.50
G_M41548_IG03:  ;; offset=0040H
       mov      r10d, r8d
       movzx    r10, byte  ptr [rdx+r10]
       xor      eax, r10d
       movzx    rax, al
       inc      r8d
       cmp      r8d, 512
       jl       SHORT G_M41548_IG03
						;; size=26 bbWeight=4 PerfScore 17.00
G_M41548_IG04:  ;; offset=005AH
       xor      edx, edx
       align    [4 bytes for IG05]
						;; size=6 bbWeight=1 PerfScore 0.50
G_M41548_IG05:  ;; offset=0060H
       mov      r8d, edx
       movzx    r8, byte  ptr [rcx+r8]
       xor      r8d, eax
       movzx    rax, r8b
       inc      edx
       cmp      edx, 512
       jl       SHORT G_M41548_IG05

I think the author of #79760 should investigate further. I can still see the regression.

image

I will let @EgorBo decide if we should keep it in .NET 8 or push it to .NET 9.

cc: @MichalPetryk

@kunalspathak kunalspathak assigned EgorBo and unassigned kunalspathak Aug 9, 2023
@kunalspathak
Copy link
Member

Realized that this was linux/x64 issue. Here is the disasm with alignment boundaries. One of them is crossing the 32B boundary.

G_M41548_IG02:  ;; offset=0004H
       test     rdi, rdi
       je       SHORT G_M41548_IG07
       mov      eax, dword ptr [rdi+08H]
       cmp      eax, 512
       jb       SHORT G_M41548_IG07
       lea      rcx, bword ptr [rdi+10H]
       cmp      eax, 0x400
       jb       SHORT G_M41548_IG07
       add      rdi, 16
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (add: 2) 32B boundary ...............................
       add      rdi, 512
       xor      eax, eax
       xor      edx, edx
       align    [3 bytes for IG03]
                                                ;; size=44 bbWeight=1 PerfScore 7.50
G_M41548_IG03:  ;; offset=0030H
       mov      esi, edx
       movzx    rsi, byte  ptr [rcx+rsi]
       xor      eax, esi
       movzx    rax, al
       inc      edx
       cmp      edx, 512
; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (cmp: 4 ; jcc erratum) 32B boundary ...............................
       jl       SHORT G_M41548_IG03
                                                ;; size=22 bbWeight=4 PerfScore 17.00
G_M41548_IG04:  ;; offset=0046H
       xor      ecx, ecx
       align    [0 bytes for IG05]
                                                ;; size=2 bbWeight=1 PerfScore 0.25
G_M41548_IG05:  ;; offset=0048H
       mov      edx, ecx
       movzx    rdx, byte  ptr [rdi+rdx]
       xor      edx, eax
       movzx    rax, dl
       inc      ecx
       cmp      ecx, 512
       jl       SHORT G_M41548_IG05

@EgorBo
Copy link
Member

EgorBo commented Aug 9, 2023

so it does have a jcc erratum and unaligned and there is no other suspicious codegen pieces? I guess we can then close it or move out of net8.0 as non actionable at this point

@MichalPetryka
Copy link
Contributor

Is there any issue tracking introduction of a workaround for the JCC erratum?

@EgorBo
Copy link
Member

EgorBo commented Aug 9, 2023

Is there any issue tracking introduction of a workaround for the JCC erratum?

#35730

@EgorBo EgorBo closed this as completed Aug 9, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-x64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI os-linux Linux OS (any supported distro) Priority:2 Work that is important, but not critical for the release tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Projects
None yet
Development

No branches or pull requests

5 participants