JIT: Slightly relax "const vector propagation" heuristics #76788

EgorBo · 2022-10-09T02:27:15Z

This relaxes the overly conservative heuristic I introduced in #70378. I also changed SPMI to print PerfScore (will revert before merge).

Snippet from #76781

static void Foo(ref byte src, nuint length, ref byte dest)
{
    nuint idx = 0;
    Vector128<byte> mask = Vector128.Create((byte)0x7F);
    if (length >= (uint)Vector128<byte>.Count)
    {
        nuint vecEnd = length - (uint)Vector128<byte>.Count;
        do
        {
            Vector128<byte> vec = Vector128.LoadUnsafe(ref src, idx);
            vec &= mask;
            vec.StoreUnsafe(ref dest, idx);
            idx += (uint)Vector128<byte>.Count;
        } while (idx <= vecEnd);
    }
}

; Method Program:Foo(byref,ulong,byref)
G_M5292_IG01:              
       C5F877               vzeroupper 
G_M5292_IG02:              
       33C0                 xor      eax, eax
+      C4E179100532000000   vmovupd  xmm0, xmmword ptr [reloc @RWD00]
       4883FA10             cmp      rdx, 16
       7223                 jb       SHORT G_M5292_IG05
G_M5292_IG03:              
       4883C2F0             add      rdx, -16
-      90                   align    [1 bytes for IG04]
+      0F1F840000000000     align    [8 bytes for IG04]
G_M5292_IG04:              
-      C5FA6F0401           vmovdqu  xmm0, xmmword ptr [rcx+rax]
-      C5F9DB0513000000     vpand    xmm0, xmm0, xmmword ptr [reloc @RWD00]
-      C4C17A7F0400         vmovdqu  xmmword ptr [r8+rax], xmm0
+      C5F9DB0C01           vpand    xmm1, xmm0, xmmword ptr [rcx+rax]
+      C4C17A7F0C00         vmovdqu  xmmword ptr [r8+rax], xmm1
       4883C010             add      rax, 16
       483BC2               cmp      rax, rdx
       76E4                 jbe      SHORT G_M5292_IG04
G_M5292_IG05:              
       C3                   ret      
RWD00  	dq	7F7F7F7F7F7F7F7Fh, 7F7F7F7F7F7F7F7Fh
-; Total bytes of code: 45
+; Total bytes of code: 53  ;; but better overall perfscore!

ghost · 2022-10-09T02:27:32Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #76781


Author:	EgorBo
Assignees:	EgorBo
Labels:	`area-CodeGen-coreclr`
Milestone:	-

EgorBo · 2022-10-09T13:58:12Z

@jakobbotsch is this a correct way to switch to PerfScore for superpmi-diff job?

jakobbotsch · 2022-10-09T14:04:15Z

> @jakobbotsch is this a correct way to switch to PerfScore for superpmi-diff job?

I think it should work, depending on the size of the diffs. If there is a massive number of diffs, you will hit the problem we always had before #76238, but if it manages to complete, I think the jit-analyze output should show the PerfScore analysis.

EgorBo · 2022-10-14T11:38:11Z

This needs more thought, judging by the regressions 🙁

ghost assigned EgorBo Oct 9, 2022

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Oct 9, 2022

This was referenced Oct 10, 2022

Tracking issue for CI build timeouts #76454

Closed

Azure DevOps Maintenance in dnceng and dnceng-public organizations dotnet/arcade#11188

Closed

Assertion failed: (*card_word)==0 in DynamicGenerics tests #76801

Closed

EgorBo closed this Oct 14, 2022

EgorBo reopened this Nov 6, 2022

attempt dotnet#3

c1af69a

EgorBo force-pushed the fix-cns-vec-prop2 branch from 9b3ae74 to c1af69a Compare November 6, 2022 01:36

EgorBo added 4 commits November 6, 2022 13:17

Scan multiple defs

00c24ae

Update assertionprop.cpp

b6dbb70

ignore complex asgs

a8e9656

Update assertionprop.cpp

2727f5a

EgorBo closed this Nov 8, 2022

ghost locked as resolved and limited conversation to collaborators Dec 8, 2022
