Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fixing the x86 HWIntrinsic tests to not leave a GC hole #16957

Merged
merged 7 commits into from
Mar 19, 2018
Merged

Fixing the x86 HWIntrinsic tests to not leave a GC hole #16957

merged 7 commits into from
Mar 19, 2018

Conversation

tannergooding
Copy link
Member

This resolves https://github.com/dotnet/coreclr/issues/16938, https://github.com/dotnet/coreclr/issues/16370

The tests were incorrectly calling AsPointer on a managed ref.

@tannergooding
Copy link
Member Author

FYI. @fiigii, @CarolEidt, @BruceForstall

@tannergooding
Copy link
Member Author

Is this something that we could readily detect/assert?

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it fixes the issue, I'm happy with it.

Note that you will very likely not get clean gcstress runs, but if the HW intrinsics tests don't fail, that's good enough for this change.

Don't know the answer to your question about an assert.

@CarolEidt
Copy link

Is this something that we could readily detect/assert?

It seems like we should be able to, but I couldn't say for sure without looking at it more closely.

FWIW, it might be worthwhile in future, when changing the templates, to mention (for those who don't review these all the time) that the bulk of the change is auto-generated tests, and only the templates need review.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CarolEidt
Copy link

@tannergooding - there are still a number of failures in the intrinsics tests, though fewer than before (at least in the test runs I looked at). Have you investigated this?

@tannergooding
Copy link
Member Author

Have you investigated this?

Not yet. I should get to this sometime later tonight.

@tannergooding
Copy link
Member Author

Rebased to include the new tests added. Still investigating the remaining issue with the AVX tests

@tannergooding
Copy link
Member Author

Remaining failures are from tests that work with both 128-bit and 256-bit types.

There are a few that are reading/writing too many fields.

@CarolEidt
Copy link

There are a few that are reading/writing too many fields.

Are you planning to merge this and work on those separately?

@tannergooding
Copy link
Member Author

@CarolEidt, we can if you feel that is better.

I am working on resolving the issues in either case and will have either this PR fixed up or a new one up shortly.

@CarolEidt
Copy link

I can go either way - feel free to merge or wait.

@tannergooding
Copy link
Member Author

3 new commits:

  • First one asserts that we are computing the correct alignment and aren't going to overwrite that data
  • Second one changes the templates to track each vector size (op1, op2, ret, etc) separately
    • We still track "largest" vector size for some cases, such as the data table
  • Third one regenerates the tests according to the above changes

@tannergooding
Copy link
Member Author

Hmm, all our jitstress jobs seem to be "missing" and there are no generator jobs pending....

CC. @dotnet/dnceng, @mmitche

@tannergooding
Copy link
Member Author

Looks like the last generator job timed out and left us in some indeterminate state. I've re-queued the job to see if it will complete this time.

@tannergooding
Copy link
Member Author

tannergooding commented Mar 16, 2018

New generator job succeeded and jobs are back.

Link to one of the previous failures (where the initial job existed, but the downstream job was missing): https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_osx10.12_innerloop_flow_prtest/1875/console

There isn't a link to one of the jobs where the test phrase worked, but the actual job didn't exist (all the jitstress fell into this category)

@tannergooding
Copy link
Member Author

@CarolEidt, this should hopefully be ready and let all the jobs succeed now

@tannergooding
Copy link
Member Author

There were a handful of tests which basically use modified template code without actually being generated from a template. Those are fixed now as well.

@tannergooding
Copy link
Member Author

Remaining test failures all look to deal with Extract.

Ex:
%CORE_ROOT%\corerun.exe Sse41_r.exe Extract.Single.1

Fails with:

CORECLR! `Object::ValidateInner'::`1'::catch$12 + 0x148 (0x00007ff8`8656cca6)
CORECLR! CallSettingFrame + 0x20 (0x00007ff8`86454980)
CORECLR! _CxxCallCatchBlock + 0x15A (0x00007ff8`86453dba)
NTDLL! RtlCaptureContext + 0x3E3 (0x00007ff8`c9324203)
CORECLR! Object::ValidateInner + 0xC1 (0x00007ff8`85b220c1)
CORECLR! Object::Validate + 0x18A (0x00007ff8`85b21eda)
CORECLR! GcInfoDecoder::ReportRegisterToGC + 0x123 (0x00007ff8`8623f5bf)
CORECLR! GcInfoDecoder::ReportSlotToGC + 0xCD (0x00007ff8`8623f809)
CORECLR! GcInfoDecoder::EnumerateLiveSlots + 0xCED (0x00007ff8`8623ea15)
CORECLR! EECodeManager::EnumGcRefs + 0x375 (0x00007ff8`85f06d55)
    File: e:\repos\coreclr\src\vm\object.cpp Line: 806
    Image: E:\Repos\coreclr\bin\tests\Windows_NT.x64.Checked\Tests\Core_Root\CoreRun.exe

Further investigation shows that it is specifically SimpleUnaryOpTest__ExtractSingle1::RunClsVarScenario causing the failure (and it is caused by the actual Sse41.Extract call).

JitDump.txt <-- I haven't finished investigating this yet.

@fiigii, can you assist with this? It seems to be a more general issue with Extract across multiple ISAs.

@tannergooding
Copy link
Member Author

op1Reg and tmpTargetReg are being passed incorrectly: https://github.com/dotnet/coreclr/blob/master/src/jit/hwintrinsiccodegenxarch.cpp#L1186

Fixing this removes the GCStress failure, but causes the tests to start failing with "incorrect" results

@tannergooding
Copy link
Member Author

And the "incorrect" results are because the instruction is being encoded incorrectly (it was encoded correctly before, with the operands swapped).

@tannergooding
Copy link
Member Author

And the "incorrect" results are because the instruction is being encoded incorrectly (it was encoded correctly before, with the operands swapped).

Ok, the extract instructions have been updated to have an encoding in the MR section of the instruction data tables.

The SSE4.1 tests were all passing locally, hopefully they do the same in CI now.

@fiigii
Copy link

fiigii commented Mar 18, 2018

It seems to be a more general issue with Extract across multiple ISAs.

Sorry for the delay to reply.

the extract instructions have been updated to have an encoding in the MR section of the instruction data tables.

According to the manual, integral extract instructions have MRI, but float extract has RMI encoding with swapped operands.
But RuyJIT was generating SSE2 extract with RM...

@@ -1183,7 +1183,7 @@ void CodeGen::genSSE41Intrinsic(GenTreeHWIntrinsic* node)
if (baseType == TYP_FLOAT)
{
// extract instructions return to GP-registers, so it needs int size as the emitsize
emit->emitIns_SIMD_R_R_I(ins, emitTypeSize(TYP_INT), op1Reg, tmpTargetReg, (int)i);
emit->emitIns_SIMD_R_R_I(ins, emitTypeSize(TYP_INT), tmpTargetReg, op1Reg, (int)i);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I swapped the operands of floating extract because extractps uses RMI but encodes the target in r/m.
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that those letters are meant to describe 'how' it is encoded:
image

The entry explicitly calls out that Operand 1 is encoded in the ModRM byte in the r/m field and that Operand 2 is encoded in the ModRM byte in the reg field. Since Operand 1 is encoded in the r/m field, this makes it an MRI instruction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, the only difference between MR and RM is whether Op1 or Op2 is encoded in the r/m field of the ModRM byte. MR encodes Op1 there and RM encodes Op2 there.

INST3( pextrb, "pextrb" , 0, IUM_WR, 0, 0, SSE3A(0x14), BAD_CODE, SSE3A(0x14)) // Extract Byte
INST3( pextrd, "pextrd" , 0, IUM_WR, 0, 0, SSE3A(0x16), BAD_CODE, SSE3A(0x16)) // Extract Dword
INST3( pextrq, "pextrq" , 0, IUM_WR, 0, 0, SSE3A(0x16), BAD_CODE, SSE3A(0x16)) // Extract Qword
INST3( extractps, "extractps" , 0, IUM_WR, 0, 0, SSE3A(0x17), BAD_CODE, SSE3A(0x17)) // Extract Packed Floating-Point Values
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think the can have MR and RM encoding both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. I had originally added both in order to mimic the pre-existing vextractf128 table entries; however, those look to be causing issues as well. Testing locally to see if everything passes.

@tannergooding
Copy link
Member Author

Still a failure in the Sse2 tests, looks to be issues with the ScalarUnOpTest template

@tannergooding
Copy link
Member Author

The ScalarUnOpTest template should be good now as well. The failure was being caused by a bad CopyBlockUnaligned statement that was writing a Vector128<T> when it should have only written T.

However, the logic was also way complicated since these tests all took just T as the first operand, so I simplified the template to only pass around T rather than T* or T[].

@tannergooding
Copy link
Member Author

test Windows_NT x64 Checked gcstress0x3
test Windows_NT x64 Checked gcstress0xc
test Windows_NT x64 Checked gcstress0xc_jitstress1
test Windows_NT x64 Checked gcstress0xc_jitstress2
test Windows_NT x64 Checked gcstress0xc_minopts_heapverify1
test Windows_NT x64 Checked gcstress0xc_zapdisable
test Windows_NT x64 Checked gcstress0xc_zapdisable_heapverify1
test Windows_NT x64 Checked gcstress0xc_zapdisable_jitstress2

test Windows_NT x86 Checked gcstress0x3
test Windows_NT x86 Checked gcstress0xc
test Windows_NT x86 Checked gcstress0xc_jitstress1
test Windows_NT x86 Checked gcstress0xc_jitstress2
test Windows_NT x86 Checked gcstress0xc_minopts_heapverify1
test Windows_NT x86 Checked gcstress0xc_zapdisable
test Windows_NT x86 Checked gcstress0xc_zapdisable_heapverify1
test Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2

test Ubuntu x64 Checked gcstress0x3
test Ubuntu x64 Checked gcstress0xc
test Ubuntu x64 Checked gcstress0xc_jitstress1
test Ubuntu x64 Checked gcstress0xc_jitstress2
test Ubuntu x64 Checked gcstress0xc_minopts_heapverify1
test Ubuntu x64 Checked gcstress0xc_zapdisable
test Ubuntu x64 Checked gcstress0xc_zapdisable_heapverify1
test Ubuntu x64 Checked gcstress0xc_zapdisable_jitstress2

@tannergooding
Copy link
Member Author

@dotnet-bot test Windows_NT x64 Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x64 Checked jitstressregs0x10
@dotnet-bot test Windows_NT x64 Checked jitstressregs0x80
@dotnet-bot test Windows_NT x64 Checked jitstressregs1
@dotnet-bot test Windows_NT x64 Checked jitstressregs2
@dotnet-bot test Windows_NT x64 Checked jitstressregs3
@dotnet-bot test Windows_NT x64 Checked jitstressregs4
@dotnet-bot test Windows_NT x64 Checked jitstressregs8

@dotnet-bot test Windows_NT x86 Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x10
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x80
@dotnet-bot test Windows_NT x86 Checked jitstressregs1
@dotnet-bot test Windows_NT x86 Checked jitstressregs2
@dotnet-bot test Windows_NT x86 Checked jitstressregs3
@dotnet-bot test Windows_NT x86 Checked jitstressregs4
@dotnet-bot test Windows_NT x86 Checked jitstressregs8

@dotnet-bot test Ubuntu x64 Checked jitstressregs0x1000
@dotnet-bot test Ubuntu x64 Checked jitstressregs0x10
@dotnet-bot test Ubuntu x64 Checked jitstressregs0x80
@dotnet-bot test Ubuntu x64 Checked jitstressregs1
@dotnet-bot test Ubuntu x64 Checked jitstressregs2
@dotnet-bot test Ubuntu x64 Checked jitstressregs3
@dotnet-bot test Ubuntu x64 Checked jitstressregs4
@dotnet-bot test Ubuntu x64 Checked jitstressregs8

@tannergooding
Copy link
Member Author

Ok, the failing tests are mostly not related, but they should still be investigated.

The exceptions are:

I'm going to keep investigating the remaining HWIntrinsic failures, but will plan on addressing them seperately. I will also work on logging bugs for the unrelated failures

@tannergooding
Copy link
Member Author

FYI. @CarolEidt, @BruceForstall. I have logged a general issue (https://github.com/dotnet/coreclr/issues/17027) to track the further investigation required on the various unrelated JIT Stress failures.

Ideally they would be further investigated and then split out into their own work items, as appropriate.

@CarolEidt
Copy link

@tannergooding - the JIT changes look good. I reviewed the template changes, and they also LGTM. I'm fine with merging this at this point, but let me know if you want me to review the test files you changed manually (unless you converted those to templates?)

@tannergooding
Copy link
Member Author

(unless you converted those to templates?)

@CarolEidt, They have not been converted to templates yet.

@BruceForstall
Copy link
Member

Looks like a lot of jobs are affected by #17008

@tannergooding
Copy link
Member Author

Looks like a lot of jobs are affected by #17008

Yes, this is by far the most widespread and consistent failure.

@tannergooding tannergooding merged commit cd680d4 into dotnet:master Mar 19, 2018
@4creators
Copy link

@tannergooding @fiigii

After this went in I am starting to work on porting remaining tests to templated versions. It would be helpful if we could coordinate our work to avoid overlaps. Right now I plan to work on all scalar tests comprising conversions, loads, stores and comparisons. Any suggestions?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lots of GCStress failures in HWIntrinsics tests
5 participants