Consider mitigating the JCC erratum in the JIT (Nov. 2019) #13795
Comments
cc @AndyAyersMS
Would be helpful to collect more data on the observed perf impact, so we can decide how to prioritize this. Perhaps extra motivation to 32-byte align the code buffer (see #9912 and #8108) for x86/x64; without doing that, we can't possibly fix this. Could possibly do it just for Tier1 methods to try and minimize the fragmentation. Implementing a fix for this in the JIT may be non-trivial: adding nops to fix one branch's alignment may alter span-dependent instruction lengths and so push other branches into bad alignments. Handling macro-fused ops adds a bit of complexity too. Probably needs to be handled as part of branch tightening.
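As a concrete illustration of the alignment precondition mentioned above: without a 32-byte-aligned code buffer, the JIT can't know where emitted instructions fall relative to 32-byte lines. A minimal sketch of rounding a buffer start up to such a boundary (the helper name is illustrative, not CoreCLR code):

```c
#include <assert.h>
#include <stdint.h>

/* Round a code-buffer start address up to the next 32-byte boundary.
 * Only with an aligned buffer can the emitter reason about which
 * instructions cross or end on the 32-byte lines the erratum concerns. */
static uintptr_t align_up_32(uintptr_t p)
{
    return (p + 31) & ~(uintptr_t)31;
}
```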
Would love to run the benchmarks on a couple of different machines (Broadwell, Kaby Lake, Skylake-X).
Is there a specific set of benchmarks that you would like me to use?
CC @adamsitnik
Seems like someone over at the GraalVM project re-ran their benchmark suite, and I think their results pretty much reflect what I've experienced sporadically: Intel is lying through their teeth regarding the 0-5% perf drop (very impressive and concise visualization of 1000 benchmark results, regardless). Given that EVERY result on the screenshot is degraded by more than 5%, I think it's clear this is not unnoticeable... I'd love to get the chance to run a bunch of benchmarks of JIT'd code that are acceptable to the CoreCLR team and provide some results. I can start looking for large suites of .NET BenchmarkDotNet projects, but I'm assuming there is already something that all of the JIT devs use as some sort of reference, much like the GraalVM people?
https://github.com/dotnet/performance ... it is run roughly every 4 hours for .NET Core and I run it once a day for Mono.
Note that the image quoted above has a slider set to "Ignore deviations below 5%" - perhaps that explains why every result on the screenshot is degraded by more than 5%? I do believe it's very important that this issue gets addressed, but I want to make sure we're interpreting the results correctly.
There is a link in the issue to the full analysis:
* 14 benchmarks are better (improved by > 5%)
* 752 are degraded by more than 5%
* 180 unchanged (+/- 5%)
Thanks folks for continuing to engage on this. I've been looking into the performance differences here as well, but don't yet have something worth sharing. I'm aiming for a mix of micro-benchmarks and some that look a bit more realistic. My goal is to understand the general performance cost, as well as which code is responsible for that cost (native vs. managed). From there we can start looking at mitigations. With respect to the data above, the summary looks very similar to what I'm seeing in the microbenchmarks.
Hi, I was wondering, now that it's been about a month since the last update on this issue and 18 days since @BruceForstall added the 5.0 milestone, whether there has been any formal decision. I realize that the 5.0 milestone doesn't necessarily mean that any mitigation will be added at any level (managed/unmanaged code). If no decision has been made yet, I'd like to understand whether the research performed by @brianrob has concluded, and if so, whether it would be possible to share the results of the measured impact.
@damageboy thanks for reaching out on this. I was out for the holidays and have picked this work back up this week. I'm not done yet, but I plan to share methodology and results once they're ready. I also expect that we'll be interested to hear from folks like yourself about how your apps are impacted, to help guide the decision-making process. All that to say: so far things have seemed fairly closed, but I expect to open them up more soon - I just don't yet have data in a form worth sharing that actually helps back up a conclusion. Thanks for your patience.
Sure thing. I will point out, from the get-go, that for us the only sensible decision was to make sure this microcode update never gets applied. So in effect we are experiencing no impact whatsoever, since we have basically, at this point, given up any security expectations from Intel processors. I will also point out that in our testing the impact was considerable, even though it was definitely far from the worst case of a 20% perf drop that I reported.
@brianrob if there are new findings/conclusions about this issue, I would be happy if you could share them here... I personally think it's also completely fine to deprioritize this. Maybe you could share what MSFT is doing internally about this? We've all heard previously how Bing, to name one example, is a big consumer of CoreCLR and contributes and pushes GC and JIT improvements with various fixes/proposals, which end up pushing perf up in small marginal improvements. How are they being affected by this supposed 5%+ perf hit (that is, if you take Intel's word as-is)?
@brianrob any update on this? |
@AndyAyersMS, thanks for pinging this. I am putting together the results to post here soon. |
FYI, I just posted #35730 with the results of the investigation. |
What
In continuation of #13794, it might be worthwhile to consider applying to the code generated by the JIT the same fixes suggested by Intel in their erratum:
https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf
It is hard for me, personally, to evaluate the impact on the performance of code generated by the JIT, but my own small, anecdotal experience shows anywhere from 1%-2% degradation for my own code, up to a potential 20% degradation in very specific instances.
This is in accordance with Intel's own guidance (Section 2.2):
Affected opcodes
The update affects all forms of jumps, direct and indirect, when the instruction itself either ends on or crosses a 32-byte boundary (Section 2.1):
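The affected condition above can be expressed as a simple address check. A sketch based on that wording (names are illustrative, not from any real codebase):

```c
#include <assert.h>
#include <stdint.h>

/* Per the erratum's description, a jump occupying [start, start+len) is
 * affected when it crosses a 32-byte boundary, or when its last byte sits
 * exactly at the end of a 32-byte line. */
static int jcc_erratum_affected(uint64_t start, unsigned len)
{
    uint64_t last = start + len - 1;
    int crosses = (start >> 5) != (last >> 5); /* spans two 32-byte lines */
    int ends_on = ((last + 1) & 31) == 0;      /* last byte at offset 31  */
    return crosses || ends_on;
}
```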
Proposed mitigation for code-generation
The fix proposed by Intel is to pad the affected opcodes with the 0x2E prefix (end of Section 2.4):
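In emitted code, that padding amounts to shifting each affected branch forward until it no longer crosses or ends on a 32-byte boundary; the filler bytes can be realized as redundant 0x2E (CS segment-override) prefixes on the preceding instruction, which are ignored in 64-bit mode, or as NOPs. A hedged sketch of the padding computation (helper names are mine, not the JIT's):

```c
#include <assert.h>
#include <stdint.h>

/* Affected check, restated from Section 2.1: the branch crosses a
 * 32-byte boundary or its last byte ends a 32-byte line. */
static int affected(uint64_t start, unsigned len)
{
    uint64_t last = start + len - 1;
    return (start >> 5) != (last >> 5) || ((last + 1) & 31) == 0;
}

/* How many filler bytes (0x2E prefixes on the prior instruction, or NOPs)
 * must precede a branch of `len` bytes that would otherwise start at
 * `start`, so that the shifted branch is no longer affected. */
static unsigned padding_needed(uint64_t start, unsigned len)
{
    unsigned pad = 0;
    while (affected(start + pad, len))
        pad++;               /* each filler byte shifts the branch by one */
    return pad;
}
```

Note that this is exactly the kind of fixup that interacts with span-dependent instruction sizing, as discussed in the comments above: inserting bytes before one branch can push later branches into bad alignments.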
Affected processors
Section 4.0 specifically lists no fewer than 40 Intel CPUs, covering the breadth of their product line, that are affected by this update:
Proposed fix to the JIT
I personally hold the position that this isn't a small issue (even if we take Intel's claim of 0%-4% as-is) and that it is not going away: neither in terms of performance impact nor in the breadth of the product line affected.
I think the desired behavior, from the standpoint of the JIT, would be to detect whether this microcode update is applied, or whether the CPU model is from the affected family, and to generate code that does not trigger the degradation.
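Detection of an affected CPU could start from the CPUID leaf-1 signature. A sketch (the decode follows the standard Intel family/model encoding; the model list is a partial, illustrative subset of the erratum's Section 4.0, which is considerably longer):

```c
#include <assert.h>
#include <stdint.h>

/* Display family from CPUID leaf 1 EAX (standard Intel encoding). */
static unsigned decode_family(uint32_t eax)
{
    unsigned fam = (eax >> 8) & 0xF;
    if (fam == 0xF)
        fam += (eax >> 20) & 0xFF;    /* extended family */
    return fam;
}

/* Display model from CPUID leaf 1 EAX. */
static unsigned decode_model(uint32_t eax)
{
    unsigned fam = (eax >> 8) & 0xF;
    unsigned mod = (eax >> 4) & 0xF;
    if (fam == 0x6 || fam == 0xF)
        mod |= ((eax >> 16) & 0xF) << 4;  /* extended model */
    return mod;
}

/* Partial, illustrative subset of affected family-6 models. */
static int maybe_jcc_affected(unsigned family, unsigned model)
{
    if (family != 6)
        return 0;
    switch (model) {
    case 0x4E: case 0x5E:   /* Skylake client */
    case 0x55:              /* Skylake-SP / Cascade Lake */
    case 0x8E: case 0x9E:   /* Kaby/Coffee/Whiskey Lake */
        return 1;
    }
    return 0;
}
```

Detecting whether the microcode update is actually applied is harder from user mode; checking the model family is the coarser but portable signal.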
On the face of it, while AMD processors are completely unaffected by this erratum, they also seem not to suffer adverse effects when running code generated by patched assemblers:
https://www.phoronix.com/scan.php?page=news_item&px=AMD-With-Intel-JCC-Assembler
category:cq
theme:alignment
skill-level:intermediate
cost:medium