-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use intrinsic for genLog2() #69333
Use intrinsic for genLog2() #69333
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsAfter recent VC++ upgrade, we saw a bad codegen C++ bug that repros only on release build of windows/arrm64. Explanation is in #69222 (comment). To get around the bug, use intrinsic Here is the code that is generated by various compilers/archs for BitScanForward: Also using Fixes: #69222
|
@dotnet/jit-contrib |
// - Taking the modulo of 'value' with this will produce a unique hash for all | ||
// powers of 2 (which is what "value" is). | ||
// - Entries in hashTable[] which are -1 should never be used. There | ||
// should be PRIME-8*sizeof(value) entries which are -1 . | ||
// | ||
inline | ||
unsigned BitPosition(unsigned value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, but its worth calling out that this is "effectively" doing TrailingZeroCount
, but without handling 0
.
It might be worth centralizing these "primitive bit operation" methods somewhere and ensuring they are generally reusable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened tracking issue #69401
*/ | ||
|
||
inline unsigned int genFindHighestBit(unsigned int mask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. LGTM but this is effectively using LeadingZeroCount
(and then masking out everything but the highest bit) but without strictly handling some of the potential edge cases.
It might be worthwhile to centralize these somewhere (maybe as an "up for grabs" work item)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes LGTM, but some of these methods are "odd" and seemingly restrictive for no reason compared to a more standardized implementation.
We have existing accelerated implementations on the C# side that could be reused in native: https://source.dot.net/#System.Private.CoreLib/BitOperations.cs,af96b203ee2f79f7
Such implementations also handle various edge cases (still optimally) and make the semantics of code clearer IMO. It might be worth an up-for-grabs work item to consider improving some of these more generally outside the necessary bug workaround
Interesting to note that this saves 0.5% to 1% TP when cross compiling to arm64, while there is no TP diff when compiling for x64. Presumably that's because we only use 64-bit [edit] it does save significantly on TP on win-x86 too though, ~0.3%. |
/backport to release/6.0 |
Not sure what is the right text here. @jakobbotsch - do you know? |
AFAIK you got it right... Maybe it is because the thread is locked? I guess you have to open it manually. |
After recent VC++ upgrade, we saw a bad codegen C++ bug that repros only on release build of windows/arrm64. Explanation is in #69222 (comment). To get around the bug, use intrinsic
BitForward64
. I also simplified the code inBitPosition
to always useBitScanForward()
.Here is the code that is generated by various compilers/archs for BitScanForward:
vc++: https://godbolt.org/z/h4e7MqMsE
clang/gcc: https://godbolt.org/z/jWc6YPGd1
Also using
_BitScanReverse()
methods ingenFindHighestBit()
(although they are not used by anyone).Fixes: #69222