
[InstCombine] Fold (icmp pred (trunc nuw/nsw X), C) -> (icmp pred X, (zext/sext C)) #87935

Closed

Conversation

goldsteinn
Contributor

@goldsteinn goldsteinn commented Apr 7, 2024

  • [InstCombine] Add tests for folding (icmp pred (trunc nuw/nsw X), C); NFC
  • [InstCombine] Fold (icmp pred (trunc nuw/nsw X), C) -> (icmp pred X, (zext/sext C))

This is valid as long as the sign of the wrap flag doesn't differ from
the sign of the pred.

Proofs: https://alive2.llvm.org/ce/z/mxWoFd

NB: The online Alive2 hasn't been updated with trunc nuw/nsw
support, so the proofs must be reproduced locally.
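Until the online instance catches up, the claim can also be sanity-checked with a scaled-down exhaustive model (an editorial sketch, not part of the patch: 4-bit "wide" and 2-bit "narrow" integers standing in for i32/i8):

```python
# Exhaustive check of: (icmp pred (trunc nuw/nsw X), C) == (icmp pred X, (zext/sext C))
# in a scaled-down model: W = wide width, N = narrow width.
W, N = 4, 2

def trunc(x, bits=N):
    return x & ((1 << bits) - 1)

def to_signed(x, bits):
    return x - (1 << bits) if x >= (1 << (bits - 1)) else x

def nuw_holds(x):   # trunc nuw: no nonzero high bits are dropped
    return x == trunc(x)

def nsw_holds(x):   # trunc nsw: the signed value is preserved
    return to_signed(x, W) == to_signed(trunc(x), N)

def zext(c):        # narrow constant -> wide, zero-extended
    return c

def sext(c):        # narrow constant -> wide, sign-extended
    return to_signed(c, N) & ((1 << W) - 1)

for x in range(1 << W):
    for c in range(1 << N):
        # unsigned/equality preds are fine with nuw ...
        if nuw_holds(x):
            assert (trunc(x) < c) == (x < zext(c))            # ult
            assert (trunc(x) == c) == (x == zext(c))          # eq
        # ... and signed/equality preds are fine with nsw
        if nsw_holds(x):
            assert (to_signed(trunc(x), N) < to_signed(c, N)) == \
                   (to_signed(x, W) < to_signed(sext(c), W))  # slt
            assert (trunc(x) == c) == (x == sext(c))          # eq

print("fold verified for all", 1 << W, "x", 1 << N, "pairs")
```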

@goldsteinn goldsteinn requested a review from nikic as a code owner April 7, 2024 19:10
@goldsteinn goldsteinn requested review from DeepFlyingSky, antoniofrighetto and dtcxzyw and removed request for DeepFlyingSky April 7, 2024 19:11
@llvmbot
Member

llvmbot commented Apr 7, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes
  • [InstCombine] Add tests for folding (icmp pred (trunc nuw/nsw X), C); NFC
  • [InstCombine] Fold (icmp pred (trunc nuw/nsw X), C) -> (icmp pred X, (zext/sext C))

Full diff: https://github.com/llvm/llvm-project/pull/87935.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+16)
  • (modified) llvm/test/Transforms/InstCombine/icmp-trunc.ll (+144)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index db302d7e526844..6a3f5ba7b0c1e7 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -5705,6 +5705,22 @@ Instruction *InstCombinerImpl::foldICmpWithTrunc(ICmpInst &ICmp) {
   ICmpInst::Predicate Pred = ICmp.getPredicate();
   Value *Op0 = ICmp.getOperand(0), *Op1 = ICmp.getOperand(1);
 
+  // Match (icmp pred (trunc nuw/nsw X), C)
+  // Which we can convert to (icmp pred X, (sext/zext C))
+  // TODO: Maybe this makes sense as a general canonicalization?
+  if (match(Op1, m_ImmConstant())) {
+    if (auto *TI = dyn_cast<TruncInst>(Op0)) {
+      Value *ExtOp0 = TI->getOperand(0);
+      Value *ExtOp1 = nullptr;
+      if (!ICmp.isSigned() && TI->hasNoUnsignedWrap())
+        ExtOp1 = Builder.CreateZExt(Op1, ExtOp0->getType());
+      else if (!ICmp.isUnsigned() && TI->hasNoSignedWrap())
+        ExtOp1 = Builder.CreateSExt(Op1, ExtOp0->getType());
+      if (ExtOp1)
+        return new ICmpInst(Pred, ExtOp0, ExtOp1);
+    }
+  }
+
   // Try to canonicalize trunc + compare-to-constant into a mask + cmp.
   // The trunc masks high bits while the compare may effectively mask low bits.
   Value *X;
diff --git a/llvm/test/Transforms/InstCombine/icmp-trunc.ll b/llvm/test/Transforms/InstCombine/icmp-trunc.ll
index b2de9dddb21947..13f83ac5db7798 100644
--- a/llvm/test/Transforms/InstCombine/icmp-trunc.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-trunc.ll
@@ -555,3 +555,147 @@ define i1 @shl1_trunc_sgt4(i32 %a) {
   %r = icmp sgt i16 %t, 4
   ret i1 %r
 }
+
+define i1 @eq_nuw(i32 %x) {
+; DL64-LABEL: @eq_nuw(
+; DL64-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 255
+; DL64-NEXT:    [[R:%.*]] = icmp eq i32 [[TMP1]], 123
+; DL64-NEXT:    ret i1 [[R]]
+;
+; DL8-LABEL: @eq_nuw(
+; DL8-NEXT:    [[R:%.*]] = icmp eq i32 [[X:%.*]], 123
+; DL8-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i32 %x to i8
+  %r = icmp eq i8 %t, 123
+  ret i1 %r
+}
+
+define i1 @ult_nuw(i32 %x) {
+; CHECK-LABEL: @ult_nuw(
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[X:%.*]], 45
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i32 %x to i8
+  %r = icmp ult i8 %t, 45
+  ret i1 %r
+}
+
+define i1 @ule_nuw(i32 %x) {
+; CHECK-LABEL: @ule_nuw(
+; CHECK-NEXT:    [[T:%.*]] = trunc nuw i32 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i32 [[X]], 46
+; CHECK-NEXT:    call void @use(i8 [[T]])
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i32 %x to i8
+  %r = icmp ule i8 %t, 45
+  call void @use(i8 %t)
+  ret i1 %r
+}
+
+define i1 @ugt_nuw(i32 %x) {
+; CHECK-LABEL: @ugt_nuw(
+; CHECK-NEXT:    [[R:%.*]] = icmp ugt i32 [[X:%.*]], 12
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i32 %x to i8
+  %r = icmp ugt i8 %t, 12
+  ret i1 %r
+}
+
+define i1 @uge_nuw(i48 %x) {
+; CHECK-LABEL: @uge_nuw(
+; CHECK-NEXT:    [[T:%.*]] = trunc nuw i48 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp ugt i48 [[X]], 98
+; CHECK-NEXT:    call void @use(i8 [[T]])
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i48 %x to i8
+  %r = icmp uge i8 %t, 99
+  call void @use(i8 %t)
+  ret i1 %r
+}
+
+define i1 @sgt_nuw_fail(i32 %x) {
+; CHECK-LABEL: @sgt_nuw_fail(
+; CHECK-NEXT:    [[T:%.*]] = trunc nuw i32 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp sgt i8 [[T]], 12
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nuw i32 %x to i8
+  %r = icmp sgt i8 %t, 12
+  ret i1 %r
+}
+
+define i1 @ne_nsw(i32 %x) {
+; DL64-LABEL: @ne_nsw(
+; DL64-NEXT:    [[TMP1:%.*]] = and i32 [[X:%.*]], 255
+; DL64-NEXT:    [[R:%.*]] = icmp ne i32 [[TMP1]], 123
+; DL64-NEXT:    ret i1 [[R]]
+;
+; DL8-LABEL: @ne_nsw(
+; DL8-NEXT:    [[R:%.*]] = icmp ne i32 [[X:%.*]], 123
+; DL8-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i32 %x to i8
+  %r = icmp ne i8 %t, 123
+  ret i1 %r
+}
+
+define i1 @slt_nsw(i32 %x) {
+; CHECK-LABEL: @slt_nsw(
+; CHECK-NEXT:    [[R:%.*]] = icmp slt i32 [[X:%.*]], 45
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i32 %x to i8
+  %r = icmp slt i8 %t, 45
+  ret i1 %r
+}
+
+define i1 @sle_nsw(i32 %x) {
+; CHECK-LABEL: @sle_nsw(
+; CHECK-NEXT:    [[T:%.*]] = trunc nsw i32 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp slt i32 [[X]], 46
+; CHECK-NEXT:    call void @use(i8 [[T]])
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i32 %x to i8
+  %r = icmp sle i8 %t, 45
+  call void @use(i8 %t)
+  ret i1 %r
+}
+
+define i1 @sgt_nsw(i32 %x) {
+; CHECK-LABEL: @sgt_nsw(
+; CHECK-NEXT:    [[R:%.*]] = icmp sgt i32 [[X:%.*]], 12
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i32 %x to i8
+  %r = icmp sgt i8 %t, 12
+  ret i1 %r
+}
+
+define i1 @sge_nsw(i48 %x) {
+; CHECK-LABEL: @sge_nsw(
+; CHECK-NEXT:    [[T:%.*]] = trunc nsw i48 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp sgt i48 [[X]], 98
+; CHECK-NEXT:    call void @use(i8 [[T]])
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i48 %x to i8
+  %r = icmp sge i8 %t, 99
+  call void @use(i8 %t)
+  ret i1 %r
+}
+
+define i1 @ule_nsw_fail(i32 %x) {
+; CHECK-LABEL: @ule_nsw_fail(
+; CHECK-NEXT:    [[T:%.*]] = trunc nsw i32 [[X:%.*]] to i8
+; CHECK-NEXT:    [[R:%.*]] = icmp ult i8 [[T]], 46
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %t = trunc nsw i32 %x to i8
+  %r = icmp ule i8 %t, 45
+  ret i1 %r
+}

@goldsteinn goldsteinn changed the title goldsteinn/trunc nuw nsw icmp fold [InstCombine] Fold (icmp pred (trunc nuw/nsw X), C) -> (icmp pred X, (zext/sext C)) Apr 7, 2024
@goldsteinn
Contributor Author

Thinking long term, we can probably generalize just about any fold we were previously doing with sext to work for trunc nsw, zext to work with trunc nuw, and zext nneg to work with trunc nuw nsw.

@nikic
Contributor

nikic commented Apr 9, 2024

Generally fine, but before we start using trunc nuw/nsw in optimizations we need to perform some due diligence. Based on past experience, we very likely have folds that incorrectly preserve the nuw/nsw flag now.

Approach is basically 1. replace "trunc" with "trunc nuw" in existing tests, 2. check whether there are failures with alive2, 3. repeat with trunc nsw. Let me know if you would be interested in doing that...
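A minimal sketch of step 1 of that process (the sample IR and the purely textual rewrite are illustrative; the alive2 verification still has to happen out-of-band):

```python
# Mutate an existing InstCombine-style test from plain `trunc` to
# `trunc nuw` (and separately `trunc nsw`), producing inputs to re-verify
# locally with alive2.
SAMPLE_TEST = """\
define i1 @f(i32 %x) {
  %t = trunc i32 %x to i8
  %r = icmp eq i8 %t, 7
  ret i1 %r
}
"""

def mutate(text: str, flag: str) -> str:
    # naive textual rewrite; adequate for mass-mutating test files
    return text.replace("= trunc ", f"= trunc {flag} ")

nuw_version = mutate(SAMPLE_TEST, "nuw")
nsw_version = mutate(SAMPLE_TEST, "nsw")
assert "= trunc nuw i32 %x to i8" in nuw_version
assert "= trunc nsw i32 %x to i8" in nsw_version
# step 2 (manual): run opt + alive-tv on each mutated file and triage failures
```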

@goldsteinn
Contributor Author

Generally fine, but before we start using trunc nuw/nsw in optimizations we need to perform some due diligence. Based on past experience, we very likely have folds that incorrectly preserve the nuw/nsw flag now.

Approach is basically 1. replace "trunc" with "trunc nuw" in existing tests, 2. check whether there are failures with alive2, 3. repeat with trunc nsw. Let me know if you would be interested in doing that...

I'll look into that. I would think the only way this really becomes an issue is if we have replaceOperand on a trunc. Are there any other use cases where we would erroneously preserve flags?

@nikic
Contributor

nikic commented Apr 10, 2024

Generally fine, but before we start using trunc nuw/nsw in optimizations we need to perform some due diligence. Based on past experience, we very likely have folds that incorrectly preserve the nuw/nsw flag now.
Approach is basically 1. replace "trunc" with "trunc nuw" in existing tests, 2. check whether there are failures with alive2, 3. repeat with trunc nsw. Let me know if you would be interested in doing that...

I'll look into that. I would think the only way this really becomes an issue is if we have replaceOperand on a trunc. Are there any other use cases where we would erroneously preserve flags?

It tends to be either in-place modification, or returning of existing values. Here are a couple of examples of issues I ran into with disjoint or: d8bc546, 07c18a0, cd31cf5. We had similar issues with zext nneg. Most likely there will be some for trunc nuw/nsw as well.
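As a toy illustration of that hazard (an editorial sketch, not LLVM code): a wrap flag is a proof about the current operand, so an in-place operand replacement must drop it:

```python
# Toy model of the flag-preservation hazard: `trunc nuw` asserts that the
# operand has no set bits above the narrow width. Replacing the operand in
# place (a replaceOperand-style rewrite) without clearing the flag leaves a
# stale, now-false assertion behind.
NARROW_BITS = 2

def trunc(x):
    return x & ((1 << NARROW_BITS) - 1)

def nuw_valid(x):
    return x == trunc(x)   # the fact the nuw flag encodes

old_operand = 3            # fits in 2 bits: nuw was correctly inferred
assert nuw_valid(old_operand)

new_operand = 7            # swapped in by an in-place rewrite
assert not nuw_valid(new_operand)   # keeping the stale flag is a miscompile
assert trunc(new_operand) == 3      # the trunc really does drop a bit
```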

@nikic
Contributor

nikic commented Apr 29, 2024

I went ahead and did this myself -- and didn't find any issues this time around. Maybe we don't have that many interesting trunc combines...

Value *ExtOp1 = nullptr;
if (!ICmp.isSigned() && TI->hasNoUnsignedWrap())
ExtOp1 = Builder.CreateZExt(Op1, ExtOp0->getType());
else if (!ICmp.isUnsigned() && TI->hasNoSignedWrap())
Contributor

This is actually also valid for isUnsigned() predicates: https://alive2.llvm.org/ce/z/MiDiYZ

@goldsteinn goldsteinn force-pushed the goldsteinn/trunc-nuw-nsw-icmp-fold branch from 42fc67c to 3f3b660 Compare May 2, 2024 20:00
Contributor

@nikic nikic left a comment

This looks basically good to me, the only question I have is whether this should be behind a shouldChangeType check, similar to the other trunc+icmp combine?

If we have a trunc i128 to i8 or so, it's not clear that converting from an icmp i8 to icmp i128 is a good transform. (Similar for the i48 example in the tests.)

@goldsteinn
Contributor Author

This looks basically good to me, the only question I have is whether this should be behind a shouldChangeType check, similar to the other trunc+icmp combine?

If we have a trunc i128 to i8 or so, it's not clear that converting from an icmp i8 to icmp i128 is a good transform. (Similar for the i48 example in the tests.)

Yeah, I think shouldChangeType makes sense. I'll update.

@goldsteinn goldsteinn force-pushed the goldsteinn/trunc-nuw-nsw-icmp-fold branch from 3f3b660 to 1f86ade Compare May 3, 2024 03:54

// Match (icmp pred (trunc nuw/nsw X), C)
// Which we can convert to (icmp pred X, (sext/zext C))
if (isDesirableIntType(SrcBits) || shouldChangeType(DstBits, SrcBits)) {
Contributor

Do we need both isDesirableIntType and shouldChangeType here?

@goldsteinn
Contributor Author

ping

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request May 11, 2024
@dtcxzyw
Member

dtcxzyw commented May 11, 2024

Approach is basically 1. replace "trunc" with "trunc nuw" in existing tests, 2. check whether there are failures with alive2, 3. repeat with trunc nsw. Let me know if you would be interested in doing that...

IIRC @regehr is developing a fuzzer that mutates existing instcombine tests :)

Member

@dtcxzyw dtcxzyw left a comment

LGTM. There are some regressions. But I don't think we should block this patch.

if (!Cmp.isSigned() && Trunc->hasNoUnsignedWrap())
return new ICmpInst(Pred, X, ConstantInt::get(SrcTy, C.zext(SrcBits)));
if (Trunc->hasNoSignedWrap())
return new ICmpInst(Pred, X, ConstantInt::get(SrcTy, C.sext(SrcBits)));
Member

I prefer the sign-extended constant as it may be cheaper to materialize than the zero-extended one (e.g., -1 vs 4294967296).

Don't do this until we have some performance data.

Contributor Author

Okay, I'll leave it as a potential TODO for now.

@nikic
Copy link
Contributor

nikic commented May 11, 2024

There are some regressions. But I don't think we should block this patch.

What would likely avoid regressions would be to limit this to one-use truncs. I assume that the main problem is that it's hard to infer facts about the truncated value after conditions have been changed to use the non-truncated value.

@nikic
Copy link
Contributor

nikic commented May 11, 2024

By the way, I don't seem to be getting notifications when you force-push a PR, so I usually don't notice when you address feedback in PRs. Which is kind of weird, because I receive stupendous amounts of force-push spam from AtariDream's pull requests. I don't really get what the difference is.


// Match (icmp pred (trunc nuw/nsw X), C)
// Which we can convert to (icmp pred X, (sext/zext C))
if (shouldChangeType(DstBits, SrcBits)) {
Contributor

Shouldn't this be passing the types, not the bit width? That way we will not transform vector cases, for which we don't have sensible profitability heuristics here.

Contributor Author

@goldsteinn goldsteinn May 11, 2024

Sure, in this case the shouldChangeType that accepts types just forwards the primitive bit width to the bit-width helper, but if you think this version is more future-proof, I'm happy to change it.
Edit: Misread the Type * impl.

At least IMO, handling vectors is beneficial. If we have no profitability heuristic, I think simpler IR is a fair tiebreaker.

Contributor

@nikic nikic May 12, 2024

This isn't really true for vectors. If the previous trunc + icmp sequence worked on legal types and you convert it into one that works on illegal types, that may be catastrophically worse (down to scalarization). There's a reason the special handling for vectors is there.

(This is especially true in the multi-use case, where you're not even removing the trunc.)

Contributor

Here's a trivial example for aarch64: https://llvm.godbolt.org/z/K3hh3bx65 Even in the one-use case, getting rid of the trunc is worse for codegen. I haven't checked, but I expect your patch would currently do this transform because i32 is legal but i8 is not (for scalars).

Contributor Author

Updated.

@regehr
Contributor

regehr commented May 11, 2024

IIRC @regehr is developing a fuzzer that mutates existing instcombine tests :)

It is my student Yuyou who has written this, but I have had a hand in it, and I use it often. If anyone else wants to run it, build this branch of Alive2:

https://github.com/Hatsunespica/alive2/tree/mutate

It can be used as a standalone mutator, but it can also generate a mutant and analyze it using Alive2 without forking, which makes it very fast. I'm mostly fuzzing the AArch64 backend these days, so if anyone wants to do middle-end fuzzing, that would be good!

@goldsteinn
Contributor Author

There are some regressions. But I don't think we should block this patch.

What would likely avoid regressions would be to limit this to one-use truncs. I assume that the main problem is that it's hard to infer facts about the truncated value after conditions have been changed to use the non-truncated value.

NB, part of this is canonicalization IMO. The original motivation was whether we should extend the trunc handling in computeKnownBitsFromICmpCond.

@goldsteinn
Contributor Author

By the way, I don't seem to be getting notifications when you force-push a PR, so I usually don't notice when you address feedback in PRs. Which is kind of weird, because I receive stupendous amounts of force-push spam from AtariDream's pull requests. I don't really get what the difference is.

Hmm, would it help if I add a comment when I force-push?

@goldsteinn goldsteinn force-pushed the goldsteinn/trunc-nuw-nsw-icmp-fold branch from e6e5f52 to 0344081 Compare May 11, 2024 17:07
…X, (zext/sext C))`

This is valid as long as the sign of the wrap flag doesn't differ from
the sign of the `pred`.

Proofs: https://alive2.llvm.org/ce/z/35NsrR

NB: The online Alive2 hasn't been updated with `trunc nuw/nsw`
support, so the proofs must be reproduced locally.
@goldsteinn goldsteinn force-pushed the goldsteinn/trunc-nuw-nsw-icmp-fold branch from 0344081 to 5b4e99e Compare May 12, 2024 21:12
@goldsteinn
Contributor Author

goldsteinn commented May 15, 2024

By the way, I don't seem to be getting notifications when you force-push a PR, so I usually don't notice when you address feedback in PRs. Which is kind of weird, because I receive stupendous amounts of force-push spam from AtariDream's pull requests. I don't really get what the difference is.

Hmm, would it help if I add a comment when I force-push?

@nikic, without trying to sound passive-aggressive or to rush you:

Do you get notifications when I comment in a PR, or are those lost like the force-push notifications?

Edit: Wondering if I'm going to need to start tagging you.

Contributor

@nikic nikic left a comment

LGTM

@nikic
Contributor

nikic commented May 15, 2024

@goldsteinn I do get the comments. I also got a notification for your last force-push. But not for the preceding one, for example.

@vitalybuka
Collaborator

This patch is causing false MSAN reports.
It's likely a bug in MSAN, but I haven't figured out the fix yet.

Would it be OK to revert the patch while I am investigating, to avoid false reports for us and other MSAN users?

@goldsteinn
Contributor Author

This patch is causing false MSAN reports. It's likely a bug in MSAN, but I haven't figured out the fix yet.

As in breaks MSAN, or MSAN now complains about LLVM?

Would it be OK to revert the patch while I am investigating, to avoid false reports for us and other MSAN users?

I guess preferably not. Can you wait until there are some actual reports / an issue justifying it?

@vitalybuka
Collaborator

This patch is causing false MSAN reports. It's likely a bug in MSAN, but I haven't figured out the fix yet.

As in breaks MSAN, or MSAN now complains about LLVM?

MSan reports false bugs in user code.

Would it be OK to revert the patch while I am investigating, to avoid false reports for us and other MSAN users?

I guess preferably not. Can you wait until there are some actual reports / an issue justifying it?

I have multiple reports on our internal codebase; I just need to minimize them. Inconveniently, I am traveling this week.
Also, as soon as I have a reduced reproducer I will likely have an MSan patch. Usually it's just overly aggressive handling of some instruction which needs to be relaxed.

Also if we wait, it can be branched into clang RC, which will need cherry-picks, potentially with conflicts.

@goldsteinn
Contributor Author

This patch is causing false MSAN reports. It's likely a bug in MSAN, but I haven't figured out the fix yet.

As in breaks MSAN, or MSAN now complains about LLVM?

MSan reports false bugs in user code.

Would it be OK to revert the patch while I am investigating, to avoid false reports for us and other MSAN users?

I guess preferably not. Can you wait until there are some actual reports / an issue justifying it?

I have multiple reports on our internal codebase; I just need to minimize them. Inconveniently, I am traveling this week. Also, as soon as I have a reduced reproducer I will likely have an MSan patch. Usually it's just overly aggressive handling of some instruction which needs to be relaxed.

Also if we wait, it can be branched into clang RC, which will need cherry-picks, potentially with conflicts.

The thing is, we want this patch (from a codegen POV). It is necessary to clean up regressions otherwise associated with #90436.
I understand you want a clean MSAN run, but tools like this are meant to be helpful, not impeding. If there is a bug in the tool, it doesn't seem so helpful in this case.
Is there any way to disable MSAN for this region (or just disable this fold if MSAN is enabled) as a stopgap?
If not, I'll defer to @nikic; if he thinks the regressions without this are acceptable, then revert, but please keep me updated on the progress of the fix.

@nikic
Contributor

nikic commented May 24, 2024

I think it's okay to revert in the meantime. Though you might want to check whether #93163 has fixed the msan issue -- I feel like it may be related (incorrect inference of trunc nuw/nsw for values that may be undef).

Also if we wait, it can be branched into clang RC, which will need cherry-picks, potentially with conflicts.

This is still two months away, so that's unlikely :)

@vitalybuka
Collaborator

#93163 didn't help.

I understand you want a clean MSAN run, but tools like this are meant to be helpful, not impeding. If there is a bug in the tool, it doesn't seem so helpful in this case.

I see it in a different way.
Clang is a software product, MSan is an established feature of the product, and users rely on it. A new optimization triggering a pre-existing bug still regresses existing features. Even if the optimization is correct on its own, it's still a regression of the product. Usually it's up to the author to make sure there is no regression to existing features (though that can be just reverting and asking maintainers for help). In rare cases it's impossible to make both features work correctly and a trade-off is needed, but there is no evidence that this is the case here.

With a reproducer it would be totally fair to revert this patch. However, it's often better to revert early with just a promise to publish a reproducer slightly later.

I guess I will have time in the next 24h to investigate. I am still open to three possibilities:

  1. The patch is wrong
  2. MSan is wrong (most likely)
  3. The user's code has UB

I will get back with the result soon.

@vitalybuka
Collaborator

vitalybuka commented May 27, 2024

A trivial reproducer: https://godbolt.org/z/TaxWPPrdq

Before the patch, the bit field was completely isolated (masked) before the comparison; now it just compares against different constants:
https://godbolt.org/z/Y54hr7oer

The irrelevant bits of the comparison argument are uninitialized. MSan can't handle this for the signed case, but the unsigned version is OK: https://godbolt.org/z/dMGeqaGEj
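A toy model of what changed (an editorial sketch in Python, not the actual reproducer): the old IR's mask makes the compare depend only on the field's bits, while the folded form reads the whole word, so MSan's shadow propagation now sees the uninitialized neighboring bits feeding the compare:

```python
# Old IR: (and i32 %x, 255) == 123  -- only the low 8 bits matter.
# New IR: icmp eq i32 %x, 123       -- every bit of the word is read.
FIELD_MASK = 0xFF

def old_form(word):
    return (word & FIELD_MASK) == 123

def new_form(word):
    return word == 123   # zext/sext of i8 123 to i32 is still 123

garbage = 0xAA                      # stand-in for the uninitialized bits
word = (garbage << 8) | 123         # field holds 123, neighbors are junk

assert old_form(word)               # mask isolates the field: no junk read
assert not new_form(word)           # the wide compare depends on the junk
# On real executions the trunc nuw/nsw flag guarantees the high bits are
# consistent, so both forms agree; but MSan reasons about shadow bits, and
# the folded compare consumes "uninitialized" shadow -> false report.
```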

@vitalybuka
Collaborator

I guess we can fix forward by enabling -msan-handle-icmp-exact; it's already used for unsigned comparisons.

FYI @eugenis
