-Wenum-constexpr-conversion should be a hard error, not a downgradable error #59036
https://clang.llvm.org/docs/ReleaseNotes.html#potentially-breaking-changes (for Clang 16) says it's an error by default but is currently downgradable (until Clang 17). So I guess this bug is to track making it non-downgradable in Clang 17. See also #50055.
Nice, I wasn't aware of that, thanks! Yes, that makes a lot of sense, thanks for pointing it out. Let's keep this bug as a reminder. I will open a new ticket for having a warning that also works in non-constexpr contexts.
Did this happen?
This fixes the following Clang 16 compilation error:

kdevelop/plugins/clang/duchain/cursorkindtraits.h:217:7: error: integer value -1 is outside the valid range of values [0, 255] for the enumeration type 'CommonIntegralTypes' [-Wenum-constexpr-conversion]
      : static_cast<IntegralType::CommonIntegralTypes>(-1);

Quote from llvm/llvm-project#59036: "The -Wenum-constexpr-conversion warning was created to account for the fact that casting integers to enums outside of the valid range of the enum is UB in C++17. Constant expressions invoking UB lead to an ill-formed program."

BUG: 471995
FIXED-IN: 5.12.230800
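For context, a minimal sketch of the pattern that trips this diagnostic. The enum below is a hypothetical stand-in for IntegralType::CommonIntegralTypes, not the actual KDevelop type:

```cpp
// Hypothetical stand-in: an unscoped enum whose enumerators all fit in
// [0, 255], which is therefore its valid range of values.
enum CommonIntegralTypes { TypeNone = 0, TypeLast = 255 };

// UB: -1 lies outside [0, 255]. In a constant expression this makes the
// program ill-formed, so Clang 16 diagnoses it under
// -Wenum-constexpr-conversion (currently as a downgradable error).
constexpr auto t = static_cast<CommonIntegralTypes>(-1);
```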
Casting an int into an enum is undefined behavior if the int is outside of the range of the enum. UB is not allowed in constant expressions, therefore the compiler must produce a hard error in that case. However, until now, the compiler produced a warning that could be suppressed. It should instead be a hard error, since the program is ill-formed in that case, as per the C++ Standard. This patch turns the warning into an error. Additionally, references to the old warning are removed since they are now meaningless. Fixes llvm#59036
@AaronBallman @dwblaikie We are now on Clang 19; do you think it's the right time to pull the trigger and turn this into a non-downgradable error? It has also been enabled in system headers since Clang 18.
Looks like binutils is still suppressing this warning: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/diagnostics.h;h=8cc2b493d2c02ba7dbc5879207746107ad7143a0;hb=refs/heads/master#l81

As is OpenMP in the LLVM project itself.
And I also found this. At least for the OpenMP usage, and probably the binutils usage, it'd be nice to see if some progress can be made on removing those warning suppressions.
I expect that so long as we leave it a downgradable diagnostic, people will continue to not fix their code, because that's easier or because they're unaware of the issue. The OpenMP find is a wonderful example; it was disabled two years ago "to buy ourselves more time" and no attempts have been made to actually fix the issue since then: 9ff0cc7. That said, Clang 19 might be a bit too soon, given that we enabled it in system headers in Clang 18, which isn't yet released. I think Clang 20/21 might be somewhat better because that gives folks another year to discover the problems with their uses, but at some point I think we really do need to force the issue; otherwise the can will be kicked down the road forever.
FYI I'm bringing this issue to the binutils project. I agree with Aaron: it's way easier to just suppress the warning and call it a compiler bug than to actually do some research and understand the subtle issues involved in these tricks with enums. I think that the more we delay it, the more projects will suppress the warning as they bump their clang installation, and the harder it will be for us to introduce the error. That said, I'm not opposed to waiting another release or two.
This effectively reverts commit 9ff0cc7. For some reason "git revert" led to "no changes" after fixing conflicts, so a clean revert was not possible. The original issue (#57022) is no longer reproducible even with this patch, so we can remove the suppression. This is in line with our goal to make -Wenum-constexpr-conversion a non-downgradable error, see #59036.

Co-authored-by: Carlos Gálvez <[email protected]>
I'm not a contributor to this fine project, so my influence is understandably small. If the LLVM maintainers value purity over usability and practicality, I respect that. Take the high road. It's your project. It's your passion. You decide your goals. But I don't agree with it.

It sounds like I need to caution my customers who build on RedHat Linux or older Ubuntu LTS that the workaround will stop working soon, and that they should avoid upcoming LLVM compilers unless they are also willing to upgrade or fix other required libs beyond the package-manager versions.
On Tuesday December 03 2024 07:27:27 Carlos Galvez wrote:
> No, this is applied from C++11 (I think), since it's a Defect Report.
That seems like an arbitrary cut-off then?
See previous response: #59036 (comment)
Yes, this was discussed before, but my opinion about it hasn't changed. Maybe it would be different if error reporting has improved, but when I was confronted with a build failure (due to -Werror in my case) it was GCC that allowed me to find and fix the culprit.
We're following the guidance from the standards committee; they thought this issue was sufficiently important to change its behavior as a defect report: https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1766
We're not breaking years-old code; we're diagnosing code that's been broken for years and was hoping the behavior from the compiler was reasonable. Enumerations in C++ are not the same as enumerations in C, in that C++ enumerations use a notional power-of-two integer type as the underlying type of an unscoped enumeration, not `int`. (There is an interesting question there of whether this diagnostic should be tied to ...)
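To illustrate the valid-range rule (a minimal sketch, not from the thread; names are hypothetical): for an unscoped enumeration without a fixed underlying type, the valid range is determined by the smallest bit-field that can hold all enumerators, not by the full range of `int`:

```cpp
enum E { A = 1, B = 3 };  // unscoped, no fixed underlying type

// The smallest bit-field holding all enumerators is 2 bits (unsigned),
// so the valid range of E is [0, 3], even though sizeof(E) may be 4.
constexpr E ok  = static_cast<E>(2);  // fine: 2 is within [0, 3]
constexpr E bad = static_cast<E>(4);  // UB: 4 is outside [0, 3];
                                      // diagnosed by -Wenum-constexpr-conversion
```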
If the changes are sufficiently disruptive, we can certainly walk them back and leave it as an error which can be downgraded to a warning for another release or two. However, there's not been much indication yet that this is significantly disruptive in practice. Are you finding otherwise?
On Tuesday December 03 2024 08:33:44 mborgerding wrote:
> It sounds like I need to caution my customers who build on RedHat Linux or older Ubuntu LTS that the workaround will stop working soon and they should avoid using upcoming LLVM compilers unless they are also willing to upgrade other required libs beyond the package manager.
They will presumably be using older hardware too? If so they'll probably see faster and less resource-hungry builds too by *not* using clang...
Has the upcoming unreleased version been disruptive in practice? Shockingly no. ;) Recent versions of Intel's LLVM-based icpx were slightly disruptive, but not too bad. There was a compiler-flag workaround.
Correct. The unreleased version of the compiler gets nontrivial amounts of testing in practice, so if this broke something significant, we're likely to hear about it before the final RCs get made.
That could be an option. It might, however, open the door to allowing similar things with similar flags, e.g. not erroring on signed integer overflow in constant expressions with `-fwrapv`.
Yeah, the behavior could end up inconsistent in ways that are similarly frustrating -- e.g., a library developer who compiles with the flag disabled (which it is by default) may not get any diagnostics, ship their library, and then a user who does enable the flag consumes the library header and gets errors they can't fix. So in terms of options I see currently, we have:
On Wednesday December 04 2024 00:13:21 Carlos Galvez wrote:
> signed integer overflow in constant expressions with `-fwrapv`.
Doesn't overflow always simply discard the overflowing bits? Or are we still talking about C++ only here too, where apparently "thou shalt not depend on hardware tricks to implement a cheap self-resetting counter"?
Personally I'm happy with 1. I think option 2 is probably a good tradeoff, but ultimately it just delays the problem.
No, this is simply undefined behavior, both in C and C++. Check the following example (sketched below): the compiler will always emit "false" regardless of the input:
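The code behind the godbolt link quoted later isn't preserved in this thread; a minimal sketch of what it demonstrates, reusing the `will_overflow` name that comes up later in the discussion:

```cpp
// Because signed overflow is UB, the compiler may assume x + 1 never
// wraps, so with optimizations enabled the comparison x + 1 < x is
// folded to a constant "false" for every input.
bool will_overflow(int x) {
    return x + 1 < x;
}
```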
On Wednesday December 04 2024 04:43:11 Aaron Ballman wrote:
> 2) don't implement as a DR -- it's still a hard error in Clang 20, but only in C++17 and up. In C++11/14, it would be a warning and users would lose the property of "no UB in a constant expression" in a known case. This would be based on user disruption from the changes. It may be seen as inconsistent by users because it's UB in all C++ language modes.
I know this was discussed before, but wasn't it declared UB only in C++17? That idea must come from somewhere, and clang++ has clearly managed to avoid miscompiling code offending this principle until now. Don't mistake "users" for compiler experts; I think the vast majority won't see an inconsistency when a standard is being followed to the letter (and not more than that).
After all, the whole debate here is about old code that may not even follow C++11.
You missed an option, I think:
4) drop support for obsolete C++ standards (arbitrarily defined as pre-C++17).
That would be perfectly consistent with LLVM's, erm, forward looking approach.
Yes and no. In C++14 and before, it was unspecified behavior rather than undefined behavior. Unspecified behavior means "you'll get a result but it doesn't have to make sense or be consistent", whereas undefined behavior means "you'll get what you get, no promises about anything". Both are indicative of the user's code being invalid. Additionally, there are two ways issues get resolved in WG21: either as a Defect Report (DR) or not. If an issue is not resolved as a DR, then the behavior is expected to change in that release going forward. If an issue is resolved as a DR, then the behavior is expected to change in all language modes where the issue could appear. So this was declared UB in C++17, but as a DR, which means the committee would like us to treat it as though it was always undefined rather than unspecified.
There are natural tensions here. Some people want existing code to work at all costs. Some people don't care at all about existing code if they can prove that code is "broken" somehow. Most folks are somewhere in between those two extremes. WG21 wants a perfect language which never has bugs, users want their existing code to continue to work but are fine with behavioral changes so long as it makes things faster, and implementers have to find ways to work with both sets of desires.
That is an option, but it's a non-starter because there is far too much old code that still needs to be compiled. For example, think about a Linux distribution and how many packages it builds -- many of those are C++98/03 and don't see regular updates, but still need to compile with the same compiler as other packages.
On Wednesday December 04 2024 05:28:01 Carlos Galvez wrote:
> No, this is simply undefined behavior, both in C and C++. Check the following example: the compiler will always emit "false" regardless of the input:
> https://godbolt.org/z/3sjYTb7Y9
Seems to me that it shouldn't do anything else from a mathematical point of view ... I can't think of a finite value of x where `x+1` is not larger than `x` itself. But, C(++) isn't math...
However, that's only true when compiling with optimisation. Clang <= 12 (don't have any newer on the system I'm on) and GCC <= 13 will have that `will_overflow` function return `true` for x = 2147483647. They will also do that *with* optimisation when using `const volatile int y = x + 1; return y < x;`.
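A sketch of that variant (assuming the same `will_overflow` signature as above): the volatile read forces the addition to actually happen at run time, so on typical two's-complement hardware the wrapped value is compared instead of the check being folded away:

```cpp
bool will_overflow(int x) {
    // volatile forces a store and reload, preventing the optimizer from
    // folding y < x to false. The overflow is still UB, but in practice
    // the wrapped value is observed on most hardware.
    const volatile int y = x + 1;
    return y < x;
}
```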
FWIW, shouldn't at least GCC warn about the expression being UB when compiling with -Wsequence-point?
From my limited perspective ... my and my customers' needs would be addressed by the option that differentiates at C++17.
On Wednesday December 04 2024 05:57:16 Aaron Ballman wrote:
> That is an option, but it's a non-starter because there is *far* too much old code that still needs to be compiled. For example, think about a Linux distribution and how many packages it builds -- many of those are C++98/03 and don't see regular updates, but still need to compile with the same compiler as other packages.
It was tongue in cheek of course, but...
Hah, consider how many people are also using older hardware that doesn't run a current version of whatever OS family they use (and how happily LLVM dropped official support for those) ...
How many Linux distributions even use clang as their system compiler? Plus, that code gotta be "fixed" sometime, no? O:^)
(I'm not really up to speed on how C++ standards "work", but I'd presume they're incremental, so most code that properly conforms to standard X will also conform to standard X+1?)
It would help me if I understood your needs better. I know what you want, but I don't know the severity of why you want it. Is there something preventing your customers from fixing their code? Is this impacting 1000s of customers? Something else?
The problem occurs when it's not their code that needs fixing. It's the development libraries (e.g. Boost) their applications build against, which are preferably provided by package-manager versions.
The Linux kernel itself, FreeBSD, Darwin, others...
The way I'd describe it is like this: The C committee works really hard to avoid breaking existing code when updating standards, but it still happens on occasion. The C++ committee knowingly breaks existing code when updating standards, but does so in an opinionated way that aims to limit the blast radius somewhat (e.g., they'll break "bad", "ugly", or "overly clever" code, but they won't break "important" code, where all of these are value judgements).
Apologies if there were details elsewhere that I missed, but the issue is fixed in Boost. So if a user has a package manager which provides Clang 20 (releases mid 2025), wouldn't that package manager also already be providing the Boost (released in mid-2024) with the fix in it?
On Wednesday December 04 2024 06:35:41 Aaron Ballman wrote:
> The Linux kernel itself, FreeBSD, Darwin, others...
Is there even C++ code in the Linux kernel? I did try to build one with clang, but quickly discovered that wasn't a good idea (IIRC it ended up not having support for dynamically loaded extensions).
Red Hat Enterprise Linux 9.5 is the newest. It ships with Boost 1.75; the bug was not fixed until 1.81. I'm ambivalent about which way you go. I appreciate that any pains from a hard error would be limited in scope and time; it would be cleaner in the long run to just "rip off the bandage". But it is still pain, pain that could be avoided with either a workaround or by limiting the hard error to >= C++17. On a related note, does anyone know how to tell upon which version of LLVM a particular Intel icpx is based?
I'm not certain where this discussion landed: https://www.phoronix.com/news/CPP-Linux-Kernel-2024-Discuss
You may want to check out https://www.kernel.org/doc/html/latest/kbuild/llvm.html
Thanks, that's helpful!
Intel's downstream can (and does) carry patches to deviate from the upstream behavior, as with any of the other downstream consumers of Clang. In upstream, we try to pick behaviors that will minimize the need for downstreams to patch, so downstreams still do impact the decisions we make.
On Wednesday December 04 2024 07:56:31 Aaron Ballman wrote:
> You may want to check out https://www.kernel.org/doc/html/latest/kbuild/llvm.html
Thanks, pretty certain I did, but in the end the question is moot on the hardware I use. As I observed elsewhere in this ticket, current/recent GCC compilers are faster and less resource-hungry than even an old clang like 12.0.1 (the newest I currently have on the Linux in question). Given that neither compiler wins across the board in terms of performance of the produced code (not surprisingly), the choice makes itself for me, esp. for large projects (including building the compilers themselves, where GCC is also a clear winner).
I did not easily find anything official, but from fiddling with godbolt,
Related, Jonathan filed this issue in GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117963
This fixes PR 31331: https://sourceware.org/bugzilla/show_bug.cgi?id=31331

Currently, enum-flags.h is suppressing the warning -Wenum-constexpr-conversion coming from recent versions of Clang. This warning is intended to be made a non-downgradable compiler error in future versions of Clang: llvm/llvm-project#59036

The rationale is that casting a value of an integral type into an enumeration is Undefined Behavior if the value does not fit in the range of values of the enum: https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1766 Undefined Behavior is not allowed in constant expressions, leading to an ill-formed program.

In this case, in enum-flags.h, we are casting the value -1 to an enum with a positive-only range, which is UB as per the Standard and thus not allowed in a constexpr context. The purpose of doing this instead of using std::underlying_type is that, for C-style enums, std::underlying_type typically returns "unsigned int". However, when operating with it arithmetically, the enum is promoted to *signed* int, which is what we want to avoid.

This patch solves the issue as follows:

* Use std::underlying_type and remove the custom enum_underlying_type.

* Ensure that operator~ is always called on an unsigned integer. We do this by casting the input enum into std::size_t, which can fit any unsigned integer. We have the guarantee that the cast is safe, because we have checked that the underlying type is unsigned. If the enum had negative values, the underlying type would be signed. This solves the issue with C-style enums, but also solves a hidden issue: enums with an underlying type of std::uint8_t or std::uint16_t are *also* promoted to signed int. Now they are all explicitly cast to the largest unsigned int type, and operator~ is safe to use.

* One more thing needs fixing. Currently, operator~ is implemented as follows:

    return (enum_type) ~underlying(e);

  After applying ~underlying(e), the result is a very large value, which we then cast to "enum_type". This cast is Undefined Behavior if the large value does not fit in the range of the enum. For C++ enums (scoped and/or with explicit underlying type), the range of the enum is the entire range of the underlying type, so the cast is safe. However, for C-style enums, the range is the smallest bit-field that can hold all the values of the enumeration, so the range is a lot smaller, and casting a large value to the enum would invoke Undefined Behavior. To solve this problem, we create a new trait EnumHasFixedUnderlyingType, to ensure operator~ may only be called on C++-style enums. This behavior is roughly the same as what we had on trunk, but relying on different properties of the enums.

* Once this is implemented, the following tests fail to compile:

    CHECK_VALID (true, int, true ? EF () : EF2 ())

  This is because they expect the enums to be promoted to signed int, instead of unsigned int (which is the true underlying type). I propose to remove these tests altogether, because:

  - The comment nearby says they are not very important.
  - Comparing two enums of different types like that is strange, relies on integer promotions, and thus hurts readability. As per comments in the related PR, we likely don't want this type of code in gdb code anyway, so there's no point in testing it.
  - Most importantly, this type of comparison will be ill-formed in C++26 for regular enums, so enum_flags does not need to emulate it.
Since this is the only place where the warning was suppressed, also remove the corresponding macro in include/diagnostics.h. The change has been tested by running the entire gdb test suite (make check) and comparing the results (testsuite/gdb.sum) against trunk. No noticeable differences have been observed.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31331
Tested-by: Keith Seitz <[email protected]>
Approved-By: Tom Tromey <[email protected]>
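A sketch of the approach the commit describes; the names follow the commit message, but the code below is an illustration under C++17, not the verbatim gdb enum-flags.h:

```cpp
#include <cstddef>
#include <type_traits>

// Detect enums with a fixed underlying type (scoped enums, or unscoped
// enums declared with an explicit underlying type). Since C++17, E{0}
// is only well-formed for such enums, which makes it usable for SFINAE.
template <typename E, typename = void>
struct EnumHasFixedUnderlyingType : std::false_type {};

template <typename E>
struct EnumHasFixedUnderlyingType<E, std::void_t<decltype(E{0})>>
    : std::true_type {};

template <typename E,
          typename = std::enable_if_t<std::is_enum_v<E> &&
                                      EnumHasFixedUnderlyingType<E>::value>>
constexpr E operator~(E e)
{
    using U = std::underlying_type_t<E>;
    static_assert(std::is_unsigned_v<U>,
                  "flag enums must have an unsigned underlying type");
    // Complement in std::size_t so that small unsigned types (uint8_t,
    // uint16_t) are not promoted to signed int first, then truncate back
    // to U. The final cast to E is safe because the enum has a fixed
    // underlying type, so its range is the full range of U.
    return static_cast<E>(static_cast<U>(~static_cast<std::size_t>(e)));
}
```

With this shape, an `enum flags : unsigned { a = 1, b = 2 };` gets a well-defined `operator~`, while a plain C-style `enum legacy { x = 1 };` is rejected at compile time instead of invoking UB.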
I am a little confused about how the C++17 DR applies to enums of unfixed underlying type, and how that relates to C interoperability, especially with Win32 APIs. On MSVC, all enums of unfixed type have int as the underlying type. I am not worrying about the MSVC behavior here.

A lot of Win32/DirectX/etc. APIs use enums for constants instead of macros like POSIX does. Take the HeapQueryInformation Win32 API, which uses the HEAP_INFORMATION_CLASS enum. Windows SDK 22621 and Windows SDK 19041 define that enum differently; both definitions are sketched below.
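The verbatim SDK definitions were not preserved in this thread; the following is a hypothetical sketch of the shape being described, where the newer SDK adds an enumerator with a larger value and thereby widens the enum's valid range:

```cpp
// Hypothetical shape of the newer SDK (22621) definition: the highest
// enumerator is 3, so the valid range is [0, 3].
typedef enum _HEAP_INFORMATION_CLASS {
    HeapCompatibilityInformation = 0,
    HeapEnableTerminationOnCorruption = 1,
    HeapOptimizeResources = 3
} HEAP_INFORMATION_CLASS;

// Hypothetical shape of the older SDK (19041) definition: the highest
// enumerator is 1, so the valid range is only [0, 1].
typedef enum _HEAP_INFORMATION_CLASS {
    HeapCompatibilityInformation = 0,
    HeapEnableTerminationOnCorruption = 1
} HEAP_INFORMATION_CLASS;
```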
Having code such as the cast shown below now fails to compile, which I understand the rationale for, given the DR in C++17. However, from my understanding, C allows any value that is representable by the underlying type of an unfixed enum.
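A hypothetical reconstruction of the failing code (the original snippet was not preserved; the value 3 is only inside the valid range under the newer SDK's definition sketched above):

```cpp
// Compiled against the older SDK, whose valid range is [0, 1], casting 3
// is outside the range: in a constant expression Clang now rejects it
// under -Wenum-constexpr-conversion.
constexpr HEAP_INFORMATION_CLASS info =
    static_cast<HEAP_INFORMATION_CLASS>(3);
```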
If my understanding is correct, one of the following would have to happen:
In my toy example above I could just check ... I need to think a bit more about MSVC compat in clang here. Maybe we tie this to ...
The `-Wenum-constexpr-conversion` warning was created to account for the fact that casting integers to enums outside of the valid range of the enum is UB in C++17. Constant expressions invoking UB lead to an ill-formed program.

The current behavior of Clang is that it does warn and it does error, but the message is nevertheless a warning that the user can easily suppress via `-Wno-enum-constexpr-conversion`.

Since the program is ill-formed, I don't think this should be a warning that can be disabled, just like one can't disable the diagnostic in this ill-formed code:
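The original snippet is not preserved here; a representative example of ill-formed constexpr code whose diagnostic cannot be suppressed might be:

```cpp
constexpr int deref_null() {
    int *p = nullptr;
    return *p;  // UB when evaluated in a constant expression
}

// Hard error, not a suppressible warning: "constexpr variable 'x' must
// be initialized by a constant expression".
constexpr int x = deref_null();
```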
Therefore I propose to make the diagnostic a hard error that cannot be disabled. Additionally, this "hard error" behavior should only be active in C++17 and later (it's currently active for any standard).
The warning could be repurposed as a general warning that one can enable, active not only in `constexpr` expressions but in any expression that involves casting an integer to an enum.
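For illustration (a sketch, not from the issue), such a general warning would also flag non-constexpr casts like this one, which are equally UB today but compile silently:

```cpp
enum Color { Red, Green, Blue };  // valid range [0, 3]

Color from_int(int i) {
    // UB whenever i is outside [0, 3]. Outside a constant expression this
    // is not diagnosed today; the repurposed warning would flag it.
    return static_cast<Color>(i);
}
```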