Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for cmp::minmax{_by,_by_key} #115939

Open
1 of 3 tasks
WaffleLapkin opened this issue Sep 18, 2023 · 17 comments
Open
1 of 3 tasks

Tracking Issue for cmp::minmax{_by,_by_key} #115939

WaffleLapkin opened this issue Sep 18, 2023 · 17 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@WaffleLapkin
Copy link
Member

WaffleLapkin commented Sep 18, 2023

Feature gate: #![feature(cmp_minmax)]

This is a tracking issue for core::cmp::minmax{_by,_by_key}.

Those functions take two values and return an array of [min, max]. This is essentially a sort specialized for two arguments.

Public API

// core::cmp

pub fn minmax<T>(v1: T, v2: T) -> [T; 2]
where
    T: Ord;

pub fn minmax_by<T, F>(v1: T, v2: T, compare: F) -> [T; 2]
where
    F: FnOnce(&T, &T) -> Ordering;

pub fn minmax_by_key<T, F, K>(v1: T, v2: T, f: F) -> [T; 2]
where
    F: FnMut(&T) -> K,
    K: Ord;

Steps / History

Unresolved Questions

  • None yet.

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@WaffleLapkin WaffleLapkin added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Sep 18, 2023
@scottmcm
Copy link
Member

Big fan. min/max always discard one of the arguments, so they're often useless with owned non-Copy things. This API fixes that by returning both.

@Jules-Bertholet
Copy link
Contributor

Looks like minmax_by_key would have the same issue as #34162?

@WaffleLapkin
Copy link
Member Author

@Jules-Bertholet yes, of course. and the fix would be the same (either using HKT, which is unergonomic in rust, or having a separate function for F: FnMut(&T) -> &impl ?Sized + Ord, or adding a bound on function return, such that we are adding a bound on something that is inside the binder)

@timvermeulen
Copy link
Contributor

I knew this looked familiar 🙂 #73857
Obviously a +1 from me, this concept is useful surprisingly often. A less practical reason why I like this is that it's the only min/max-like functionality that would work with hypothetical undroppable types.

@anko
Copy link
Contributor

anko commented Dec 23, 2023

Concern: Why -> [T; 2] over -> (T, T)?

I can't find precedent in std for either, so this may become the precedent that future functions returning "a pair of Ts" would want to be consistent with.


Informal survey

  • Searching for occurrences of the alternatives in this repo:

    • "-> [T; 2]"1 file: the current implementations of these functions.
    • "-> (T, T)"19 files. They are all in tests; not exposed API. Only one appears to be a test for tuples specifically; the rest are "spontaneously arising".
  • Searching for occurrences of the alternatives everywhere on Github, with Rust-only filter:

  • I also searched for "fn min_max(". In the 5 pages of results that Github will show me, I saw many implementations of similar functions returning a tuple of 2 items. I saw no implementations returning an array of length 2.

  • @timvermeulen's earlier similar PR (Add comparison functions that return both the min and the max value #73857) also used -> (T, T).


I believe these results indicate -> (T, T) is preferred, especially in this context.

By the principle of least surprise, I propose -> (T, T).

@scottmcm
Copy link
Member

scottmcm commented Dec 23, 2023

It used to be that the tuple was far better because it worked in patterns.

Now that let [a, b] = cmp::minmax(x, y); works, though, using the array is more reasonable than it used to be.

@WaffleLapkin
Copy link
Member Author

@anko as @scottmcm said, a lot of this is probably historic reasons — addition of matching / destructuring arrays is relatively new (checks notes, apparently it was stabilized in Rust 1.42, not that new).

I prefer and have used an array in those signatures, because I think it's nicer and better shows the intent.

It's nicer in the sense that arrays are more powerful than tuples — since they are homogenous they allow for more operations. Even though I think the result of this expression will almost always be destructured immediately, it's still nicer.

It better shows the intent in the sense that the methods return not just a pair of two values, but an ordered array. Also it highlightes the similarity to the sort.

Ultimately I don't think it matters much, but IMO the array is better here.

(ps. arrays and tuples also implement into/from)

@Ralith
Copy link
Contributor

Ralith commented Jan 13, 2024

Looking forward to having this! Wanted it today when writing an algorithm that identifies the smaller of two values and walks it towards the greater.

@nmay231
Copy link

nmay231 commented Jan 19, 2024

If the issue with #34162 takes longer than expected to resolve, could minmax (and potentially minmax_by since it's not affected) be stabilized before minmax_by_key is?

(Tbh, I'm not familiar with the requirements of stabilization)

@WaffleLapkin
Copy link
Member Author

@nmay231 that is possible, we do partial stabilization all the time.

@notme4
Copy link

notme4 commented Feb 8, 2024

It better shows the intent in the sense that the methods return not just a pair of two values, but an ordered array.

I disagree, an array is a (possibly unordered) collection of a single kind of thing, whereas a tuple is an ordered collection of (possibly different kinds of) things, so a tuple shows intent better. Also min and max are opposites so they are arguably not even really the same kind of thing, making an array an even weaker option.

That being said the ordering of min and max is arbitrary and they have clear names, so returning a struct could be even clearer. The problem with that being then you have a struct cluttering the namespace.

@bend-n
Copy link
Contributor

bend-n commented May 13, 2024

@notme4 how are tuples any more ordered than arrays?

@George-Ogden
Copy link

I think there's a bug/surprising feature in cmp::minmax:
It requires Ord trait but uses the <= operator, which uses the PartialOrd::partial_cmp method, instead of the Ord::cmp method. This is perhaps an issue with the <= operator, but it makes for some surprising results.

@GoldsteinE
Copy link
Contributor

Ord disagreeing with PartialOrd is a logic error. There should be no surprising results for correct implementations.

@George-Ogden
Copy link

George-Ogden commented Jan 29, 2025

This assumes that everyone's implementation is always correct. If Ord and PartialOrd were to disagree by mistake, cmp::minmax causing an error would encourage you to check the Ord trait, not the PartialOrd trait. And the #derive'd implementation of PartialOrd doesn't default to using Ord when available, which only makes this issue more prevalent. (I raise this because this is an error I made, and I had to spend time debugging it but was looking in the wrong place.)

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Jan 29, 2025

Lots of APIs require an Ord bound but use the comparison operators internally because whatever they’re doing doesn’t make sense for partial orders, but it’s more convenient or more efficient to write them in terms of a particular binary comparison. This is not a minmax specific issue.

Yes, debugging cases where the traits are implemented inconsistently is a pain. But it’s better to catch that at the source with lints like clippy::derive_ord_xor_partial_ord and clippy::derived_hash_with_manual_eq than to complicate all code that tacitly relies on the contract of these traits. Those lints are already deny by default in clippy but not everyone uses clippy. Perhaps uplifting them to rustc is a good idea.

@scottmcm
Copy link
Member

scottmcm commented Jan 29, 2025

TBH, it's arguably the case that the default Ord::min not using PartialOrd is a perf bug. That's what cmp::min used to do https://doc.rust-lang.org/1.15.0/src/core/cmp.rs.html#635-637.

See bugs like #106107, or the current codegen tests that show that le on a tuple is better than cmp().is_le():

//
// These ones are harder, since there are more intermediate values to remove.
//
// `<` seems to be getting lucky right now, so test that doesn't regress.
//
// The others, however, aren't managing to optimize away the extra `select`s yet.
// See <https://github.com/rust-lang/rust/issues/106107> for more about this.
//
// CHECK-LABEL: @check_lt_via_cmp
// CHECK-SAME: (i16 noundef %[[A0:.+]], i16 noundef %[[A1:.+]], i16 noundef %[[B0:.+]], i16 noundef %[[B1:.+]])
#[no_mangle]
pub fn check_lt_via_cmp(a: TwoTuple, b: TwoTuple) -> bool {
// CHECK-DAG: %[[EQ:.+]] = icmp eq i16 %[[A0]], %[[B0]]
// CHECK-DAG: %[[CMP0:.+]] = icmp slt i16 %[[A0]], %[[B0]]
// CHECK-DAG: %[[CMP1:.+]] = icmp ult i16 %[[A1]], %[[B1]]
// CHECK: %[[R:.+]] = select i1 %[[EQ]], i1 %[[CMP1]], i1 %[[CMP0]]
// CHECK: ret i1 %[[R]]
Ord::cmp(&a, &b).is_lt()
}
// CHECK-LABEL: @check_le_via_cmp
// CHECK-SAME: (i16 noundef %[[A0:.+]], i16 noundef %[[A1:.+]], i16 noundef %[[B0:.+]], i16 noundef %[[B1:.+]])
#[no_mangle]
pub fn check_le_via_cmp(a: TwoTuple, b: TwoTuple) -> bool {
// FIXME-CHECK-DAG: %[[EQ:.+]] = icmp eq i16 %[[A0]], %[[B0]]
// FIXME-CHECK-DAG: %[[CMP0:.+]] = icmp sle i16 %[[A0]], %[[B0]]
// FIXME-CHECK-DAG: %[[CMP1:.+]] = icmp ule i16 %[[A1]], %[[B1]]
// FIXME-CHECK: %[[R:.+]] = select i1 %[[EQ]], i1 %[[CMP1]], i1 %[[CMP0]]
// FIXME-CHECK: ret i1 %[[R]]
Ord::cmp(&a, &b).is_le()
}

If the code only needs a two-way check, using a three-way check and trying to optimize that down is tricky, and empyrically often doesn't optimize as well.

EDIT a few weeks later: ironically, now that LLVM20 landed things are working better #137197 -- dunno if that's true in general, though.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 1, 2025
Implement all mix/max functions in a (hopefully) more optimization amendable way

Previously the graph was like this:

```
min -> Ord::min -> min_by -> match on compare() (in these cases compare = Ord::cmp)
                                      ^
                                      |
                                 min_by_key
```
now it looks like this:
```
min -> Ord::min -> `<=` <- min_by_key

min_by -> `Ordering::is_le` of `compare()`
```
(`max*` and `minmax*` are the exact same, i.e. they also use `<=` and `is_le`)

I'm not sure how to test this, but it should probably be easier for the backend to optimize.

r? `@scottmcm`
cc rust-lang#115939 (comment)
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 1, 2025
Rollup merge of rust-lang#136307 - WaffleLapkin:minminmin, r=scottmcm

Implement all mix/max functions in a (hopefully) more optimization amendable way

Previously the graph was like this:

```
min -> Ord::min -> min_by -> match on compare() (in these cases compare = Ord::cmp)
                                      ^
                                      |
                                 min_by_key
```
now it looks like this:
```
min -> Ord::min -> `<=` <- min_by_key

min_by -> `Ordering::is_le` of `compare()`
```
(`max*` and `minmax*` are the exact same, i.e. they also use `<=` and `is_le`)

I'm not sure how to test this, but it should probably be easier for the backend to optimize.

r? `@scottmcm`
cc rust-lang#115939 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests