Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RangeFrom::max should return None #47169

Closed
varkor opened this issue Jan 4, 2018 · 6 comments
Closed

RangeFrom::max should return None #47169

varkor opened this issue Jan 4, 2018 · 6 comments

Comments

@varkor
Copy link
Member

varkor commented Jan 4, 2018

Iterator::max states "If the iterator is empty, None is returned.". In the case of an unbounded range, we know that this maximum does not exist. This means RangeFrom::max is consigned to looping forever.

If the documentation for Iterator::max were rewritten: "If the iterator is empty, or if the maximum element is unbounded, None is returned.", this could then be implemented as special cases for unbounded iterators (e.g. RangeFrom), which is far more useful and intuitive.

The same complaint goes for Iterator::last.

@hanna-kruppe
Copy link
Contributor

What does "the maximum element is unbounded" mean exactly? The only interpretation I can find that applies to FromRange is that it never returns None from next. Note however that it eventually panics when overflow checks are enabled, so it is not really an infinite iterator.

Going with that interpretation, it's true that max or last on such an iterator is not very useful, but changing the definition of the methods is problematic for several reasons:

  • It's not actually possible to detect all "unbounded iterators", therefore at best we could offer this for a few select iterators.
  • Accordingly, the docs would give a confusing and (rightly) not-very-reassuring guarantee like "If the iterator never returns None, this method may return None".
  • Code might (rightfully given the current docs) assume that a None means an empty iterator. A behavioral change like this, even if it "just" removes a panic or non-termination, is not to be taken lightly.

@varkor
Copy link
Member Author

varkor commented Jan 4, 2018

I mean that, for any element x returned by the iterator, there is guaranteed to be an element y > x that is also returned (or it panics), which in this case is equivalent to never returning None as the range is monotonically increasing.

It does cause a behaviour change, but I can't imagine any users are relying on the termination behaviour of an unbounded iterator like this (especially considering the overflow checks are going to be disabled in release mode). Semantically, I think it's more intuitive to think of Iterator::max returning None if there is no maximum element (which is true of an empty iterator, and an infinitely increasing one), rather than if the iterator is empty (we have other methods for that).

The fact that we could only provide this behaviour for some iterators, and others would be consigned to their infinite loops, does decrease the usefulness of such a change, but I think it is still more useful than the current behaviour.

@hanna-kruppe
Copy link
Contributor

I mean that, for any element x returned by the iterator, there is guaranteed to be an element y > x that is also returned (or it panics), which in this case is equivalent to never returning None as the range is monotonically increasing.

This definition does not match RangeFrom when overflow checks are disabled.

Semantically, I think it's more intuitive to think of Iterator::max returning None if there is no maximum element (which is true of an empty iterator, and an infinitely increasing one), rather than if the iterator is empty (we have other methods for that).

But that is not what's documented, and not what the default implementation does, and not something the default implementation can do.

I also don't buy that None for empty iterators is not intuitive. The maximum of an empty set/sequence indeed does not exist.

The fact that we could only provide this behaviour for some iterators, and others would be consigned to their infinite loops, does decrease the usefulness of such a change, but I think it is still more useful than the current behaviour.

It's not just that this change will fail to achieve its intended effect for some iterators. It will muddy the definition of max for all iterators. People not reading the docs may come to rely on it and write code that gets bitten by more complex iterators for which the "shortcut" is not implemented. All to address a mistake that is annoying, yes, but also trivial to find with a debugger.

I'm actually not even sure if returning None is better than looping infinitely -- it won't hang the program, but code that does anything other than .max().unwrap() will now silently carry on execution (possibly in the belief that the iterator is empty -- since the iterator was consumed, you can't even check afterwards) despite the fact that calling max on that iterator was likely a logic bug. That doesn't seem great for program reliability.

@varkor
Copy link
Member Author

varkor commented Jan 4, 2018

This definition does not match RangeFrom when overflow checks are disabled.

That's true. More special casing here would be possible, but probably not desirable. The issue is the distinction between what someone expects RangeFrom to mean — at the moment it is very much defined according to implementation details, not mathematically. The most consistent choice is probably the best one, which in this case is sticking with the existing implementation.

I also don't buy that None for empty iterators is not intuitive. The maximum of an empty set/sequence indeed does not exist.

I agree that None is intuitive for empty iterators. I think it's not the only case in which it's intuitive, though.

It's not just that this change will fail to achieve its intended effect for some iterators. It will muddy the definition of max for all iterators. People not reading the docs may come to rely on it and write code that gets bitten by more complex iterators for which the "shortcut" is not implemented.

That's a compelling argument not to change. Without a way to enforce it, this is just going to end up being confusing. Perhaps there's an opportunity for more sensible behaviour with UnboundedIterator, which could return None by default instead.

@oberien
Copy link
Contributor

oberien commented Jan 4, 2018

With every iterator function I assume that None indicates that the iterator is empty. If .max() returned None, I'd assume the iterator to be empty instead of unbounded.
I do see that RangeFrom does not have a theoretical max value (in practice it has in most cases), so None might be a reasonable result from call .max().
But this would completely change the semantic of the max function. With this addition its definition would change from "get the maximum element or None if the iterator is empty" to "return None if the iterator is empty or does not have a maximum element, otherwise return the maximum".
Unfortunately, it's not easily possible to verify the "does not have a maximum element" property. Even for RangeFrom one could implement Step for a newtype to cyclic iterate over the values 0 through 10. That would allow a RangeFrom cycling over those values:

#![feature(step_trait)]
use std::iter::Step;
use std::ops::RangeFrom;

#[derive(Clone, PartialEq, PartialOrd, Debug)]
struct Foo(u8);

impl Step for Foo {
    // dummies
    fn steps_between(_: &Foo, _: &Foo) -> Option<usize> { unimplemented!() }
    fn replace_one(&mut self) -> Foo { unimplemented!() }
    fn replace_zero(&mut self) -> Foo { unimplemented!() }
    fn sub_one(&self) -> Foo { unimplemented!() }
    fn add_usize(&self, _: usize) -> Option<Foo> { unimplemented!() }
    
    // required for this example
    fn add_one(&self) -> Foo {
        if self.0 >= 10 {
            Foo(0)
        } else {
            Foo(self.0 + 1)
        }
    }
}

fn main() {
    let range = RangeFrom { start: Foo(0) };
    for i in range.take(20) {
        println!("{:?}", i);
    }
}

playpen

This RangeFrom does indeed have a maximum element, which is 10. Returning None from its max function would be highly unexpected. I do agree, that not returning at all is unexpected as well.

If we discuss specializing max for RangeFrom, we should also consider other consumers like min, gt, lt and similar. Additionally, we should also specialize Repeat::max, which is in fact actually defined (the repeated element itself is its own minimum and maximum).

#47082 does not help this issue, e.g. because Repeat. This issue requires a similar approach (i.e. introduce a marker trait for RangeFrom and any valid combinator on it) and specialize .max() on those.
In general I think that calling any consumer on any unbounded iterator is most likely indicating a bug in the code and not assuming some special behaviour. That is why I suggested producing compiler warnings on any consumer call on unbounded iterator in the PR.

@varkor
Copy link
Member Author

varkor commented Jan 12, 2018

I agree that the issues with this outweigh any benefits. Closing accordingly — thanks for your detailed contributions @rkruppe and @oberien!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants