RangeFrom::max should return None #47169

varkor · 2018-01-04T00:36:15Z

Iterator::max states "If the iterator is empty, None is returned.". In the case of an unbounded range, we know that this maximum does not exist. This means RangeFrom::max is consigned to looping forever.

If the documentation for Iterator::max were rewritten: "If the iterator is empty, or if the maximum element is unbounded, None is returned.", this could then be implemented as special cases for unbounded iterators (e.g. RangeFrom), which is far more useful and intuitive.

The same complaint goes for Iterator::last.

The text was updated successfully, but these errors were encountered:

hanna-kruppe · 2018-01-04T01:50:49Z

What does "the maximum element is unbounded" mean exactly? The only interpretation I can find that applies to FromRange is that it never returns None from next. Note however that it eventually panics when overflow checks are enabled, so it is not really an infinite iterator.

Going with that interpretation, it's true that max or last on such an iterator is not very useful, but changing the definition of the methods is problematic for several reasons:

It's not actually possible to detect all "unbounded iterators", therefore at best we could offer this for a few select iterators.
Accordingly, the docs would give a confusing and (rightly) not-very-reassuring guarantee like "If the iterator never returns None, this method may return None".
Code might (rightfully given the current docs) assume that a None means an empty iterator. A behavioral change like this, even if it "just" removes a panic or non-termination, is not to be taken lightly.

varkor · 2018-01-04T11:21:31Z

I mean that, for any element x returned by the iterator, there is guaranteed to be an element y > x that is also returned (or it panics), which in this case is equivalent to never returning None as the range is monotonically increasing.

It does cause a behaviour change, but I can't imagine any users are relying on the termination behaviour of an unbounded iterator like this (especially considering the overflow checks are going to be disabled in release mode). Semantically, I think it's more intuitive to think of Iterator::max returning None if there is no maximum element (which is true of an empty iterator, and an infinitely increasing one), rather than if the iterator is empty (we have other methods for that).

The fact that we could only provide this behaviour for some iterators, and others would be consigned to their infinite loops, does decrease the usefulness of such a change, but I think it is still more useful than the current behaviour.

hanna-kruppe · 2018-01-04T11:38:58Z

I mean that, for any element x returned by the iterator, there is guaranteed to be an element y > x that is also returned (or it panics), which in this case is equivalent to never returning None as the range is monotonically increasing.

This definition does not match RangeFrom when overflow checks are disabled.

Semantically, I think it's more intuitive to think of Iterator::max returning None if there is no maximum element (which is true of an empty iterator, and an infinitely increasing one), rather than if the iterator is empty (we have other methods for that).

But that is not what's documented, and not what the default implementation does, and not something the default implementation can do.

I also don't buy that None for empty iterators is not intuitive. The maximum of an empty set/sequence indeed does not exist.

The fact that we could only provide this behaviour for some iterators, and others would be consigned to their infinite loops, does decrease the usefulness of such a change, but I think it is still more useful than the current behaviour.

It's not just that this change will fail to achieve its intended effect for some iterators. It will muddy the definition of max for all iterators. People not reading the docs may come to rely on it and write code that gets bitten by more complex iterators for which the "shortcut" is not implemented. All to address a mistake that is annoying, yes, but also trivial to find with a debugger.

I'm actually not even sure if returning None is better than looping infinitely -- it won't hang the program, but code that does anything other than .max().unwrap() will now silently carry on execution (possibly in the belief that the iterator is empty -- since the iterator was consumed, you can't even check afterwards) despite the fact that calling max on that iterator was likely a logic bug. That doesn't seem great for program reliability.

varkor · 2018-01-04T12:21:12Z

This definition does not match RangeFrom when overflow checks are disabled.

That's true. More special casing here would be possible, but probably not desirable. The issue is the distinction between what someone expects RangeFrom to mean — at the moment it is very much defined according to implementation details, not mathematically. The most consistent choice is probably the best one, which in this case is sticking with the existing implementation.

I also don't buy that None for empty iterators is not intuitive. The maximum of an empty set/sequence indeed does not exist.

I agree that None is intuitive for empty iterators. I think it's not the only case in which it's intuitive, though.

It's not just that this change will fail to achieve its intended effect for some iterators. It will muddy the definition of max for all iterators. People not reading the docs may come to rely on it and write code that gets bitten by more complex iterators for which the "shortcut" is not implemented.

That's a compelling argument not to change. Without a way to enforce it, this is just going to end up being confusing. Perhaps there's an opportunity for more sensible behaviour with UnboundedIterator, which could return None by default instead.

oberien · 2018-01-04T15:07:13Z

With every iterator function I assume that None indicates that the iterator is empty. If .max() returned None, I'd assume the iterator to be empty instead of unbounded.
I do see that RangeFrom does not have a theoretical max value (in practice it has in most cases), so None might be a reasonable result from call .max().
But this would completely change the semantic of the max function. With this addition its definition would change from "get the maximum element or None if the iterator is empty" to "return None if the iterator is empty or does not have a maximum element, otherwise return the maximum".
Unfortunately, it's not easily possible to verify the "does not have a maximum element" property. Even for RangeFrom one could implement Step for a newtype to cyclic iterate over the values 0 through 10. That would allow a RangeFrom cycling over those values:

#![feature(step_trait)]
use std::iter::Step;
use std::ops::RangeFrom;

#[derive(Clone, PartialEq, PartialOrd, Debug)]
struct Foo(u8);

impl Step for Foo {
    // dummies
    fn steps_between(_: &Foo, _: &Foo) -> Option<usize> { unimplemented!() }
    fn replace_one(&mut self) -> Foo { unimplemented!() }
    fn replace_zero(&mut self) -> Foo { unimplemented!() }
    fn sub_one(&self) -> Foo { unimplemented!() }
    fn add_usize(&self, _: usize) -> Option<Foo> { unimplemented!() }
    
    // required for this example
    fn add_one(&self) -> Foo {
        if self.0 >= 10 {
            Foo(0)
        } else {
            Foo(self.0 + 1)
        }
    }
}

fn main() {
    let range = RangeFrom { start: Foo(0) };
    for i in range.take(20) {
        println!("{:?}", i);
    }
}

playpen

This RangeFrom does indeed have a maximum element, which is 10. Returning None from its max function would be highly unexpected. I do agree, that not returning at all is unexpected as well.

If we discuss specializing max for RangeFrom, we should also consider other consumers like min, gt, lt and similar. Additionally, we should also specialize Repeat::max, which is in fact actually defined (the repeated element itself is its own minimum and maximum).

#47082 does not help this issue, e.g. because Repeat. This issue requires a similar approach (i.e. introduce a marker trait for RangeFrom and any valid combinator on it) and specialize .max() on those.
In general I think that calling any consumer on any unbounded iterator is most likely indicating a bug in the code and not assuming some special behaviour. That is why I suggested producing compiler warnings on any consumer call on unbounded iterator in the PR.

varkor · 2018-01-12T11:45:45Z

I agree that the issues with this outweigh any benefits. Closing accordingly — thanks for your detailed contributions @rkruppe and @oberien!

hanna-kruppe mentioned this issue Jan 12, 2018

Add iterator specialisations for Repeat and Cycle #47370

Closed

varkor closed this as completed Jan 12, 2018

oberien mentioned this issue Jan 15, 2018

Add UnboundedIterator Trait #47082

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RangeFrom::max should return None #47169

RangeFrom::max should return None #47169

varkor commented Jan 4, 2018 •

edited

Loading

hanna-kruppe commented Jan 4, 2018

varkor commented Jan 4, 2018 •

edited

Loading

hanna-kruppe commented Jan 4, 2018

varkor commented Jan 4, 2018

oberien commented Jan 4, 2018

varkor commented Jan 12, 2018

RangeFrom::max should return None #47169

RangeFrom::max should return None #47169

Comments

varkor commented Jan 4, 2018 • edited Loading

hanna-kruppe commented Jan 4, 2018

varkor commented Jan 4, 2018 • edited Loading

hanna-kruppe commented Jan 4, 2018

varkor commented Jan 4, 2018

oberien commented Jan 4, 2018

varkor commented Jan 12, 2018

varkor commented Jan 4, 2018 •

edited

Loading

varkor commented Jan 4, 2018 •

edited

Loading