RFC: automatically implement equality and ordering using Eq and Ord #3195

davidhewitt · 2023-05-31T14:41:35Z

I had a sudden insight today that we can automatically generate implementations of equality and ordering based on the presence of Eq and Ord implementations for #[pyclass]. (Note that Ord: Eq.) The implementation selects for these traits using autoderef specialization.

I think this could be very convenient, but it has the potential to be breaking because existing types will suddenly have new behaviour in Python. If we went with this automatic implementation, we could land it in 0.20 with appropriate documentation.

I think at the very least we should have #[pyclass(eq = false)] and #[pyclass(order = false)] opt-outs (names matching the same functionality in Python @dataclass, although note the defaults here are not True and False respectively but instead based on whether T implements Eq or Ord).

Design discussions:

Alternative would be to always default to false, and require eq = true (and order = true) to opt-in to adding these implementations. This would be non-breaking but I think is not as nice for future usage.
- Maybe we can use this as an opt-in over a deprecation period?
What to do with interactions around subclasses and extensions? I am certain the default implementations I have currently written do the wrong thing for #[pyclass(extends = ...)] and also if T is subclassed.
Also there may be an interaction between this and the implementation for Support __eq__ for pyclass #2089. Maybe it's better to land that first?

davidhewitt · 2023-05-31T15:16:02Z

If we like this, I think we can do the same thing with Hash (only for frozen classes by default). Potentially also Display and Debug could be used for str and repr (though maybe that's less desirable).

adamreichold · 2023-05-31T15:51:48Z

Also there may be an interaction between this and the implementation for #2089. Maybe it's better to land that first?

I think we could land this first if it is easier to implement, but we should make sure that we do not block manually implementing __eq__ before committing to an API.

Reading a comment from there and

Alternative would be to always default to false, and require eq = true (and order = true) to opt-in to adding these implementations. This would be non-breaking but I think is not as nice for future usage.

I would actually prefer if this worked like #[derive(SomeTrait)], i.e. it is purely opt-in using something like #[pyclass(eq = true)] to follow the principle of least surprise. (I also feel that this is functionality would be more Rusty than Pythonic being based on trait implementations instead of implementing magic methods and hence think it would be sensible to align this with the Rust-side precedence.)

If we like this, I think we can do the same thing with Hash (only for frozen classes by default). Potentially also Display and Debug could be used for str and repr (though maybe that's less desirable).

As a #[derive(..)]-like opt-in mechanism, I would support making the support as broad as possible.

I am not sure though that the concerns w.r.t. frozen are specific to Hash though: Would ordering-based containers have exactly the same issue with Ord and pervasive resp. interior mutability? Or this is mainly because dict is so common on the Python side? Additionally, with opt-in, I don't think the additional interaction with frozen would be necessary?

adamreichold · 2023-05-31T16:06:00Z

As my role as local resident autoref specialization sceptic also seems to be a given, I'd also like to add that doing this opt-in would probably make that unnecessary as well? I would see that as a benefit because while autoref specialization is an ingenious trick, it is also a somewhat indirect/implicit method of realizing specialization that does make the code harder to maintain IMHO.

davidhewitt · 2023-05-31T17:08:03Z

As my role as local resident autoref specialization sceptic also seems to be a given, I'd also like to add that doing this opt-in would probably make that unnecessary as well? I would see that as a benefit because while autoref specialization is an ingenious trick, it is also a somewhat indirect/implicit method of realizing specialization that does make the code harder to maintain IMHO.

I think I see both sides of the coin. I agree that opt-in feels more Rusty and avoids the compile overhead and maintenance complexity of the specialization.

On the other hand, I think there's quite a strong argument that the user has already opted in by deriving or implementing these traits for a #[pyclass], and more likely than not it's a bug if they then forget to also add #[pyclass(eq = true)].

I think for me it may hinge what the solution for inheritance looks like. If we can only safely implement these methods in absence of insurance then the rules get very complicated for automatic implementation. If the inheritance cases can be cleanly handled by our automatic implementation then I think the bet that doing this automatically helps the user avoid the bug of forgetting to opt in is a strong motivator to me.

I am not sure though that the concerns w.r.t. frozen as specific to Hash though: Would ordering-based containers have exactly the same issue with Ord and pervasive resp. interior mutability?

Great question. I suppose ordering-based containers in Rust have the same problem with interior mutability? So perhaps automatic Ord is also only appropriate for frozen classes.

adamreichold · 2023-05-31T17:36:19Z

src/impl_/pyclass.rs

+        Some(CompareOp::Eq) => Ok(PyBool::new(py, slf.eq(&other)).into_ptr()),
+        Some(CompareOp::Ne) => Ok(PyBool::new(py, slf.ne(&other)).into_ptr()),
+        Some(_) => Ok(py.NotImplemented().into_ptr()),
+        None => unreachable!("invalid compareop"),


Maybe just CompareOp::from_raw(op).expect("invalid compareop")?

lifthrasiir · 2023-06-01T08:36:49Z

To me it mostly boils down to whether generated Rust and Python types should behave same (to the reasonable extent) or not. This is a philosophical question and I have no opinion nor strong preference about how it should be, so I won't assume it as a fact.

If Rust and Python types should act identically, I think not only this PR should be merged, but also __richcmp__ should never be user-implementable in principle. Maintaining two versions of comparison logic is not ideal, even when one refers to another. In practice __richcmp__ might be still needed because of subtle differences, but it should be regarded as an escape hatch, not a typical use case.

OTOH if they may behave differently, I don't think this PR is a good idea. It is still good to have a shortcut for deriving a Python __richcmp__ method from existing Rust impls, but that should be explicit, so that the Rust code clearly shows indicates whether the Python type is indeed comparable. There may be multiple possible designs:

If #[pyclass] and #[derive(PartialOrd)] are used together (for example), I do think the intent is clear and __richcmp__ should be automatically derived as well.

Pros: No manual intervention with the least breakage.

Cons: This is very sensitive to the attribute order, and cannot handle separate impl PartialOrd for MyClass. This design might be combined with other partial designs.
Extend #[pymethods] so that it may be applied to impl PartialOrd for MyClass as well. Internally this would define something like impl PyPartialOrd for MyClass {}, which can be separately collected without inventory.

Pros: Mostly consistent with the current role of #[pymethods].

Cons: This doesn't work well with #[derive(PartialOrd)]. The earlier design may be used to fix this, but a single comprehensive design would be more desirable.
Add a derive macro #[derive(__richcmp__)] or similar.

Pros: Covers all cases. Results in an idiomatic Rust code.

Cons: Not sure if we can safely use this name. We also have to make sure that this is only usable for PyClass. The distance between #[derive(__richcmp__)] and #[pymethods] is also a bit suboptimal.
Similar, but put derived methods to the #[pymethods] attribute (as opposed to #[pyclass]). The actual syntax is up to debate (e.g. #[pymethods(__richcmp__, ...)], #[pymethods(derive = (__richcmp__, ...))], #[pymethods] #[derive(__richcmp__)] etc).

Pros: Covers all cases. Consistent with the current role of #[pymethods].

Cons: If multiple #[pymethods] exist, only one of them should contain such attributes which might not be clear enough.
Allow a stub method with only a partial signature and no method body (e.g. fn __richcmp__();). #[pymethods] will detect such stub methods and generate them.

Pros: Most flexible, as we may specify as many information as we want. It does look like normal functions.

Cons: Not sure if it's actually syntactically allowed before the expansion. (If this is the case, we may use a macro item as a workaround.) The exact syntax is a whole bikeshed to paint.

davidhewitt · 2023-06-01T10:55:33Z

This is an interesting question. Overall I have had the philosophy that #[pyclass] should match the Python class as much as possible, because this is what users expect. Where there are differences I think we should take care to document them.

I am mostly in favour of not making __richcmp__ directly implementable; I think the only argument to that would be whether any "split-out" implementation from #2089 has any performance difference to a direct __richcmp__. We will have to see about that in the future.

On reflection about whether this automatic implementation should be merged - I think PyO3 may not be ready for this yet, though I would like us to still keep open about it in the future. I think merging the opt-in forms eq = true, hash = true etc would be a nice end-user feature which could be deprecated in the future if we wanted to make automatic implementations later.

I think the clear next step is to complete #2089, as that's a clearly desirable addition and we'll then understand what implementation constraints we'd be working with for all of the possibilities discussed here. I'll try to get to that sometime in the near-ish future.

automatically implement equality and ordering using Eq and Ord

2223c02

adamreichold reviewed May 31, 2023

View reviewed changes

davidhewitt closed this Jun 1, 2023

davidhewitt mentioned this pull request Jun 1, 2023

__anext__ should be able to return &PyAny #3190

Closed

adamreichold mentioned this pull request Jul 22, 2023

Support copy, deepcopy, eq on types quickwit-oss/tantivy-py#99

Merged

davidhewitt mentioned this pull request Oct 30, 2023

"C" Enums have ordering but not hash defined #3558

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: automatically implement equality and ordering using Eq and Ord #3195

RFC: automatically implement equality and ordering using Eq and Ord #3195

davidhewitt commented May 31, 2023

davidhewitt commented May 31, 2023

adamreichold commented May 31, 2023 •

edited

Loading

adamreichold commented May 31, 2023 •

edited

Loading

davidhewitt commented May 31, 2023

adamreichold May 31, 2023

lifthrasiir commented Jun 1, 2023

davidhewitt commented Jun 1, 2023

RFC: automatically implement equality and ordering using Eq and Ord #3195

RFC: automatically implement equality and ordering using Eq and Ord #3195

Conversation

davidhewitt commented May 31, 2023

davidhewitt commented May 31, 2023

adamreichold commented May 31, 2023 • edited Loading

adamreichold commented May 31, 2023 • edited Loading

davidhewitt commented May 31, 2023

adamreichold May 31, 2023

Choose a reason for hiding this comment

lifthrasiir commented Jun 1, 2023

davidhewitt commented Jun 1, 2023

adamreichold commented May 31, 2023 •

edited

Loading

adamreichold commented May 31, 2023 •

edited

Loading