Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support JDK 21 Sequenced Collections #6903

Open
4 tasks done
rhys-saldanha opened this issue Jan 6, 2024 · 10 comments
Open
4 tasks done

Support JDK 21 Sequenced Collections #6903

rhys-saldanha opened this issue Jan 6, 2024 · 10 comments
Labels

Comments

@rhys-saldanha
Copy link

rhys-saldanha commented Jan 6, 2024

1. What are you trying to do?

Please see JEP-431 for background on what a Sequenced Collection is.

The javadoc for ImmutableCollection mentions that most Guava Collections have a well-defined iteration order. Indeed, I often rely on this behaviour. This issue is to discuss support for the SequencedCollections interface, in order to better communicate that Guava Collections have this property.

2. What's the best code you can write to accomplish that without the new feature?

This example does not make much sense without the context of the added methods in SequencedCollection, however:

final var foo = ImmutableSet.of(1,2,3);
foo.asList().get(0); // = 1

3. What would that same code look like if we added your feature?

Sequenced Collections add new convenience methods that arise from having a defined iteration order. Applying this to (for example) ImmutableSet, which preserves insertion order, would result in the following code being valid:

final var foo = ImmutableSet.of(1,2,3);
foo.getFirst(); // = 1

(Optional) What would the method signatures for your feature look like?

No response

Concrete Use Cases

I have not yet used Guava with JDK 21.

Please see related Guava discussion: https://groups.google.com/g/guava-discuss/c/IXWZCHn7yJs

Packages

com.google.common.collect

Checklist

@rhys-saldanha rhys-saldanha added the type=addition A new feature label Jan 6, 2024
@cpovirk
Copy link
Member

cpovirk commented Jan 6, 2024

One thing that Rhys mentioned that I hadn't considered before was the documentation aspect: For someone new to, say, ImmutableMap, it could be useful to see that it's an implementation of SequencedMap. While we try to drive home the benefits of our immutable collections, a page like https://guava.dev/ImmutableMap doesn't mention ordering until you follow a link or at least read about a factory method like of.

Then there's the benefit of the methods themselves, like getFirst, as mentioned above. We could add those without worrying about some of the complexities that come with implementing the interface. That said, as discussed in the link above, that still leaves a couple challenges. I think the could both be solved by using --release 8, but that requires dealing with Unsafe and other unavailable APIs. --release 8 has a pretty good chance of becoming important someday (for this and other reasons), so we should reevaluate the SequencedCollection methods at that point.

@fmeum
Copy link

fmeum commented Feb 4, 2025

There is another aspect to this that, in my opinion, extends beyond documentation and instead would improve correctness. The ImmutableCollection javadocs say:

On the other hand, a parameter type of ImmutableList is generally a nuisance to callers. Instead, accept Iterable and have your method or constructor body pass it to the appropriate copyOf method itself.

That practice is problematic in a subtle way: It's possible to pass in arguments with unspecified iteration order (e.g. HashSet) and have them turned into ImmutableCollections that have specified ordering. Consumers of the ImmutableCollection may be led to think that they can depend on a "sensible" ordering, but that may not be the case depending on the source collection.

As a concrete example, I'm currently hunting down instances of this in Bazel by patching copyOf to throw on unsequenced source collections and found a number of cases of such "sequenced-washing". If immutable collections implemented the Sequenced* interfaces, there would be a way to enforce deterministic, specified sequencing at compile time. In fact, if we were to rewrite ImmutableMap from scratch, I would propose limiting copyOf to accept a SequencedMap and requiring an explicit copyOfUnsequenced for arbitrary Maps.

I would be happy to help with this as well as the build system changes needed to make this possible. Where are we with this?

@cpovirk
Copy link
Member

cpovirk commented Feb 4, 2025

If immutable collections implemented the Sequenced* interfaces, there would be a way to enforce deterministic, specified sequencing at compile time.

Could you explain a little more what you have in mind? It sounds like maybe you want to be able to consistently define APIs in Bazel to require SequencedCollection and SequencedMap but that that's not practical now because you wouldn't be able to pass ImmutableSet or ImmutableMap in those cases? Is that right, and is there more than would be helpful (in the category of things that we could do without starting from scratch)?

Unfortunately, I don't think we'll be in a position to make that particular change until Guava requires Java 21+. (And we're still waiting to require Java 11.) Even with a multi-release jar, the APIs across releases are supposed to match, and if they don't, bad things happen.

(Sequenced-washing is definitely a problem, one that we also see with content that gets stringified or even copied to some other kind of List or ordered collection before any of the immutable collections see it. It may well come up with streams, too, where it may be even harder to detect....)

@fmeum
Copy link

fmeum commented Feb 4, 2025

It sounds like maybe you want to be able to consistently define APIs in Bazel to require SequencedCollection and SequencedMap but that that's not practical now because you wouldn't be able to pass ImmutableSet or ImmutableMap in those cases?

Yes, that's exactly what I would like to be able to do.

Unfortunately, I don't think we'll be in a position to make that particular change until Guava requires Java 21+.

I read the issue you linked, but I'm still not sure why we can't add support for this sooner, say like this:

  • Copy (modulo copyrighted content) the Sequenced* interfaces provided by Java 21 to Guava, but rename them to Sequenced*Compat.
  • Let the Guava collection classes implement these interfaces.
  • Build a multi-release JAR and in the subdirectory for Java 21, add alternative Sequenced*Compat interfaces that simply extend the corresponding Sequenced* interface.

We of course do need to be careful about c.g.c.c.Sequenced*Compat exactly matching j.u.Sequenced*, but that should require a one-time effort.

Do you see a problem with this approach?

@cpovirk
Copy link
Member

cpovirk commented Feb 4, 2025

Sorry for not providing a shorter version of that.

My understanding is that tools do not reliably read the Java-21 section of the multi-release jar when building for Java 21. So you wouldn't necessarily be able to use ImmutableCollection "as a" SequencedCollection. And my understanding is that this isn't "a bug" because the API (in this case, the API of SequencedCollectionCompat, specifically its extends clause) doesn't match across releases, as is required.

Now, I'm actually not sure to what extent the JDK developers have gotten into the weeds of what it means for the API to "match": It seems as if it would be useful—and not harmful—to be able to do exactly what you describe. It's conceivable that we could get a clarification or change from the JDK developers, followed by changes to various tools to make those tools look at the release that they're targeting—presumably followed by changes by changes to various projects to configure those tools correctly in cases where they were getting away with doing things wrong before :) Sadly, I wouldn't hold my breath, but maybe it's a road that we should consider starting down, even if it wouldn't pay off for many years.

(I do think there's a decent chance that new versions of Guava will someday require some version of Java that has built-in support for nullness. It's even conceivable that the next jump we made will be directly from Java 8 to that version, in which case we could pick up Java-21 changes at that point. But that's still far off and hypothetical.)

@fmeum
Copy link

fmeum commented Feb 4, 2025

That sounds like a hen-and-egg problem to me: As long as multi-release JARs aren't properly supported by all build tools, developers aren't incentivized to make use of them. But as long as they don't, there is little value in fixing the bugs in all these tools. :-)

That said, wouldn't it be okay for Guava to add this kind of feature with the understanding that tools may need to fix bugs before they can make use of it? If a certain javac alternative doesn't see the SequencedCollection interface on ImmutableList, then it can't use the new feature, but that's not worse then the current situation.

All of this of course requires clarification from the JDK devs on whether they intend to support this use case. I could ask on the mailing lists.

@cpovirk
Copy link
Member

cpovirk commented Feb 4, 2025

Yes, I think (no guarantees, but I think) that I'd be comfortable giving this a try if we had word from the JDK developers that it's an intended use case. It would be good to push the ecosystem forward to support the feature in that case. If you're up for asking, that would be great, and I'm happy to chime in in support if you CC [email protected].

@cpovirk
Copy link
Member

cpovirk commented Feb 5, 2025

I was reminded by raphw/asm-jdk-bridge#3 that jar actually performs some checking for API compatibility.

That checking doesn't appear to mind a difference in supertypes:

$ tail -n +1 */src/*.java
==> base/src/Foo.java <==
public interface Foo<E> {}

==> twentyone/src/Foo.java <==
import java.util.SequencedCollection;

public interface Foo<E> extends SequencedCollection<E> {}

$ javac -d base/bin --release 8 base/src/*.java && javac -d twentyone/bin --release 21 twentyone/src/*.java && jar --create --file mrjar.jar -C base/bin . --release 21 -C twentyone/bin .
warning: [options] source value 8 is obsolete and will be removed in a future release
warning: [options] target value 8 is obsolete and will be removed in a future release
warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.
3 warnings

Contrast that to what happens if I stick a direct method declaration into only twentyone:

$ javac -d base/bin --release 8 base/src/*.java && javac -d twentyone/bin --release 21 twentyone/src/*.java && jar --create --file mrjar.jar -C base/bin . --release 21 -C twentyone/bin .
warning: [options] source value 8 is obsolete and will be removed in a future release
warning: [options] target value 8 is obsolete and will be removed in a future release
warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.
3 warnings
entry: META-INF/versions/21/Foo.class, contains a class with different api from earlier version
invalid multi-release jar file mrjar.jar deleted

I don't mean to present that as conclusive, given that the JEP left things a bit up in the air, including noting that enforcement could be incomplete. But it's at least worth noting that this one specific potential obstacle does not exist, and that may be a blurry sign to us of how upstream has been thinking about "API compatibility"—or, alternatively, just of what they found easiest to implement or most likely to catch real-world problems :)

@fmeum
Copy link

fmeum commented Feb 5, 2025

Nice find!

I looked through the history of that validation and found that its original version somewhat explicitly doesn't consider interfaces: https://github.com/openjdk/jdk/blob/def15478ebc213eeff8eb4da9178f8bac4c72604/jdk/src/jdk.jartool/share/classes/sun/tools/jar/FingerPrint.java#L240-L248

If I don't get a reply on the mailing list, I could try to add a test to the jar validation that codifies this behavior.

@kevinb9n
Copy link
Contributor

kevinb9n commented Feb 6, 2025

I suspect that mr-jars aren't going to be an easy way out of this, and the only viable path may be pushing to #6614 and beyond.

fmeum added a commit to fmeum/bazel that referenced this issue Feb 6, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)
fmeum added a commit to fmeum/bazel that referenced this issue Feb 6, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)
copybara-service bot pushed a commit to bazelbuild/bazel that referenced this issue Feb 7, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)

Closes #25218.

PiperOrigin-RevId: 724341188
Change-Id: I2f38a4c3dad80055cb8d80853e9f211836019e5b
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Feb 14, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has been created from a source with unspecified iteration order, such as `HashMap`. Since `ImmutableMap` itself prescribes that it preserves insertion order, it's reasonable for consumers to expect a specified, reasonable, deterministic order. While the current implementation of `HashMap` in the OpenJDK still results in a deterministic order, this kind of "sequenced-washing" doesn't cause non-determinism, but it still runs the risk of having the order of elements change between JDK versions or vendors.

The locations changed in this PR have been identified by instrumenting the factory methods of the `Immutable{List,Map,Set}` classes to track whether they 1) have been created from an unsequenced collection and 2) are iterated over. I then manually picked out those cases in which I couldn't rule out that a change in order would be observable to users.

Related to google/guava#6903 (comment)

Closes bazelbuild#25218.

PiperOrigin-RevId: 724341188
Change-Id: I2f38a4c3dad80055cb8d80853e9f211836019e5b
github-merge-queue bot pushed a commit to bazelbuild/bazel that referenced this issue Feb 18, 2025
Consumers of an `Immutable{List,Map,Set}` cannot know whether it has
been created from a source with unspecified iteration order, such as
`HashMap`. Since `ImmutableMap` itself prescribes that it preserves
insertion order, it's reasonable for consumers to expect a specified,
reasonable, deterministic order. While the current implementation of
`HashMap` in the OpenJDK still results in a deterministic order, this
kind of "sequenced-washing" doesn't cause non-determinism, but it still
runs the risk of having the order of elements change between JDK
versions or vendors.

The locations changed in this PR have been identified by instrumenting
the factory methods of the `Immutable{List,Map,Set}` classes to track
whether they 1) have been created from an unsequenced collection and 2)
are iterated over. I then manually picked out those cases in which I
couldn't rule out that a change in order would be observable to users.

Related to
google/guava#6903 (comment)

Closes #25218.

PiperOrigin-RevId: 724341188
Change-Id: I2f38a4c3dad80055cb8d80853e9f211836019e5b

Commit
b03cb63

Co-authored-by: Fabian Meumertzheim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants