Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][client] Java Client's Seek Logic Not Threadsafe #1 #20242

Merged
merged 6 commits into from
May 8, 2023
Merged

[fix][client] Java Client's Seek Logic Not Threadsafe #1 #20242

merged 6 commits into from
May 8, 2023

Conversation

JooHyukKim
Copy link
Contributor

@JooHyukKim JooHyukKim commented May 6, 2023

Fixes #19671

Motivation

To resolve the thread-safety issue around Java client's seek logic identified by #19671.

Modifications

  • Made ConsumerImpl immediately log, cancel future and then return if duringSeek flag is not false.
    • by canceling the seekFuture, the client ensures that the current operation does not interfere with the ongoing seek operation and avoids any errors or inconsistencies that might otherwise occur.

Verifying this change

  • Make sure that the change passes the CI checks.

I am not sure if my PR went through the intended CI checks...?

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: https://github.com/JooHyukKim/pulsar/pull/1

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label May 6, 2023
@nodece
Copy link
Member

nodece commented May 6, 2023

Could you add a test for your changes?

@JooHyukKim
Copy link
Contributor Author

JooHyukKim commented May 7, 2023

Could you add a test for your changes?

Sure! Will do in bit.

Can I ask for your opinion, @nodece ?

Do you think we should...

  1. unit-test(mock everything)?
  2. initialize everything?
  3. or use reflection to extract the private method and test in isolation?

Because since the change is made in a private method I am trying to test through either of two methods,

public CompletableFuture<Void> seekAsync(long timestamp) {

public CompletableFuture<Void> seekAsync(MessageId originalMessageId) {

but I have not succeeded in finding a test I can refer to except below test from broker seems to cover too much.

public void testSeekToStart() throws Exception {

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we set:

        MessageIdAdv originSeekMessageId = seekMessageId;
        seekMessageId = (MessageIdAdv) seekId;

after the check passed?

@tisonkun
Copy link
Member

tisonkun commented May 7, 2023

@JooHyukKim BTW, edit comment won't send notification, I suggest you create new comment when you'd like to notify reviewers.

@tisonkun
Copy link
Member

tisonkun commented May 7, 2023

I'd appreciate it if you can provide a timing diagram to prove the fix. You may refer to apache/curator#430 as a typical description.

@nodece
Copy link
Member

nodece commented May 7, 2023

How to verify your changes?

You can set true for the duringSeek flag, and then call the seek method, you got a canceled future.

@JooHyukKim
Copy link
Contributor Author

@JooHyukKim BTW, edit comment won't send notification, I suggest you create new comment when you'd like to notify reviewers.

Right, thank you for the reminder 👍🏻

@JooHyukKim
Copy link
Contributor Author

I'd appreciate it if you can provide a timing diagram to prove the fix. You may refer to apache/curator#430 as a typical description.

Np 👍🏻. May I ask what tool everyone conventionally uses? Or your suggestion @tisonkun ?

@tisonkun
Copy link
Member

tisonkun commented May 7, 2023

tool

Generally, I just write in text :D

It's about:

  1. Define the actors.
  2. Describe at T_N, what actor takes what action
  3. The order of T_N
  4. What actions cause race condition

But any tool you searched is acceptable as long as the diagram is clear (with text, reviewers manually dump to diagram).

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait a bit for review from @michaeljmarshall.

This patch looks good to me.

tisonkun

This comment was marked as duplicate.

@JooHyukKim
Copy link
Contributor Author

But any tool you searched is acceptable as long as the diagram is clear (with text, reviewers manually dump to diagram).

Thank you for the tips!

@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

@JooHyukKim Please fix style violation by yourself. You can use CI workflow in your personal fork to verify. Or the command should be:

mvn -B -T 8 -ntp initialize apache-rat:check license:check
mvn -B -T 8 -ntp initialize checkstyle:check

Ping me once CI passed in your personal fork.

@JooHyukKim
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@JooHyukKim
Copy link
Contributor Author

Thank you @tisonkun, I made below two check style commands pass, "locally".

My personal fork is not running CI. Is it supposed to run though? 🤔🤔

mvn -B -T 8 -ntp initialize apache-rat:check license:check
mvn -B -T 8 -ntp initialize checkstyle:check

Ping me once CI passed in your personal fork.

@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

personal fork is not running CI

Yes.

You may take lhotari#145 as an example. IIRC GitHub sometimes stuck on turning on Actions..I don't have too much suggestions here but I used to follow the instructions in Actions tab to add a sample workflow and then GitHub turn on Actions.

@JooHyukKim
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

Codecov Report

Merging #20242 (4075a39) into master (9841364) will decrease coverage by 0.02%.
The diff coverage is 51.42%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #20242      +/-   ##
============================================
- Coverage     72.87%   72.86%   -0.02%     
+ Complexity    31943     3601   -28342     
============================================
  Files          1868     1868              
  Lines        138594   138597       +3     
  Branches      15246    15247       +1     
============================================
- Hits         101003   100991      -12     
- Misses        29547    29557      +10     
- Partials       8044     8049       +5     
Flag Coverage Δ
inttests 24.21% <22.85%> (-0.05%) ⬇️
systests 24.78% <20.00%> (-0.15%) ⬇️
unittests 72.18% <51.42%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...va/org/apache/pulsar/client/impl/ConsumerImpl.java 76.40% <51.42%> (-0.03%) ⬇️

... and 64 files with indirect coverage changes

@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

Merging...

Thanks for your contribution!

@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

@JooHyukKim May I ask what #1 means? It looks to me as part one or first attempt that I may wonder if you have outstanding works on this topic.

@JooHyukKim
Copy link
Contributor Author

@JooHyukKim May I ask what #1 means?

Oh... 😅 "#1" means My first ever contribution to Pulsar!

I can remove the "#1" part from title if you think it will confuse the maintenance team. Just let me know~ 🙏🏼 @tisonkun

Thanks for your contribution!

@tisonkun Thank you for great guidance on your part 🙏🏼

@tisonkun tisonkun merged commit bc1764f into apache:master May 8, 2023
@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

Never mind. I can edit the commit message before merging :D

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @JooHyukKim! Would you mind updating the Javadoc for the seek and seekAsync methods to indicate what will happen if a user calls a seek method before a previous call completes? Here is a reference to one of the Javadocs:

/**
* Reset the subscription associated with this consumer to a specific message publish time.
*
* @param timestamp
* the message publish time where to reposition the subscription
* The timestamp format should be Unix time in milliseconds.
* @return a future to track the completion of the seek operation
*/
CompletableFuture<Void> seekAsync(long timestamp);

@JooHyukKim
Copy link
Contributor Author

Thanks for your contribution @JooHyukKim! Would you mind updating the Javadoc for the seek and seekAsync methods to indicate what will happen if a user calls a seek method before a previous call completes?

Will do! 🙏🏽 Later after work 😀

@Technoboy-
Copy link
Contributor

@tisonkun
For this patch, maybe you forget to set the milestone. and there are some modifications not related to the issue

@Technoboy- Technoboy- added this to the 3.1.0 milestone May 15, 2023
@tisonkun
Copy link
Member

@Technoboy- Thanks for setting up the milestone.

not related to the issue

Yep. I'm investigating some tools to unify whitespace so that we don't ask for aligning again and again.

tisonkun added a commit to tisonkun/pulsar that referenced this pull request May 15, 2023
@tisonkun tisonkun mentioned this pull request May 15, 2023
14 tasks
tisonkun added a commit that referenced this pull request May 15, 2023
poorbarcode pushed a commit that referenced this pull request May 30, 2023
poorbarcode pushed a commit that referenced this pull request May 30, 2023
This reverts commit bc1764f.

(cherry picked from commit 7b54664)
@JooHyukKim JooHyukKim deleted the 19671-consumerimpl-compare-and-set branch June 5, 2023 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Java Client's Seek Logic Not Threadsafe
7 participants