[fix][broker] Execute the pending callbacks in order before ready for incoming requests #23266

BewareMyPower
Contributor

Background Knowledge

CompletableFuture has a counterintuitive behavior:

        final var future = new CompletableFuture<Integer>();
        future.thenRun(() -> System.out.println("A"));
        future.thenRun(() -> System.out.println("B"));
        future.thenRun(() -> System.out.println("C"));
        future.complete(0);

The code above prints:

C
B
A

That's because it maintains the callbacks registered on a single future in a LIFO stack, not a FIFO queue.
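
For comparison, here is a minimal sketch contrasting callbacks registered on one future with callbacks chained off each returned stage. The class and method names (CallbackOrderDemo, singleFutureOrder, chainedOrder) are made up for illustration. The reverse order in the first case is what OpenJDK's implementation does in practice, not something the CompletableFuture Javadoc promises; the order in the chained case follows from each stage depending on the previous one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class, not part of Pulsar.
final class CallbackOrderDemo {

    // Three callbacks registered on the SAME incomplete future.
    // They sit in that future's internal stack until complete() is called.
    static List<String> singleFutureOrder() {
        List<String> order = new ArrayList<>();
        CompletableFuture<Integer> future = new CompletableFuture<>();
        future.thenRun(() -> order.add("A"));
        future.thenRun(() -> order.add("B"));
        future.thenRun(() -> order.add("C"));
        future.complete(0);
        return order;
    }

    // Each callback is registered on the stage returned by the previous
    // thenRun(), so every stage has exactly one dependent and B cannot
    // run before A has finished.
    static List<String> chainedOrder() {
        List<String> order = new ArrayList<>();
        CompletableFuture<Integer> future = new CompletableFuture<>();
        future.thenRun(() -> order.add("A"))
              .thenRun(() -> order.add("B"))
              .thenRun(() -> order.add("C"));
        future.complete(0);
        return order;
    }

    public static void main(String[] args) {
        System.out.println("single future: " + singleFutureOrder());
        System.out.println("chained:       " + chainedOrder());
    }
}
```

The chained form always yields A, B, C, but it only works when one caller owns the whole chain. It does not help when independent components each call thenRun() on a shared future, which is the situation described in the Motivation section.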

Motivation

#22977 broke the order of some events during the extensible load manager's startup by registering the runnable objects via thenRun(). The previous event order in ExtensibleLoadManagerImpl was:

  1. playLeader() or playFollower()
  2. serviceUnitStateChannel.start()
  3. Schedule some tasks (brokerLoadDataReportTask, topBundlesLoadDataReportTask, etc.) and set started to true.

Now, since the callbacks are executed in reverse order, started is set to true first, and only then do events 1 and 2 happen. This might cause unexpected issues.

Modifications

Add a synchronized list to queue the pending tasks and execute them in order before the future is complete.
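
A minimal sketch of this idea follows. It is not the actual PulsarService code: the class OrderedReadyCallbacks and the methods runWhenReady and markReady are hypothetical names chosen for illustration. Tasks registered before the ready signal are kept in a plain list guarded by a lock and run in insertion order just before the future is completed; tasks registered afterwards run immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of Pulsar.
final class OrderedReadyCallbacks {
    private final CompletableFuture<Void> readyFuture = new CompletableFuture<>();
    private final List<Runnable> pendingTasks = new ArrayList<>();

    // Queue the task if the ready signal has not fired yet; otherwise run it now.
    void runWhenReady(Runnable task) {
        synchronized (pendingTasks) {
            if (!readyFuture.isDone()) {
                pendingTasks.add(task);
                return;
            }
        }
        task.run();
    }

    // Run the queued tasks in FIFO order, then complete the future.
    void markReady() {
        synchronized (pendingTasks) {
            // Index-based loop: a task that registers another task from this
            // thread appends to the list, and the new task is picked up here.
            for (int i = 0; i < pendingTasks.size(); i++) {
                pendingTasks.get(i).run();
            }
            pendingTasks.clear();
            readyFuture.complete(null);
        }
    }

    public static void main(String[] args) {
        OrderedReadyCallbacks ready = new OrderedReadyCallbacks();
        ready.runWhenReady(() -> System.out.println("1. play leader or follower"));
        ready.runWhenReady(() -> System.out.println("2. start channel"));
        ready.runWhenReady(() -> System.out.println("3. schedule tasks, set started"));
        ready.markReady();
    }
}
```

For simplicity this sketch runs the queued tasks while holding the lock. A task that blocks on another thread which is itself calling runWhenReady would deadlock, so a real implementation has to decide carefully what runs inside the synchronized block.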

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 6, 2024
@BewareMyPower BewareMyPower self-assigned this Sep 6, 2024
@BewareMyPower BewareMyPower added type/bug The PR fixed a bug or issue reported a bug area/broker labels Sep 6, 2024
@BewareMyPower BewareMyPower added this to the 4.0.0 milestone Sep 6, 2024
Contributor

@heesung-sn heesung-sn left a comment

Nice catch. I think we also have other Pulsar logic, e.g. FutureUtil.sequencer, that uses a future chain for ordering. I wonder if we need to revisit that too.

@lhotari
Member

lhotari commented Sep 6, 2024

> Nice catch. I think we also have other Pulsar logic, e.g. FutureUtil.sequencer, that uses a future chain for ordering. I wonder if we need to revisit that too.

Yes, nice catch @BewareMyPower.

It would be useful to consider replacing FutureUtil.sequencer usage with com.spotify.futures.ConcurrencyReducer configured with a concurrency of 1. ConcurrencyReducer is already included as a dependency in Pulsar as part of com.spotify:completable-futures.

@heesung-sn
Contributor

Actually, I see that the sequencer uses thenCompose, so I think the order should be FIFO.

@lhotari
Member

lhotari commented Sep 6, 2024

> Actually, I see that the sequencer uses thenCompose, so I think the order should be FIFO.

I think it's different.

        if (sequencerFuture.isDone()) {
            if (sequencerFuture.isCompletedExceptionally() && allowExceptionBreakChain) {
                return sequencerFuture;
            }
            return sequencerFuture = newTask.get();
        }
        return sequencerFuture = allowExceptionBreakChain
                ? sequencerFuture.thenCompose(__ -> newTask.get())
                : sequencerFuture.exceptionally(ex -> null).thenCompose(__ -> newTask.get());

The instances are chained, and .thenCompose isn't called multiple times on the same instance.

@BewareMyPower
Contributor Author

The sequencer is different because it is a chain of futures (like a linked list) in which each future has only one callback in its internal stack, while the existing design is a single future with N callbacks. However, the sequencer might be unnecessarily complicated for this simple case, so I won't adopt it.
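
To illustrate the linked-list shape, here is a stripped-down sketch of a sequencer. It is not FutureUtil.Sequencer itself: SimpleSequencer and its demo method are hypothetical, and the isDone() shortcut and the allowExceptionBreakChain handling from the snippet quoted above are left out. Each call attaches exactly one thenCompose callback to the current tail and then replaces the tail, so no future ever holds more than one pending callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical simplified sequencer, not part of Pulsar.
final class SimpleSequencer {
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    // Attach the new task to the current tail and make its result the new tail.
    synchronized CompletableFuture<Void> sequential(Supplier<CompletableFuture<Void>> newTask) {
        tail = tail.thenCompose(ignored -> newTask.get());
        return tail;
    }

    // Queues A, B, C behind a gate that is still pending, then opens the gate.
    static List<String> demo() {
        List<String> order = new ArrayList<>();
        CompletableFuture<Void> gate = new CompletableFuture<>();
        SimpleSequencer sequencer = new SimpleSequencer();
        sequencer.sequential(() -> gate); // keeps the chain pending
        for (String name : List.of("A", "B", "C")) {
            sequencer.sequential(() -> {
                order.add(name);
                return CompletableFuture.completedFuture(null);
            });
        }
        gate.complete(null); // releases the chain; tasks run one after another
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Here the tasks run as A, B, C because each one is a dependent of the previous stage rather than a sibling on one shared future.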

@BewareMyPower
Contributor Author

I found a deadlock in ClusterMigrationTest. Marking this PR as a draft.

@BewareMyPower BewareMyPower marked this pull request as draft September 7, 2024 15:04
@BewareMyPower BewareMyPower marked this pull request as ready for review September 7, 2024 16:00
@lhotari
Member

lhotari commented Sep 7, 2024

Good work @BewareMyPower !

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.55%. Comparing base (bbc6224) to head (12b0c60).
Report is 566 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #23266      +/-   ##
============================================
  Coverage     73.57%   74.55%   +0.97%
- Complexity    32624    33779    +1155
============================================
  Files          1877     1926      +49
  Lines        139502   145056    +5554
  Branches      15299    15864     +565
============================================
  Hits         102638   108141    +5503
  Misses        28908    28646     -262
- Partials       7956     8269     +313
Flag Coverage Δ
inttests 27.86% <100.00%> (+3.28%) ⬆️
systests 24.67% <100.00%> (+0.34%) ⬆️
unittests 73.90% <100.00%> (+1.06%) ⬆️

Files with missing lines Coverage Δ
...n/java/org/apache/pulsar/broker/PulsarService.java 83.82% <100.00%> (+1.45%) ⬆️

... and 556 files with indirect coverage changes

@BewareMyPower BewareMyPower merged commit ca0fb44 into apache:master Sep 8, 2024
51 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-extensible-lm-start-order branch September 8, 2024 13:30
lhotari pushed a commit that referenced this pull request Sep 9, 2024
lhotari pushed a commit that referenced this pull request Sep 9, 2024
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 10, 2024
… incoming requests (apache#23266)

(cherry picked from commit ca0fb44)
(cherry picked from commit ca8d724)
michalcukierman pushed a commit to michalcukierman/pulsar that referenced this pull request Sep 11, 2024
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 12, 2024
… incoming requests (apache#23266)

(cherry picked from commit ca0fb44)
(cherry picked from commit ca8d724)
5 participants