Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue 12040][broker] Fix advertised listener confusion #12353

Merged

Conversation

EronWright
Copy link
Contributor

@EronWright EronWright commented Oct 13, 2021

Master Issue: #12040

Motivation

A couple of PRs (#10312 and #10961) caused an unintended change to the broker address that is used in various scenarios. There are three important cases to consider:

  1. the advertised broker service URL(s) (for pulsar:// and pulsar+ssl://)
  2. the advertised web service URL (for http:// and https://)
  3. the canonical name of the broker for identification purposes, e.g. bundle ownership stored in ZK

Note that the advertisedListeners configuration property is designed to support the broker service URL only, not the web service URL. Support for multiple web addresses will be addressed later.

The first PR unintentionally caused the web service URL to be derived from advertisedListeners. This caused #10951 and led to the second PR. The second PR introduced a parameter ignoreAdvertisedListener with some undesirable effects:

  1. Causes the canonical broker name to be based on the internal advertisedListener though it is historically based on the web service URL. For example, the NamespaceService uses pulsar.getSafeWebServiceAddress() as the broker name.
  2. Fails to use the advertisedListeners in computing PulsarService::brokerUrl, which is used for intra-cluster communication (eg. by the function worker).
  3. Breaks the compactor tool in combination with advertisedListeners.

Modifications

This PR cleans up the code to use the correct identifier for each case.

For the broker service URL, a helper method is introduced (ServiceConfigurationUtils::getInternalListener) to obtain the endpoint to be used for intra-cluster connectivity. This is designed to work in all cases, even when advertisedListeners isn't configured. Also, the use of ServiceConfiguration::brokerServicePort was minimized for interoperability with #12079.

For the web service URL, a helper method is introduced (ServiceConfigurationUtils::getWebServiceAddress) to encapsulate the logic and to help with future enhancements. This PR re-enables the use of advertisedAddress to configure the web service URL, independent of the advertisedListeners.

For the broker canonical name, the codebase typically use PulsarService::getAdvertisedAddress. The value is simply a hostname and should never be used to connect to the broker; it is strictly for identification purposes. Historically the value is derived from the web service URL, not the broker service URL. For that reason, this PR doesn't use advertisedListeners when computing the canonical name.

To test the decoupling of advertised port from listen port, a test case was enhanced to use a simple port forwarder.

Specific changes:

  • deprecate ServiceConfigurationUtils::getAppliedAdvertisedAddress
  • introduce ServiceConfigurationUtils::getWebServiceAddress
  • introduce ServiceConfigurationUtils::getInternalListener
  • relax the limitation that advertisedAddress and advertisedListener cannot be used at the same time
  • clarify the purpose of PulsarService::advertisedAddress
  • relax validation of brokerServicePort (it is not needed in pure-TLS configurations)
  • stabilize the ordering of advertised listeners
  • create a utility class that implements port-forwarding for test purposes

Compatibility

This PR will impact installations of Pulsar 2.8.1 when advertisedListeners are configured. This is because #10961 was included in 2.8.1 and has various effects (as mentioned above). This PR will restore the original behavior. A possible mitigation strategy is to set the advertisedAddress configuration property, in addition to advertisedListeners. The advertisedAddress property is the basis for the broker canonical name; set it to the hostname of the internal advertised listener to avoid a change to the effective values when upgrading from 2.8.1 to 2.9.

Followups

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

Unit tests were added or updated to:

  • org.apache.pulsar.broker.PulsarServiceTest
  • org.apache.pulsar.client.api.PulsarMultiListenersWithInternalListenerNameTest

Does this pull request potentially affect one of the following parts:

  • Anything that affects deployment: yes

Documentation

Check the box below and label this PR (if you have committer privilege).

Need to update docs?

  • doc-required

- deprecate ServiceConfigurationUtils::getAppliedAdvertisedAddress
- clarify the purpose of PulsarService::advertisedAddress
- introduce a helper method for webServiceAddress
- stabilize the ordering of advertised listeners
@EronWright EronWright changed the title [Issue 12040][broker] Fix advertised listeners [Issue 12040][broker] Fix advertised listener confusion Oct 13, 2021
- checkstyle
@@ -216,6 +216,9 @@
private PulsarClient client = null;
private ZooKeeperClientFactory zkClientFactory = null;
private final String bindAddress;
/**
* The host component of the broker's canonical name.
*/
private final String advertisedAddress;
Copy link
Contributor Author

@EronWright EronWright Oct 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am inclined to deprecate the advertisedAddress in favor of a formal broker name property. In a follow-up PR, we could introduce a BrokerName type (as opposed to a String) to help with refactoring. For this PR, I left advertisedAddress as-is but clarified its purpose.

@@ -936,7 +935,7 @@ private void updateLoadBalancingMetrics(final SystemResourceUsage systemResource
List<Metrics> metrics = Lists.newArrayList();
Map<String, String> dimensions = new HashMap<>();

dimensions.put("broker", ServiceConfigurationUtils.getAppliedAdvertisedAddress(conf, true));
dimensions.put("broker", pulsar.getAdvertisedAddress());
Copy link
Contributor Author

@EronWright EronWright Oct 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the broker dimension represents the broker canonical name, not any particular network address. The effective value is unchanged.

-bugfix
- updated tests
- remove obsolete validation check
…pplied-listener

# Conflicts:
#	pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfigurationUtils.java
@Anonymitaet Anonymitaet added the doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. label Oct 14, 2021
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Anonymitaet
Copy link
Member

Hi @EronWright since the due date of 2.9.0 went past, do you still want this PR to scope in 2.9.0?

image

@EronWright
Copy link
Contributor Author

@Anonymitaet yes I believe it an important fix for 2.9.

@EronWright
Copy link
Contributor Author

EronWright commented Oct 14, 2021

I am fixing the Broker Client test failure. The issue is that this PR fixes the broker address in PulsarService::brokerUrl to be the advertised listener address (which is correct since it is used to configure Pulsar clients). The failing test uses a static listener configuration while trying to use a dynamic bind address, leading to Connection refused.

Update: fixed by introducing a simple port forwarder into the test setup to forward traffic from the advertised port to the listen port. This makes the test stronger because we're testing the independence of those two ports.

@codelipenghui
Copy link
Contributor

/pulsarbot run-failure-checks

- add a utility class for local port-forwarding
- enhance the MockedPulsarServiceBaseTest to port-forward from advertised port to listen port
- fix the PulsarMultiListenersWithInternalListenerNameTest
@codelipenghui
Copy link
Contributor

@315157973 Please help review the PR since you have more context about this part.

@@ -38,8 +38,7 @@
public final class MultipleListenerValidator {

/**
* validate the configure of `advertisedListeners`, `internalListenerName`, `advertisedAddress`.
* 1. `advertisedListeners` and `advertisedAddress` must not appear together.
* Validate the configuration of `advertisedListeners`, `internalListenerName`.
* 2. the listener name in `advertisedListeners` must not duplicate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to change the java doc here, starting from 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomscut can you please send a PR to fix this ?
I am going to merge this patch now and prepare 2.9.0 release soon

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomscut can you please send a PR to fix this ? I am going to merge this patch now and prepare 2.9.0 release soon

Yes, I sent a PR #12389, PTAL. Thank you.

@eolivelli eolivelli merged commit 6382be8 into apache:master Oct 15, 2021
@codelipenghui codelipenghui deleted the ewright/pip95-fix-applied-listener branch October 15, 2021 15:23
codelipenghui pushed a commit that referenced this pull request Oct 18, 2021
### Motivation

*Fix the java doc mentioned in [#12353](https://github.com/apache/pulsar/pull/12353/files#r729887373)*

### Modifications

*Fix java doc for MultipleListenerValidator#validateAndAnalysisAdvertisedListener.*
codelipenghui pushed a commit that referenced this pull request Dec 20, 2021
### Motivation

*Fix the java doc mentioned in [#12353](https://github.com/apache/pulsar/pull/12353/files#r729887373)*

### Modifications

*Fix java doc for MultipleListenerValidator#validateAndAnalysisAdvertisedListener.*

(cherry picked from commit 9d10b8b)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
* [Issue 12040]
- deprecate ServiceConfigurationUtils::getAppliedAdvertisedAddress
- clarify the purpose of PulsarService::advertisedAddress
- introduce a helper method for webServiceAddress
- stabilize the ordering of advertised listeners
- add a utility class for local port-forwarding
- enhance the MockedPulsarServiceBaseTest to port-forward from advertised port to listen port
- fix the PulsarMultiListenersWithInternalListenerNameTest

Co-authored-by: penghui <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker doc Your PR contains doc changes, no matter whether the changes are in markdown or code files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants