Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-4595] Immediate connection on database created #257

Merged
merged 4 commits into from
Jun 20, 2024

Conversation

dragomirp
Copy link
Contributor

@dragomirp dragomirp commented Jun 19, 2024

Issue

Pgbouncer is not set up to receive connections immediately when publishing credentials

Solution

Sync up configuration and reload PGB when establishing the interface

Related to #245

Copy link

codecov bot commented Jun 19, 2024

Codecov Report

Attention: Patch coverage is 73.91304% with 6 lines in your changes missing coverage. Please review.

Project coverage is 70.79%. Comparing base (0dbd858) to head (3b855b0).

Files Patch % Lines
src/relations/pgbouncer_provider.py 73.68% 3 Missing and 2 partials ⚠️
src/charm.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #257      +/-   ##
==========================================
+ Coverage   70.48%   70.79%   +0.31%     
==========================================
  Files           7        7              
  Lines        1162     1178      +16     
  Branches      205      209       +4     
==========================================
+ Hits          819      834      +15     
- Misses        266      268       +2     
+ Partials       77       76       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 170 to 171
self.charm.render_pgb_config(reload_pgbouncer=True)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is enough to fix the immediate issue, but it won't work if we have multiple client units.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but it won't work if we have multiple client units.

It is subordinate... 1:1 relation here. Nope?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PGB will be 1:1 for the client units, but it's not a given that we'll have a single client unit. Say we have a client with two units set up like so: c0→pgb0 and c1→pgb1, when establishing the relation pgb1 will create the user and db, update the db mapping, reload pgb and trigger a database_created event for the client. While pgb1 may be good to go, at the time database_created triggers, pgb0 may not yet be fully set up, it may not have synced up the dbs from peer data, it may still be setuping the backend relation. If c0 tries to use pgb0 while it's still setting up, the connection will error out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fix related to #245?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fix related to #245?

Yes

Comment on lines 125 to 131
for key in self.charm.peers.relation.data.keys():
if key != self.charm.app:
if self.charm.peers.relation.data[key].get("auth_file_set", "false") != "true":
logger.info("Backend credentials not yet set on all units")
event.defer()
return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure all the units have the userlist rendered.

Comment on lines 104 to 112
def _on_relation_changed(self, event: RelationChangedEvent) -> None:
"""Injects the database name for follower units."""
if not self.charm.unit.is_leader():
if "database" in event.relation.data[event.app]:
database = event.relation.data[event.app]["database"]
dbs = self.charm.generate_relation_databases()
if database not in [db["name"] for db in dbs]:
self.charm.render_pgb_config(reload_pgbouncer=True, inject_db=database)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bypassing the lib to inject the requested database in follower pgbouncers. This us needed because _on_database_requested will only fire on the leader and followers will need to wait for peer data hooks to fire to catch up. Alternatively, I can try to checksum the dblists in peer data and defer _on_database_requested until all units are in sync. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's better to wait for all units to be in sync because this event can still be in the follower unit queue (not executed yet) while the leader has already set the connection info in the client relation data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another important thing for the current implementation of this PR, in my opinion, is to ensure that other calls to render_pgb_config in the follower units are safe and don't remove the injected database after it was added at this point. I mean the situation where this call injects the database, another queued event calls render_pgb_config, which removes the injected database, and later on, the follower unit receives the peer-relation-changed event (because the leader took some time to set the peer relation data).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give it a go to sync configs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworked to hash db list and defer until all units are in sync.

@dragomirp
Copy link
Contributor Author

This seems to cause instability on juju 2 scale up, but I want early feedback since the changes are aggressive.

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one point.

Comment on lines 170 to 171
self.charm.render_pgb_config(reload_pgbouncer=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but it won't work if we have multiple client units.

It is subordinate... 1:1 relation here. Nope?

dragomirp and others added 2 commits June 19, 2024 17:05
* Sync configs between units

* Remove readonly db hash
@dragomirp dragomirp marked this pull request as ready for review June 19, 2024 19:46
Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

640KB^w 16 digits will be enough for everyone!

LGTM.

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks, Dragomir!

@dragomirp dragomirp merged commit 4c0e252 into main Jun 20, 2024
49 checks passed
@dragomirp dragomirp deleted the dpe-4595-slow-config branch June 20, 2024 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants