-
Notifications
You must be signed in to change notification settings - Fork 886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dial is affected by query context cancellation #1259
Comments
That's an interesting problem. It looks like there are a number of home grown solutions, but it doesn't look like there is any standard solution yet. Wrapping the existing context with something that delegated Maybe the solution is a new background context that adds the original context as a value. Then it is available to anything that needs it. But they have to explicitly access the original context instead of it implicitly being used. |
I saw that but it seemed like there was some disagreement on that. Only in the case where it would've timed out would there actually be some unexpected lasting values. In the common case where it doesn't time out it would work exactly the same. Given that, it seems beneficial to as part of a trace to know that a new connection was initiated.
I'm not totally opposed to this but it would need to be documented so that anyone that wants tracing or other values would need to explicitly get them from the inner context. Overall, given the prevalence of the wrap-without-cancel approach, that seems like the better path to go, especially since most of the time it will keep the existing semantics the same and it would only change things in the failure case. I can work on a PR for this just let me know which way to go. |
As it is not something that actually affects me I will defer to you on wrap without cancel. Base the PR against the |
I was actually considering doing this in pgxpool and not puddle since I'd prefer puddle send the original context in case other users of puddle want it. This seems simple enough to do in the constructor function that we send to puddle. Does that make sense? |
Oh you're right. For some reason it totally escaped me that it could be done outside. 👍 |
We noticed a lot of:
failed to connect to ...: dial error (dial tcp ...: operation was canceled)
and this seems to stem from the fact that puddle doesp.constructResourceValue(ctx)
and passes along the context from the original query. If the query is cancelled (user closed the tab, etc), we don't necessarily want to throw away the connection.While I do think there's value in passing a context (which might contain helpful information on what originated the request and tracing information) maybe the puddle constructor (in pgxpool) should be overwriting the original context's cancellation? There are some implementations of how to ignore the cancellation out there in golang/go#40221 (Under Existing Implementations). This would make acquire-initiated connection creations match pool-health-initiated connection creattions (which just send context.Background() to
p.p.CreateResource(ctx)
).What was happening to us is that there were hundreds of these
dial error ... operation was canceled
and we were DDOSing the DB without creating any useful connections. This was during a period where the DB was already struggling and was taking longer to create connections. This issue further exacerbated the problem by causing even more load on the DB.The text was updated successfully, but these errors were encountered: