-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support async cancellations. #642
Comments
Hi Zac, Good to hear from you.
It sounds like one, yup. I'd expect there to be plenty of subtleties around this... How can we start narrowing this down? What broken state can we end up in? Are we able to replicate this in a test case at all? |
(sorry, this dropped into the holiday void; followups should be faster) Pulling out https://github.com/encode/httpcore/blob/f0657cb43cb707d1672b93b61bb53b4cfb166820/httpcore/_async/connection_pool.py#LL226-L234C30 as a self-contained example: try:
connection = await status.wait_for_connection(timeout=timeout)
except BaseException as exc:
# If we timeout here, or if the task is cancelled, then make
# sure to remove the request from the queue before bubbling
# up the exception.
async with self._pool_lock:
self._requests.remove(status)
raise exc Suppose that we're running this task inside a Trio nursery or anyio taskgroup, which gets cancelled while we The impact in this case is that we leak This is easily solved by wrapping So my suggested solution is to add |
Hello, with trio.move_on_after(5):
response = await client.get(url) This snippet is called regularly with the same client. After some time a What happened? If trio.Cancelled is called at the time of sending the request and waiting for the response, the connection gets to an unusable state. At the following Point, the trio.Cancelled Exception is catched by BaseException. The asynchronous code in the block will then not be executed. The connection is not cleared and all connections in the connection pool are occupied after a while. httpcore/httpcore/_async/http11.py Lines 113 to 116 in 95406e6
A PoolTimeout then occurs when a new request is made, since no more connections are available. This behavior can be shown in this example: async def bar():
await trio.sleep(0.5)
print("World")
async def foo():
try:
print("Hello")
await trio.sleep(2)
except BaseException as ex:
await bar()
async def main():
with trio.move_on_after(1):
await foo()
trio.run(main)
# >>> Hello The example prints 'Hello' but not 'World' |
See also
So |
The problem with the shielded cancelscope is that when Furthermore, if then aclose for whatever reason, needs for example 2 seconds in my use case, the timeout I have set in the example to 5 seconds is now at one time at 7 seconds. So you need a timeout and the shielded cancelscope: with trio.move_on_after(CLEANUP_TIMEOUT) as cleanup_scope:
cleanup_scope.shield = True
await aclose() The question here is, which value is sensible to choose for CLEANUP_TIMEOUT |
In this case I'd probably go for The proper solution is to ensure that if |
I have not dug too deep into this but maybe it is possible to split the connection pool logic (i.e. whether or not to return the connection to the pool) from async stuff? Also when a httpcore/httpcore/_async/http11.py Lines 113 to 116 in 95406e6
httpcore/httpcore/_async/http11.py Lines 216 to 228 in 95406e6
are not needed anyways and the connection should always be closed? |
Closed by #726 |
Over at @anthropics we're keen users of httpx on Trio, and I've noticed that we sometimes have issues where cancelling a task doesn't get all the teardown logic right. For example, if you try to
await
(orasync for/with
) in anexcept BaseException:
orfinally:
block, Trio will immediately re-raise aCancelled
exception instead of running your teardown logic.This can be pretty subtle, since it's happening as you abandon a task anyway, but can cause various subtle or unsubtle problems that motivated us to build flake8-trio, including the
TRIO102
error for this case. I'm pretty sure that httpcore has multiple such issues, e.g.httpcore/httpcore/_async/connection_pool.py
Lines 251 to 253 in f0657cb
doesn't look cancel-safe to me. Do you consider "this doesn't close the connection properly when cancelled" a bug? If so, please consider this a fairly general bug-report!
(I've also found it infeasible to consistently get this right without tool support. If you want to try flake8-trio many of the checks are framework-agnostic; and if it'd be useful for httpcore and httpx we'd be happy to add
anyio
support to the linter 🙂)The text was updated successfully, but these errors were encountered: