respect contexts in a more timely manner #900

whyrusleeping · 2015-03-08T22:48:23Z

This PR fixes the '7MB bursts' that @anarcat reported. The fix works, but if the lower-level calls hang, those goroutines may build up. Their contexts are cancelled, though, so they should exit sooner rather than later. Shouldn't be an issue.

anarcat · 2015-03-08T23:19:02Z

i believe the 7.5MB bursts are documented in #872. i'll try to test this soon, if i can figure out how to test branches...

whyrusleeping · 2015-03-08T23:47:23Z

The main issue that this PR 'fixes' is that the network stack does not respect contexts. I think that if we switch over to using gRPC, a lot of that will be fixed.

jbenet · 2015-03-09T04:39:14Z

exchange/bitswap/bitswap.go

+	select {
+	case <-done:
+	case <-ctx.Done():
+	}


if bs.send hangs, we exit leaking a hung goroutine.

the downside of this approach is that it becomes easy to hide hanging things, and thus harder to find problems-- nodes would just slowly deteriorate.

why is bs.send hanging? is there a way to speed that up and respect the context there, instead of bypassing it here?

For others following along-- the codebase is full of examples of both approaches: places where we return asap and let the slow-to-finish goroutine finish on its own, and places where we explicitly wait for it to be done, so that we know we've fully finished. there are tradeoffs, and cases where we must do one or the other.

send is hanging because of the second stack trace in this gist: https://gist.github.com/whyrusleeping/a04c7bcf791d13e6ce73

basically comes down to our network stack not respecting contexts that well (but gRPC does!!!)

So-- when we move to gRPC we could revert this change and go back to waiting?

Also, any idea how long those goroutines wait in yamux before exiting? or what's making them wait in the first place? (worried about letting them build up indefinitely, could cause more problems)

I have no idea at all what yamux is doing. the method is waitForSendErr, which selects on the error channel and the shutdown channel and exits when either event happens...

hmm wonder if it's waiting for some sort of ack from the remote side...

https://github.com/jbenet/go-peerstream/blob/master/transport/yamux/yamux.go

https://github.com/hashicorp/yamux

https://godoc.org/github.com/hashicorp/yamux

can't readily see anything i'm doing wrong. maybe it does wait for an ack to do flow control / backpressure.

respect contexts in a more timely manner

8ed0f4b

whyrusleeping added the status/in-progress In progress label Mar 8, 2015

whyrusleeping mentioned this pull request Mar 9, 2015

copied file cannot be found by peer #899

Closed

jbenet reviewed Mar 9, 2015

add warning comment about possibly leaked goroutines

5eb08c4

whyrusleeping force-pushed the fix/bitswap-ctx-respect branch from eead067 to 5eb08c4 Compare March 9, 2015 07:22

jbenet added a commit that referenced this pull request Mar 9, 2015

Merge pull request #900 from jbenet/fix/bitswap-ctx-respect

bcbc268

jbenet merged commit bcbc268 into master Mar 9, 2015

jbenet removed the status/in-progress In progress label Mar 9, 2015

jbenet deleted the fix/bitswap-ctx-respect branch March 9, 2015 07:35
