Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddtrace/tracer: send payloads asynchronously #549

Merged
merged 5 commits into from
Jan 10, 2020
Merged

Conversation

knusbaum
Copy link
Contributor

This patch moves the flushing of payloads to the agent into its own goroutine.
This avoids blocking the rest of the tracer, decreasing the liklihood of dropping traces
due to network latency.

@knusbaum knusbaum added this to the 1.21.0 milestone Dec 17, 2019
@knusbaum knusbaum force-pushed the knusbaum/async-send branch from 5b338c8 to b2678c4 Compare December 17, 2019 00:16
@knusbaum
Copy link
Contributor Author

Hmmm. Dependencies are failing. I'll check this out tomorrow.

Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't think we need a different worker routine. With i/o, it's fine to have many goroutines. I already implemented this as part of the experimental tracer I am working on and I was hoping you'd use the same technique: https://github.com/DataDog/dd-trace-go/blob/v0.10.0/ddtrace/tracer/tracer.go#L320-L336. Can you maybe see what you think? The technique I've used is similar to what we've been doing in the agent and has been working very well with many repetitive flushes and high traffic.

Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the triple review, more came out :)

Copy link
Contributor Author

@knusbaum knusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gbbr

Personally I don't think we need a different worker routine. With i/o, it's fine to have many goroutines.

Sounds good to me. I see you're limiting the goroutines with a channel, so I don't see a reason to have a separate worker.

Edited: deleted long description of why we should limit the number of goroutines we're spawning. I missed the fact that that is already implemented here:

t.climit <- struct{}{}
t.wg.Add(1)
go func(p *payload) {
defer func() {
<-t.climit
t.wg.Done()
}()
size, count := p.size(), p.itemCount()
log.Debug("Sending payload: size: %d spans: %d\n", size, count)
rc, err := t.config.transport.send(p)
if err != nil {
log.Error("lost %d spans: %v", count, err)
}
if err == nil {
t.prioritySampling.readRatesJSON(rc) // TODO: handle error?
}
}(t.payload)

@knusbaum knusbaum force-pushed the knusbaum/async-send branch from b2678c4 to 02b4a67 Compare December 19, 2019 22:37
@knusbaum knusbaum marked this pull request as ready for review December 19, 2019 22:38
@gbbr gbbr changed the title ddtrace/tracer: Send payloads asynchronously ddtrace/tracer: send payloads asynchronously Dec 20, 2019
@knusbaum knusbaum force-pushed the knusbaum/async-send branch 2 times, most recently from 14886ae to 2ef338c Compare December 24, 2019 14:21
Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Kyle. I think this is looking good. I would like us to push a bit further and clean up that confirm channel and change forceFlush a bit. It will result in a better overall testing framework and cleaner code.

FWIW, if it helps, this is what I did in an experiment I ran https://github.com/DataDog/dd-trace-go/blob/v2-alpha/ddtrace/tracer/tracer_test.go#L983-L1005

@knusbaum knusbaum force-pushed the knusbaum/async-send branch from 3a8642d to 0a86e8e Compare January 6, 2020 21:57
Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Kyle. I really like how this turned out. Tests are much more readable and code has also turned out cleaner. The async flushing will also make a big difference to memory usage and data integrity in many cases.

Please bear with me while I raise some questions and make some suggestions.

@knusbaum knusbaum force-pushed the knusbaum/async-send branch from a501ea3 to 3ebd601 Compare January 7, 2020 19:31
gbbr
gbbr previously approved these changes Jan 8, 2020
Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ace! 👌

Copy link
Contributor

@gbbr gbbr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@knusbaum knusbaum merged commit 90ac9c8 into v1 Jan 10, 2020
@knusbaum knusbaum deleted the knusbaum/async-send branch January 14, 2020 22:26
mingrammer pushed a commit to mingrammer/dd-trace-go that referenced this pull request Dec 22, 2020
This commit moves the flushing of payloads into separate goroutines.
This avoids blocking the rest of the tracer, decreasing the likelihood of dropping traces
due to network latency.
@gbbr
Copy link
Contributor

gbbr commented Aug 10, 2021

Link #475

gbbr added a commit that referenced this pull request Aug 10, 2021
This change removes the `(*payload).waitClose` mechanism added in #475
because it is no longer necessary since #549, where we've stopped
reusing payloads and started sending them async.

The change also removes the `(*payload).reset` method implementation to
further emphasise that this type of use is discouraged.

See also https://github.com/golang/go/blob/go1.16/src/net/http/client.go#L136-L138
gbbr added a commit that referenced this pull request Aug 10, 2021
This change removes the `(*payload).waitClose` mechanism added in #475
because it is no longer necessary since #549, where we've stopped
reusing payloads and started sending them async.

The change also removes the `(*payload).reset` method implementation to
further emphasise that this type of use is discouraged.

See also https://github.com/golang/go/blob/go1.16/src/net/http/client.go#L136-L138
gbbr added a commit that referenced this pull request Aug 13, 2021
This change removes the `(*payload).waitClose` mechanism added in #475
because it is no longer necessary since #549, where we've stopped
reusing payloads and started sending them async.

The change also removes the `(*payload).reset` method implementation to
further emphasise that this type of use is discouraged.

Additionally, the Close call now resets the buffer to ensure it is garbage collected after use,
regardless of still being referenced or not.

See also https://github.com/golang/go/blob/go1.16/src/net/http/client.go#L136-L138
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants