-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Break added documents into batches when processing solution/compilation translation states. #72621
Conversation
// Note: we intentionally kick these off in a fire-and-forget fashion. If we get canceled, all | ||
// the tasks will attempt to cancel. If we complete, that's only because these tasks would | ||
// complete as well. There's no need to track this with any sort of listener as this work just | ||
// acts to speed up getting to a compilation, but is otherwise unobservable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were already doing this when we got to the add-docuemtns action, which already processes thigns in parallel. so this didn't seem to actually help as much as i wanted, and it meant a lot of task explosion as we were processing a long action queue.
@@ -9,6 +9,7 @@ | |||
using System.Diagnostics.CodeAnalysis; | |||
using System.Linq; | |||
using System.Runtime.CompilerServices; | |||
using System.Runtime.InteropServices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was this added for?
src/Workspaces/Core/Portable/Workspace/Solution/SolutionCompilationState.cs
Outdated
Show resolved
Hide resolved
var trees = await Task.WhenAll(tasks).ConfigureAwait(false); | ||
currentCompilation = currentCompilation.AddSyntaxTrees(trees); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not want to collect all the trees separately, and then add them at the end?
for (int i = batchStart, batchEnd = Math.Min(batchStart + AddDocumentsBatchSize, total); i < batchEnd; i++) | ||
{ | ||
var document = this.Documents[i]; | ||
tasks.Add(Task.Run(async () => await document.GetSyntaxTreeAsync(cancellationToken).ConfigureAwait(false), cancellationToken)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Consider ignoring cancellation exceptions in these tasks (always returning null instead), and then check for cancellation after the WhenAll
completes to avoid throwing more exceptions than necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using System.Collections.Immutable; | ||
using System.Linq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't spot what required this.
|
||
for (int i = batchStart, batchEnd = Math.Min(batchStart + AddDocumentsBatchSize, total); i < batchEnd; i++) | ||
{ | ||
var document = this.Documents[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One particularly slow GetSyntaxTreeAsync call here could completely block any new work from starting. What about using a semaphore to do the parallelism? Something like:
var throttler = new SemaphoreSlim(initialCount: AddDocumentsBatchSize);
var tasks = this.Documents.Select(async document =>
{
await throttler.WaitAsync(cancellationToken).ConfigureAwait(false);
try
{
return await document.GetSyntaxTreeAsync(cancellationToken).ConfigureAwait(false);
}
finally
{
throttler.Release();
}
});
var batch = await Task.WhenAll(tasks).ConfigureAwait(false);
trees.AddRange(batch);
var semaphore = new SemaphoreSlim(initialCount: AddDocumentsBatchSize); | ||
foreach (var document in this.Documents) | ||
{ | ||
tasks.Add(Task.Run(async () => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasonmalinowski For review when you get back. |
I noticed high overhead due to how many forks we make of all data when we linearize each added document into a different state. Batching heavily removes that overhead, while also allowing us to still benefit from parallelization in many cases.