Reduce allocations during checksum creation. #76524

Merged
merged 2 commits into from
Jan 3, 2025
@@ -23,6 +23,11 @@ internal readonly partial record struct Checksum
     private static readonly ObjectPool<XxHash128> s_incrementalHashPool =
         new(() => new(), size: 20);

+    // Pool of ObjectWriters to reduce allocations. The pool size is intentionally small as the writers are used for such
+    // a short period that concurrent usage of different items from the pool is infrequent.
+    private static readonly ObjectPool<ObjectWriter> s_objectWriterPool =
+        new(() => new(SerializableBytes.CreateWritableStream(), leaveOpen: true, writeValidationBytes: true), size: 4);
Member

Why size: 4?

Contributor Author

It's because each writer is used for such a short period that concurrent usage of different items from the pool is actually pretty unlikely. I'll add a comment saying so.
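To illustrate the reasoning above, here is a minimal sketch of a bounded pool. `TinyPool<T>` and its `Created` counter are hypothetical names made up for this example; this is not Roslyn's `ObjectPool<T>`, only an approximation of the same idea. The point it tries to show is that the pool size caps how many idle objects are retained, so when items are held only briefly, a small size still avoids almost all allocations.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a bounded object pool; NOT Roslyn's ObjectPool<T>.
public sealed class TinyPool<T> where T : class
{
    private readonly T[] _slots;
    private readonly Func<T> _factory;

    // Counts factory calls, for illustration only.
    public int Created;

    public TinyPool(Func<T> factory, int size)
    {
        _factory = factory;
        _slots = new T[size];
    }

    public T Allocate()
    {
        // Take the first available pooled item, if any.
        for (var i = 0; i < _slots.Length; i++)
        {
            var item = Interlocked.Exchange(ref _slots[i], null);
            if (item != null)
                return item;
        }

        // Pool is empty: fall back to a fresh allocation.
        Interlocked.Increment(ref Created);
        return _factory();
    }

    public void Free(T item)
    {
        // Store the item in the first empty slot.
        for (var i = 0; i < _slots.Length; i++)
        {
            if (Interlocked.CompareExchange(ref _slots[i], item, null) == null)
                return;
        }

        // Pool is full: drop the item and let the GC reclaim it.
    }
}
```

With strictly sequential use (allocate, use, free), a pool like this calls its factory once no matter how many times it is used. A larger size only helps when several callers hold items at the same moment.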


     public static Checksum Create(IEnumerable<string?> values)
     {
         using var pooledHash = s_incrementalHashPool.GetPooledObject();
@@ -57,15 +62,25 @@ public static Checksum Create(Stream stream)

     public static Checksum Create<T>(T @object, Action<T, ObjectWriter> writeObject)
     {
-        using var stream = SerializableBytes.CreateWritableStream();
+        // Obtain a writer from the pool
+        var objectWriter = s_objectWriterPool.Allocate();

-        using (var objectWriter = new ObjectWriter(stream, leaveOpen: true))
-        {
-            writeObject(@object, objectWriter);
-        }
+        // Invoke the callback to write the object into objectWriter
+        writeObject(@object, objectWriter);

+        // Include validation bytes in the new checksum from the stream
+        var stream = objectWriter.BaseStream;
         stream.Position = 0;
-        return Create(stream);
+        var newChecksum = Create(stream);
+
+        // Reset object writer back to its initial state, including the validation bytes
+        objectWriter.Reset();
Member

Does writeValidationBytes then work properly? Do we basically check if we're at position 0 and write out those bytes? (I genuinely don't remember.)

Contributor Author

Changed in commit 2 to just rewrite the validation bytes after the reset, instead of having the reset set the position to just after the validation bytes. That's probably easier to understand.
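A rough sketch of the "reset, then rewrite the validation bytes" approach described above. `ReusableWriter` and the two marker bytes are hypothetical and chosen only for illustration; this is a simplified stand-in built on `BinaryWriter` and `MemoryStream`, not Roslyn's `ObjectWriter`.

```csharp
using System.IO;

// Hypothetical reusable writer; a simplified stand-in, NOT Roslyn's ObjectWriter.
public sealed class ReusableWriter
{
    // Assumed marker bytes, for illustration only.
    private static readonly byte[] s_validationBytes = { 0xAB, 0xCD };

    private readonly BinaryWriter _writer = new BinaryWriter(new MemoryStream());

    // A new writer starts with the validation bytes already written.
    public ReusableWriter() => WriteValidationBytes();

    public Stream BaseStream => _writer.BaseStream;

    public void WriteInt32(int value) => _writer.Write(value);

    public void WriteValidationBytes() => _writer.Write(s_validationBytes);

    // Rewind and empty the stream. The caller rewrites the header afterwards,
    // so Reset does not need to know how long the header is.
    public void Reset()
    {
        _writer.Flush();
        _writer.BaseStream.Position = 0;
        _writer.BaseStream.SetLength(0);
    }
}
```

Calling `Reset()` followed by `WriteValidationBytes()` leaves the writer in the same state as a newly constructed one. Keeping the two calls separate avoids a `Reset` that has to position the stream just past the header.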

+        objectWriter.WriteValidationBytes();
+
+        // Release the writer back to the pool
+        s_objectWriterPool.Free(objectWriter);
+
+        return newChecksum;
     }

     public static Checksum Create(Checksum checksum1, Checksum checksum2)

@@ -42,6 +42,12 @@ public readonly void Dispose()
             }
         }

+        public void Reset()
+        {
+            _valueToIdMap.Clear();
+            _nextId = 0;
+        }
+
         public bool TryGetReferenceId(string value, out int referenceId)
             => _valueToIdMap.TryGetValue(value, out referenceId);

@@ -129,6 +129,28 @@ public void Dispose()
     public void WriteUInt16(ushort value) => _writer.Write(value);
     public void WriteString(string? value) => WriteStringValue(value);

+    public Stream BaseStream => _writer.BaseStream;
+
+    public void Reset()
+    {
+        _stringReferenceMap.Reset();
+
+        // Reset the position and length back to zero
+        _writer.BaseStream.Position = 0;
+
+        if (_writer.BaseStream is SerializableBytes.ReadWriteStream pooledStream)
+        {
+            // ReadWriteStream.SetLength allows us to indicate to not truncate, allowing
+            // reuse of the backing arrays.
+            pooledStream.SetLength(0, truncate: false);
+        }
+        else
+        {
+            // Otherwise, set the new length via the standard Stream.SetLength
+            _writer.BaseStream.SetLength(0);
+        }
+    }
+
     /// <summary>
     /// Used so we can easily grab the low/high 64bits of a guid for serialization.
     /// </summary>
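
The `Reset` method in the hunk above resets the stream's length without giving up its backing storage. As a loose analogy only, the sketch below shows the same effect using the BCL's `MemoryStream` instead of `SerializableBytes.ReadWriteStream`: setting the length to zero keeps the already-allocated buffer, so later writes can reuse it. `BufferReuseDemo` is a hypothetical name for this example.

```csharp
using System.IO;

// Hypothetical demo using MemoryStream as an analogy; it does not use
// Roslyn's SerializableBytes.ReadWriteStream.
public static class BufferReuseDemo
{
    // Returns the stream capacity before and after resetting its length.
    public static (int Before, int After) Run()
    {
        var stream = new MemoryStream();
        stream.Write(new byte[4096], 0, 4096);
        var before = stream.Capacity;

        // Rewind and empty the stream. The backing buffer is retained.
        stream.Position = 0;
        stream.SetLength(0);

        return (before, stream.Capacity);
    }
}
```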