-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster sha256 computation #248
Conversation
The computation is now done concurrently to the tarball creation. Signed-off-by: David Gageot <[email protected]>
@dgageot How are you you testing this when you see a 30% performance. When I did a quick and dirty benchmark the times were slightly faster without the ConcurrentHasher. |
On a Mac, I created a 1.2Gb tarball from my go-build folder. I created the archive multiple times and ignored that first result. I'll try to run a better benchmark. |
I ran the tests again, here's what I got: I'm taring a copy of my master branch
with this PR
|
@dgageot I was able to reproduce your results and performance gains. There is one tradeoff worth noting. While this implementation ran faster in real time on my workstation the actual CPU time required by this implementation is higher. Therefore, in CPU bound environments, this change might actually degrade performance. That being said I think this is probably the right tradeoff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a minimal unit test ensuring that ConcurrentHasher.Sum
returns identical to the underlying Hash.Sum
. It looks correct right now, but a test will prevent someone from accidentally breaking it in the future.
archive/tar.go
Outdated
return fmt.Sprintf("sha256:%x", hasher.Sum(nil)), nil | ||
} | ||
|
||
type ConcurrentHasher struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind moving the hasher implementation into a new file (e.g. hash.go
)? I think that small organizational change would make both the hashing and the tarring logic more readable.
Signed-off-by: David Gageot <[email protected]>
@ekcasey This should be better now. Thanks for your time! |
The computation of the sh256 digest is now done concurrently to the tarball creation.
This gives a 30% boost on my tests (with a hot cache)
Signed-off-by: David Gageot [email protected]