-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of share_room_key
#2862
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #2862 +/- ##
=======================================
Coverage 82.37% 82.37%
=======================================
Files 211 211
Lines 21604 21607 +3
=======================================
+ Hits 17796 17799 +3
Misses 3808 3808 ☔ View full report in Codecov by Sentry. |
To be honest, I did run it through the benchmark, and yes, saw a 5% improvement. But I think that's deceptive; the real wins come when you're using indexeddb which can be a bit slow with thousands of separate transactions, and I didn't feel in the mood to try and set up a benchmark with indexeddb. |
So this is restricted to the IndexedDB scenario?
I think we should be explicit about that. Not sure what's deceptive about the criterion report, more data should always be less deceptive than less data. A 5% improvement for the memory store and SQLite store is great. Posting the above criterion report with a sentence mentioning that the improvement is even better on IndexedDB (even though we don't have a benchmark for it) would have been my ideal way of doing this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a question that we should at least ponder on for a bit. The changes make sense though there might be an alternative approach since the changes do make the API a bit more confusing due to the two Device
types.
@@ -283,14 +284,21 @@ impl GroupSessionManager { | |||
let mut share_infos = BTreeMap::new(); | |||
let mut withheld_devices = Vec::new(); | |||
|
|||
let encrypt = |device: Device, session: OutboundGroupSession| async move { | |||
let encryption_result = device.maybe_encrypt_room_key(session).await?; | |||
// XXX is there a way to do this that doesn't involve cloning the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is, you are moving the Device
into a spawned task. The task itself might be moved between threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my thinking went: we have an Arc<CryptoStoreWrapper>
which is passed into this method, and which we know will live as long as any of those tasks (because the tasks are all spawned and join_all
-ed inside this method). So shouldn't I be able to borrow a reference to the CryptoStoreWrapper
contained in that Arc, and just pass a copy of that reference into each spawned task?
let user_devices = self.store.get_user_devices_filtered(user_id).await?; | ||
let user_devices = self.store.get_readonly_devices_filtered(user_id).await?; | ||
|
||
// We only need the user identity if settings.only_allow_trusted_devices is set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this is the part that improved the performance of things? I wonder if a simple cache, so we only load the device_owner_identity
once would have given us the same performance improvement as well.
Presumably no, since the memory store shows a 5% improvement as well. Any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the performance improvement. It's partly not requesting the owning user's identity, and partly not requesting our own identity on each iteration of the loop.
I don't disagree that it's confusing having two Device types. But honestly a lot of me feels that horse has already bolted: we have the two types and have to deal with them. I think I'd argue that the high-level |
Noted, by the way. Will try to be more proactive in including benchmark reports in PRs where they are relevant. |
Until #2862, `GroupSessionManager::encrypt_session_for` called `Device::encrypt` (via `Device::maybe_encrypt_room_key`. `Device::encrypt` is a thin wrapper for `ReadOnlyDevice::encrypt`, but it also has an `instrument` annotation. #2862 short-cuts `Device::encrypt` and calls `ReadOnlyDevice::encrypt` directly: that was functionally fine but of course means that we no longer benefit from the `instrument` annotation. This PR rectifies the situation by pushing the annotation down to `ReadOnlyDevice::encrypt`.
Until #2862, `GroupSessionManager::encrypt_session_for` called `Device::encrypt` (via `Device::maybe_encrypt_room_key`. `Device::encrypt` is a thin wrapper for `ReadOnlyDevice::encrypt`, but it also has an `instrument` annotation. #2862 short-cuts `Device::encrypt` and calls `ReadOnlyDevice::encrypt` directly: that was functionally fine but of course means that we no longer benefit from the `instrument` annotation. This PR rectifies the situation by pushing the annotation down to `ReadOnlyDevice::encrypt`. It also adds some documentation for that function, since we are using it in more places now. (Longer-term, I think we should probably aim to get rid of `Device::encrypt` altogether, but that's a refactor I don't want to take on today.)
Encryption doesn't usually require access to the user identity, so we can skip loading that from the store.
Unfortunately that means a fair bit of refectoring, in the form of replacing
Device
withReadOnlyDevice
.This reduces the time taken to encrypt a message in a large room from about 3 seconds to 1 second.