Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

MGLMapSnapshotter not thread-safe, can cause crashes #11827

Closed
ChrisLoer opened this issue May 3, 2018 · 4 comments
Closed

MGLMapSnapshotter not thread-safe, can cause crashes #11827

ChrisLoer opened this issue May 3, 2018 · 4 comments
Assignees

Comments

@ChrisLoer
Copy link
Contributor

MGLMapSnapshotter does image generation work on a background thread, using a pointer to self captured from the foreground:

dispatch_async(workQueue, ^{
#if TARGET_OS_IPHONE
MGLAttributionInfoStyle attributionInfoStyle = MGLAttributionInfoStyleLong;
for (NSUInteger styleValue = MGLAttributionInfoStyleLong; styleValue >= MGLAttributionInfoStyleShort; styleValue--) {
attributionInfoStyle = (MGLAttributionInfoStyle)styleValue;
CGSize attributionSize = [self attributionSizeWithLogoStyle:attributionInfoStyle sourceAttributionStyle:attributionInfoStyle];
if (attributionSize.width <= mglImage.size.width) {
break;
}
}
UIImage *logoImage = [self logoImageWithStyle:attributionInfoStyle];
CGSize attributionBackgroundSize = [self attributionTextSizeWithStyle:attributionInfoStyle];

Unsynchronized access to MGLMapSnapshotter from multiple threads is unsafe, and can cause memory corruption. We have received crash reports showing stack traces consistent with memory corruption in this code block. We shouldn't need to capture self at all here: we should re-factor the block to capture everything it needs by value.

There's another self capture in a dispatch_async to a user-provided queue here:

dispatch_async(queue, ^{
MGLMapSnapshot* snapshot = [[MGLMapSnapshot alloc] initWithImage:compositedImage scale:self.options.scale pointForFn:pointForFn];
completion(snapshot, nil);
});

This should be easy to clean up -- we just need to copy the scale and make sure the pointForFn function is copied cleanly.

When we take the snapshot, we capture self and call it in a block from a user-defined queue (we don't control which thread it runs on). We've seen a crash stack trace here that matches the scenario "_mbglMapSnapshotter reset on main thread while async block still depends on it":

dispatch_async(queue, ^{
_mbglMapSnapshotter->snapshot(_snapshotCallback->self());
});

The fix here is even easier: there's no reason we need to be using dispatch_async at all, since _mbglMapSnapshotter->snapshot is already asynchronous (it kicks off a render on a backgrpund thread, and it passes in an ActorRef for the callback, which is designed to tolerate the foreground resources being reset). I don't understand why this was put in a dispatch_async block in the first place -- I may be missing something, but my guess is that it was a misunderstanding of either how snapshot worked or how the callback worked.

Incidentally, the snapshot callback is set up to run on the main thread, which should make it OK to capture/use self (if the lifetime of the MGLMapSnapshotter ends, the callback will be cancelled and the weak self pointer won't be used; if it is used it will still be live and it will be used on the same thread):

_snapshotCallback = std::make_unique<mbgl::Actor<mbgl::MapSnapshotter::Callback>>(*mbgl::Scheduler::GetCurrent(), [=](std::exception_ptr mbglError, mbgl::PremultipliedImage image, mbgl::MapSnapshotter::Attributions attributions, mbgl::MapSnapshotter::PointForFn pointForFn) {

The important thing is that the user-provided completion callback is run on the user-provided queue -- which we're already doing.

With these changes, cancellation should leave the MGLMapSnapshotter in a valid state. I just noticed in the documentation that we say "Once you call this method, you cannot resume the snapshot. In order to obtain the snapshot, create a new MGLMapSnapshotter object." The "cannot resume" limitation will be gone -- if you want, you can re-start the snapshotter. It seems like a weird interface -- I'm not sure why you'd want to stop/restart a snapshot with the exact same options, and if you did I think the natural interface would be to destroy the snapshotter and then re-recreate it (which accomplishes the same thing). But I assume we'll want to maintain this interface since it's already public.

/cc @lilykaiser @julianrex @friedbunny @1ec5

@ChrisLoer
Copy link
Contributor Author

Oops forgot to tag @ivovandongen too.

@1ec5
Copy link
Contributor

1ec5 commented May 3, 2018

During the original implementation of this feature, we went back and forth about a) whether a single MGLMapSnapshotter can be used for multiple requests and b) whether those requests and happen in parallel.

The "cannot resume" limitation will be gone -- if you want, you can re-start the snapshotter.

This sounds fine to me, as long as we’re sure that the developer can safely restart the snapshotter on a different queue and/or with a different completion handler, as the -startWithQueue:completionHandler: method implies. I think we just copy-pasted MapKit’s limitation out of an abundance of caution. Removing this limitation would be a backwards-compatible change, so no problem there.

@ChrisLoer
Copy link
Contributor Author

As a mitigating factor, our docs state:

"You can change the snapshotter’s options at any time and reuse the snapshotter for multiple distinct snapshots; however, the snapshotter can only generate one snapshot at a time. If you need to generate multiple snapshots concurrently, create multiple snapshotter objects."

And also:

"Once you call cancel, you cannot resume the snapshot. In order to obtain the snapshot, create a new MGLMapSnapshotter object."

If developers follow both of those rules and the additional rule "don't touch MBGLMapSnapshotter.options while a snapshot is in progress", they'll probably avoid most of the race conditions, because no thread should be reading from the MGLMapSnapshotter object at the time another thread is writing to it.

Because the example just runs a snapshot once, and because that's a common case, it's reasonable to think that many users of the snapshotter won't be affected by these problems.

But...

1.) If a developer does re-use the snapshotter, this is a very easy trap to fall into (and the crashes are rare enough they probably wouldn't see them until production).
2.) I would not count on being able to safely use MGLMapSnapshotter from multiple threads just because no explicit writes are happening during the multi-threaded period. There's way too much shared code that may or may not be designed to be used concurrently (for example, a "read only" getter that used memoization could be enough to trigger memory corruption).

ChrisLoer added a commit that referenced this issue May 4, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.
@ChrisLoer ChrisLoer self-assigned this May 4, 2018
ChrisLoer added a commit that referenced this issue May 29, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.
ChrisLoer added a commit that referenced this issue Jun 1, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.
ChrisLoer added a commit that referenced this issue Jun 4, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.
@ChrisLoer
Copy link
Contributor Author

This should be fixed in master with #11831.

fabian-guerra pushed a commit that referenced this issue Jun 7, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.

# Conflicts:
#	platform/darwin/src/MGLMapSnapshotter.mm
fabian-guerra pushed a commit that referenced this issue Jun 7, 2018
- Biggest change: when we apply the watermark on a background thread, don't capture self (turn most of the related instance methods into class methods)
- Don't call mbglMapSnapshotter->snapshot from a user-provided queue, since it's an asynchronous call anyway and starting it on the user's queue requires capturing self.

# Conflicts:
#	platform/darwin/src/MGLMapSnapshotter.mm
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants