Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgraded Timer API and fix Timer tests... #323

Merged
merged 6 commits into from
Feb 6, 2015

Conversation

drbawb
Copy link
Contributor

@drbawb drbawb commented Feb 5, 2015

This PR is based on the feedback from issue #320 ... it introduces a more Rust-like interface for the Timer wrapper that is much less error-prone, and also memory safe!

To summarize how it works:

  • The user passes a boxed closure to the timer, e.g: let t = Timer::new(Box::new(move|| { ... }));
  • The timer runs until that closure returns a 0-interval OR ...
  • The timer is forcibly cancelled when it is dropped, this is because the closure will go out of scope, so further
    invocations would point to freed memory.

This also improves thread safety over the previous API(s), from the SDL2 documentation:

Use this function to set up a callback function to be run on a separate thread after the specified number of milliseconds has elapsed.

Since the callback might be run on a separate thread: the closure's environment must satisfy Send so that it can be moved to the thread the timer will run on.

Knowing this: it would also be unsafe to allow the closure to freely mutate its environment, so the closure does not allow for a mutable environment. In practice this simply means you have to protect mutable data w/ a Mutex or RwLock, and shared data w/ an Arc.


The tests have been improved so they hopefully don't fail spuriously anymore. I accomplished this by simply adding an upper-bound to the timer's callback so it stops trying to acquire the mutex after a certain number of iterations.


The whole thing is marked unstable for a number of reasons:

  1. ... well, we've changed it 3 times now, so it really isn't stable
  2. Since we're forcing the caller to wrap the closure in a Box<T> we may need want to change the API based on how the placement-new syntax (#![feature(box)]) pans out in rust-1.0.0
  3. We may want some sugar so the caller doesn't have to wrap the closure in a Box<T> themselves?

The timer now takes a `Box<F>` where F implements a closure w/ the
signature `Fn() -> u32+Send`

The timer starts as soon as it is created, and ends under the following
conditions:

* The callback, when invoked, returns a non-positive continuation interval
* The user drops the `Timer`, either by calling `drop(timer)` or by
  letting it go out of scope.

The timer is cancelled when exiting scope to ensure memory-safety. Since
the timer owns the closure's environment: the closure cannot safely be
called once the timer itself has been freed.

---

The `Send` bound implies that closures must be able to move to another
thread. This is because SDL does not guarantee that timers will be run
on the current thread.

This means mutable access to data will have to be done in a thread-safe
manner using the primitives in `std::sync`.

---

* The tests have been updated to reflect the new API.

* They have also been made more robust to intermittent failure. The
offending test now has an upper limit, after which it will cancel the
timer.

* This is meant to ensure the mutex will be available for locking
when the test checks the exit condition.
1. Remove unnecessary return
2. Use `u32` instead of `usize`
@drbawb drbawb force-pushed the feature/rust-100-timers branch from 4df1e69 to e5aa71d Compare February 5, 2015 22:17
@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

I would only make two changes:

  • Mark the closure as Sync instead of Send
  • Use FnMut() -> u32

I think Send means that ownership of the type can be transferred between threads. Because we never destroy the closure on the timer thread, it's not necessary.

Sync means that a type can be used across different threads without the risk of data races. All immutable types in Rust without side effects should implement Sync anyway (it good practice), so I don't think we're imposing any unnecessary restrictions.

Which leads to FnMut: if it's bound with Sync, then by definition we won't get data races.

I've said contradicting things in #312, but I didn't fully understand Send vs Sync back then ("back then" being 5 days ago).

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

I didn't catch it before, but why is there a generic type F on the entire Timer?

@drbawb
Copy link
Contributor Author

drbawb commented Feb 5, 2015

I was going off of the Concurrency chapter of the Rust guide which has the following to say about Sync (emphasis mine):

The second of these two trait is called Sync. When a type T implements Sync, it indicates to the compiler that something of this type has no possibility of introducing memory unsafety when used from multiple threads concurrently.

The closure has a single owner (Timer) and is only being accessed from a single thread (SDL).


As for FnMut though, I do agree that this now makes sense. The Send (or Sync) bound require you to use a closure that owns its environment (move || {}). Since the closure has a single well-defined owner now: I agree that it should be allowed to mutate the environment it owns.

EDIT: Note that std::thread::Thread#spawn() only implies the Send bound, and I think this use-case is somewhat similar to that.

@drbawb
Copy link
Contributor Author

drbawb commented Feb 5, 2015

I didn't catch it before, but why is there a generic type F on the entire Timer?

The timer owns the callback (Option<Box<F>>), where F is the type of the closure.
This makes sure the closure's environment is alive on the heap for the duration of the timer, and it is freed when the Timer is dropped.

EDIT: The Box is necessary for liveness, the Option is used to help in the impl. of into_inner() w/o the Timer erroneously dropping the closure.

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

Consider this scenario: The closure captures an Arc<T> reference from its environment.

Even though the closure itself is not being called simultaneously across multiple threads, the values within could be. In the case of Arc<T>, shared references to the same value could be had on the timer thread, as well as other threads.

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

I think the F generic type is superfluous; you should be able to have Box<FnMut() -> u32 + Sync> and have it work just fine. The closure environment should be able to move with the Box: it's how boxed closures worked before the unboxed closure reform.

A generic type for Timer also means we can't create a polymorphic list of timers:

let timers: Vec<Timer>;

@drbawb
Copy link
Contributor Author

drbawb commented Feb 5, 2015

I think the F generic type is superfluous; you should be able to have Box<FnMut() -> u32 + Sync> and have it work just fine. The closure environment should be able to move with the Box: it's how boxed closures worked before the unboxed closure reform.

This doesn't work because Box<FnMut() -> u32 + Sync> is a trait object, it is two pointers onto the heap not one pointer onto the heap.

There is no way to pass two pointers to SDL2's timer callback. You also can't pass a pointer to the trait-object because the trait-object moves wherever the Timer does. (Which means you need to go back to having a Guard that prevents the Timer from moving while active; or you need a way to fix-up the SDL callback every time the Timer moves.)


Consider this scenario: The closure captures an Arc reference from its environment.

The Send bound is already sufficient enough to prevent that. Arc<T>is only Send when T itself is Send+Sync:

impl<T> Send for Arc<T> where T: Send, T: Sync {}

You can't put non-Sync things in an Arc and still Send it across a thread, you need to wrap it w/ a lock to do that (RwLock or Mutex) which impl Sync on behalf of T because they manage the interior mutability in a thread-safe way (with mutexes.)

The entire environment of the closure doesn't need to be Sync, as the closure owns the only things it can mutate, so there are not two threads competing for access... and you can't Send things w/ interior mutability that aren't explicitly marked Sync or wrapped w/ a type that implements Sync.

It is my understanding that those types which provide safe concurrent access are supposed to be the primary users of the Sync trait.

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

The Arc<T> example was more to illustrate that multi-thread access could still happen with Timer. Send by itself will not prevent data-race-y types from being used from other threads, it just so happens that Arc<T> also implements Send.

It's possible to have types that are !Send and Sync (not safe to move or destroy on another thread, but safe to use lock-free from other threads). If we bound the closure with Send, we're preventing such a type from being used.

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

On second thought, I might be wrong about Send. My head is spinning.

Why are threads so confusing?

@nukep
Copy link
Contributor

nukep commented Feb 5, 2015

Usually, closures capture environment variables by reference. A &T is Send as long as T is Sync IIRC. That's where my confusion came from.

But fundamentally, even with a move, the closure variables are still used by reference (param in the C bridge is a pointer, after all). The variables are still wholly owned by the Timer, so we shoudn't enforce Send.

@nukep
Copy link
Contributor

nukep commented Feb 6, 2015

I'm unsure, but this might solve the problem with transmuting the Box: http://doc.rust-lang.org/std/raw/struct.Closure.html

code and env could be used for the function pointer and data pointer respectively.

EDIT: Nope: the box is just a plain old trait object. std::raw::Closure is slated to be removed soon. Arrg.

@drbawb
Copy link
Contributor Author

drbawb commented Feb 6, 2015

You're right in saying the closure's environment never moves. It is always in the same spot once Timer::new() is called.

However I don't think dropping the Send bound is correct: dropping the Send bound means you no longer have to move the closure, which means the callback is referencing a stack which may have already been destroyed since the Timer is allowed to move.

  • Bounding it by the lifetime of the Timer is not correct. The timer could be returned from a function: invalidating the closure's references, even though the Timer is still alive.
  • Bounding it by 'static seems unnecessarily restrictive.

A &T is Send as long as T is Sync IIRC. That's where my confusion came from.

Send is simply defined as:

Types able to be transferred across thread boundaries.

I believe you might be thinking of this line from the Sync docs instead?

The precise definition is: a type T is Sync if &T is thread-safe. In other words, there is no possibility of data races when passing &T references between threads.


I'm unsure, but this might solve the problem with transmuting the Box: http://doc.rust-lang.org/std/raw/struct.Closure.html

code and env could be used for the function pointer and data pointer respectively.

It doesn't unfortunately. (I already tried it.)

You can't pass the whole thing to SDL by value (it doesn't fit in 1-word.) If you pass it by reference: that reference is invalidated whenever the Timer (which owns the closure) moves. It's the exact same problem as passing the trait object unfortunately.

In order to use closures w/o the generic bound you would need to store them in statically allocated storage, which adds a bunch of allocation & book-keeping. Then the Timer wouldn't need to own the closure, it would simply have some sort of reference to it when it needs to be freed or reclaimed.

@drbawb
Copy link
Contributor Author

drbawb commented Feb 6, 2015

Again you can make that sort of API work, but it's cumbersome to use. The closure either has to outlive the timer, which greatly complicates how you can store the Timer anyways, or you need RAII guards so the timer stops when it is moved.

@drbawb
Copy link
Contributor Author

drbawb commented Feb 6, 2015

Here is a concrete example of why you need a guard for that sort of API, and also how cumbersome it is to use.

  1. The caller has to cast the closure to a trait-object, and then provide a reference to that. Very ugly and unwieldy syntax.
  2. Even though the reference to the closure is supposed to live as long as the Timer (&'a &Fn() -> u32) this code still compiles and segfaults. AFAICT this is because the reference only has to be valid for the call to ::new() (which is not true.) Hence why you need the Timer to own the closure and provide a RAII guard instead.

@drbawb drbawb force-pushed the feature/rust-100-timers branch from 819cb61 to 6ed205a Compare February 6, 2015 01:31
The trait object itself is stored on the heap, a 1-word pointer to this
trait object is then passed to SDL and used to fire the callback.

Implies the Sync+'static bound to guarantee that the closures
envrionment can be invoked safely from another thread.
@drbawb drbawb force-pushed the feature/rust-100-timers branch from 6ed205a to 10c2a82 Compare February 6, 2015 01:35
@drbawb
Copy link
Contributor Author

drbawb commented Feb 6, 2015

How about this? It gets rid of the generic parameter F at the cost of two layers of indirection.

What we have here is Pointer -> TraitObject { data: Pointer, vtable: Pointer }

  • Indirection #1: the function itself is a trait-object which implies dynamic dispatch
  • Indirection #2: Then in order to pass it to SDL we have to put that trait-object on the heap.

That being said: it passes my existing tests ... and there should be no reason you can't store it in a heterogeneous collection of timers.


I'm using Sync+'static as the bound currently. I definitely want to keep the 'static bound as it forces them to move the closure. It's still not clear to me that Sync is the correct bound, but I also can't think of a single use case where the bound gets in my way...

Also if we use a non-'static bound I'm not sure how to write into_inner() because it would have to return something like Box<FnMut() -> u32+'a> for Timer<'a> where the Timer<'a> is about to be dropped.

Sync+'a seems to work just fine and doesn't require the move keyword.

@nukep
Copy link
Contributor

nukep commented Feb 6, 2015

Sorry for being so annoying, BTW. To be fair, we're dealing with both concurrent programming and C interop. 😵

It's annoying that double indirection is necessary to keep the generic F parameter away (it's triple indirection if we include the function call!). You'd think that Rust would have a way to extract the call_mut pointer from the trait object's vtable, but guess not...

@drbawb
Copy link
Contributor Author

drbawb commented Feb 6, 2015

Sadly I can't think of any other way to do it, we're very limited by the SDL API here. We have to cram our callback into one pointer. (Plus that pointer needs to live long enough if we want memory safety!)

I am liking the current API though. It requires the Timer to have a lifetime parameter, but what's cool is you can just store a Timer<'static> so long as you move|| {} the closure. So it doesn't force you to add a lifetime parameter to the owner of the Timer.

(... and if you try to store a Timer<'static> without the closure owning its environment you get a compile time error.)


As for the Sync bound I think I'm starting to come around to the idea that it's required if we want to allow FnMut()s. Since you could consider their environment to be a mutable interior, despite being stored in an immutable slot (the Timer.)

Regardless we can always relax the trait bounds w/o breaking existing code, tightening the bounds is much more difficult.

@AngryLawyer
Copy link
Member

You're an absolute star

AngryLawyer added a commit that referenced this pull request Feb 6, 2015
Upgraded `Timer` API and fix `Timer` tests...
@AngryLawyer AngryLawyer merged commit efa23dc into Rust-SDL2:master Feb 6, 2015
@drbawb drbawb mentioned this pull request Feb 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants