-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add {into,from}_raw
to Rc and Arc
#37192
Conversation
r? @brson (rust_highfive has picked a reviewer for you, use r? to override) |
/// let x_ptr = Arc::into_raw(x); | ||
/// assert_eq!(unsafe { *x_ptr }, 10); | ||
/// ``` | ||
#[unstable(feature = "rc_raw", reason = "newly added", issue = "1")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason
field is no longer required, fyi.
This seems reasonable to me I think - cc @rust-lang/libs. |
193edbd
to
f9a1b95
Compare
// `value` field from the pointer. | ||
|
||
// Create a fake `ArcInner` to work with valid pointers only. | ||
let invalid_arc_inner: ArcInner<T> = mem::zeroed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps something like an offset_of!
macro could be used instead of an arbitrary value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that's a common pattern in C++, but isn't that UB? Just like ptr.offset()
is UB if going more than one past end of allocation? That's what the comment about valid pointers was meant to express. I could still wrap it in a macro though, but I'm not sure where it'd go if I wanted to share it between arc.rs
and rc.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a safe version of offset_of!
which doesn't rely on UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Amanieu thanks for the link! That's pretty much what I'm doing here (bar the clever check against Deref
), do you think I should add your macro and use it for clarity and DRY though?
BTW: isn't this at risk of overflowing though: val.$field as *const _ as isize
. I do the casts to usize
, compute the difference and then cast to isize
, since I thought the absolute address could well be greater than isize::MAX
. Or is there some kind of guarantee that's not the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cristicbz nice catch, you're probably right about the overflow issue, I'll fix that.
The reason I used isize
is that I then pass the resulting value to .offset
to recover the address of the parent object. Using .offset
is better than performing arithmetic on usize
since it uses an LLVM GEP instruction, which helps with alias analysis. See the container_of!
macro later in the same file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I could cast to isize
and use ptr.offset(-offset)
, good point! My only other worry with your code (and the reason I used mem::zeroed
) is that, technically reads of uninitialized memory are UB as well. There's a lang lawyer question about whether &undef.field
is a read of undef
, but i wouldn't risk it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mem::zeroed
isn't much better than mem::uninitialized
since many types (eg &T
, Box<T>
) are not allowed to have a value of zero. In any case this is irrelevant since the whole macro is optimized down to a constant by the compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mem::zeroed isn't much better than mem::uninitialized since many types (eg &T, Box) are not allowed to have a value of zero.
My (limited) understanding is that an invalid value (such as a null reference which would be invalid to dereference) is different in kind to a value obtained via mem::uninitialized
which is undef
as far as LLVM concerned. But this is over my head.
In any case this is irrelevant since the whole macro is optimized down to a constant by the compiler.
Agreed.
Seems quite nifty to me! I like how it mirrors Let's see if this works... @rfcbot fcp merge |
Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Thumbs up from me - I already to this for Arc in RMBA and if this was in stdlib I would be less fragile w r t changes to Arc's internal layout. |
f9a1b95
to
b71a376
Compare
Thanks for the feedback @Amanieu & @alexcrichton! I moved the I did keep my casts to Didn't really know where to put these macros so I dropped them unexported in a |
pub unsafe fn from_raw(ptr: *const T) -> Self { | ||
// To find the corresponding pointer to the `RcBox` we need to subtract the offset of the | ||
// `value` field from the pointer. | ||
Rc { ptr: Shared::new(ptr.offset(-offset_of_unsafe!(RcBox<T>, value)) as *mut _) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to cast the pointer to u8
before applying the offset. See my container_of!
macro. You might want to add that macro to macros.rs
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh yes another reason i was doing usize arithmetic before!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thanks for the heads up btw! :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Changed the name to unoffset_field_ptr
since outside of intrusive containers container_of
didn't seem that semantic for the operation.
b71a376
to
d2b4afc
Compare
This seems... overly complicated to me, but it's just an internal detail so seems fine. |
@alexcrichton Yeah it did get a bit out of hand... Could simplify down to an |
Perhaps yeah, I find |
@alexcrichton A C version of the macro that dereferences a NULL pointer is still UB in C. This is why gcc defines |
d2b4afc
to
72df9fc
Compare
I pushed a simplified version which defines only the |
// Private macro to get the offset of a struct field in bytes from the address of the struct. | ||
// | ||
// This macro is identical to `offset_of!` but doesn't give a warning about unnecessary unsafe | ||
// blocks when invoked from unsafe code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably fix this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, didn't do a last git add
to that file, thanks!
72df9fc
to
c2d5ea4
Compare
@rfcbot fcp concern Footgun I'm a little bit worried that this API is a footgun: the types and names strongly suggest that the only relevant raw representation is the At the very least, it's a violation of the usual convention for |
I agree with the change to |
I agree with @aturon and @CryZe that types compatible with When I first thought of this feature, in my mind the
|
@SimonSapin If I understand your suggestion correctly, exposing @aturon I can see the argument for the
All this said, I am sympathetic to the footgun argument. Just because we tell people not to pass an I'd suggest |
An
You could imagine that, yes. However, that would have a different representation - a representation is part of the concrete implementation of a type, not the abstract behavior (and you're arguing that the latter being the same makes the former irrelevant). |
Yes. I was thinking you might as well just stash an Rc in safe code and hold ownership on behalf of the C code, but I understand wanting to give a share of the ownership to C.
If you have access to Rust. You can't use that to let C manage reference counts, which is more what I was thinking. |
That would require a |
Is |
Or a pointer to the refcount. My objections here are pretty squishy, that's its slightly different than other things, in naming and in the magicy implementation. But I understand the use case is real and we should offer some solution. I'd like an answer on |
That may work for increasing the ref count (though I would rather not rely on random C code everywhere to correctly reproduce the atomic instruction necessary), but what about decreasing it? How do you invoke the destructor? (I just noticed that's also a problem for the " |
At least with the use case I had in mio, it was perf sensitive, so stashing the pointer in safe code would have required extra storage requirements where passing ownership to C didn't require any extra data structures. In that sense I do think this'd be possible for many cases, but perhaps not all.
Ah yeah that's true, but I'm not sure that C would ever expect to be able to do this? At least, I can't really think of a use case for that.
Hm yeah that's a good point, I might personally expect FWIW I slightly prefer the name |
24d9388
to
2669027
Compare
I changed this to My reasoning for
So do I, but I don't feel to strongly either---it was @aturon who requested it. |
I'm willing to be overruled on the name. Let's just get this thing landed. :) |
@brson, do you feel comfortable checking your box with an updated name? |
2669027
to
651cf58
Compare
{into,from}_inner_raw
to Rc and Arc{into,from}_raw
to Rc and Arc
I changed this back to |
@aturon @alexcrichton @brson ping? |
@cristicbz just waiting on @brson to check his box |
@bors r+ |
📌 Commit 651cf58 has been approved by |
Amazing, thanks for putting up with me and getting this merged! |
(also given the relative "controversy", I'd suggest relnotes may be appropriate to get another batch of input when TWiR comes around) |
Add `{into,from}_raw` to Rc and Arc These methods convert to and from a `*const T` for `Rc` and `Arc` similar to the way they work on `Box`. The only slight complication is that `from_raw` needs to offset the pointer back to find the beginning of the `RcBox`/`ArcInner`. I felt this is a fairly small addition, filling in a gap (when compared to `Box`) so it wouldn't need an RFC. The motivation is primarily for FFI. (I'll create an issue and update a PR with the issue number if reviewers agree with the change in principle **Edit: done #37197**) ~~Edit: This was initially `{into,from}_raw` but concerns were raised about the possible footgun if mixed with the methods of the same name of `Box`.~~ Edit: This was went from `{into,from}_raw` to `{into,from}_inner_raw` then back to `{into,from}_raw` during review.
No, thank you for sticking with it despite the lengthy process! Hope to see lots more to come. |
I find @cristicbz's reasoning for initially going with |
($container:path, $field:ident) => {{ | ||
// Make sure the field actually exists. This line ensures that a compile-time error is | ||
// generated if $field is accessed through a Deref impl. | ||
let $container { $field : _, .. }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could've used ref $field
to avoid using field access below (came here from the tracking issue which I found from the docs).
These methods convert to and from a
*const T
forRc
andArc
similar to the way they work onBox
. The only slight complication is thatfrom_raw
needs to offset the pointer back to find the beginning of theRcBox
/ArcInner
.I felt this is a fairly small addition, filling in a gap (when compared to
Box
) so it wouldn't need an RFC. The motivation is primarily for FFI.(I'll create an issue and update a PR with the issue number if reviewers agree with the change in principle Edit: done #37197)
Edit: This was initially{into,from}_raw
but concerns were raised about the possible footgun if mixed with the methods of the same name ofBox
.Edit: This was went from
{into,from}_raw
to{into,from}_inner_raw
then back to{into,from}_raw
during review.