diff --git a/src/type-layout.md b/src/type-layout.md
index 718bb71d5..89b691575 100644
--- a/src/type-layout.md
+++ b/src/type-layout.md
@@ -149,7 +149,8 @@ layout such as reinterpreting values as a different type.
 Because of this dual purpose, it is possible to create types that are not useful
 for interfacing with the C programming language.
 
-This representation can be applied to structs, unions, and enums.
+This representation can be applied to structs, unions, and enums. The exception
+is [zero-variant enumerations], for which the `C` representation is an error.
 
 #### \#[repr(C)] Structs
 
@@ -222,9 +223,9 @@ assert_eq!(std::mem::size_of::(), 8); // Size of 6 from b,
 assert_eq!(std::mem::align_of::(), 4); // From a
 ```
 
-#### \#[repr(C)] Enums
+#### \#[repr(C)] Field-less Enums
 
-For [C-like enumerations], the `C` representation has the size and alignment of
+For [field-less enums], the `C` representation has the size and alignment of
 the default `enum` size and alignment for the target platform's C ABI.
 
 > Note: The enum representation in C is implementation defined, so this is
@@ -232,38 +233,168 @@ the default `enum` size and alignment for the target platform's C ABI.
 > of interest is compiled with certain flags.
 
 > Warning: There are crucial differences between an `enum` in the C language and
-> Rust's C-like enumerations with this representation. An `enum` in C is
+> Rust's field-less enumerations with this representation. An `enum` in C is
 > mostly a `typedef` plus some named constants; in other words, an object of an
 > `enum` type can hold any integer value. For example, this is often used for
-> bitflags in `C`. In contrast, Rust’s C-like enumerations can only legally hold
-> the discrimnant values, everything else is undefined behaviour. Therefore,
-> using a C-like enumeration in FFI to model a C `enum` is often wrong.
+> bitflags in `C`. In contrast, Rust’s field-less enums can only legally hold
+> the discriminant values; everything else is [undefined behavior]. Therefore,
+> using a field-less enum in FFI to model a C `enum` is often wrong.
 
-It is an error for [zero-variant enumerations] to have the `C` representation.
+#### \#[repr(C)] Enums With Fields
 
-For all other enumerations, the layout is unspecified.
+For enums with fields, the `C` representation is defined to be the same as the
+following types. These types don't actually exist, so the names are only here
+to help describe relationships. All of these types have the `C` representation.
 
-Likewise, combining the `C` representation with a primitive representation, the
-layout is unspecified.
+An enum with fields and the `C` representation, the represented enum, has the
+same representation as a struct with two fields, the tagged union. The first
+field of the tagged union is a field-less enum, the discriminant enum. The
+second field of the tagged union is a union, the fields union.
 
-### Primitive representations
+The discriminant enum has one variant for each variant in the represented enum,
+ordered in the same way as in the represented enum.
+
+The fields union consists of fields corresponding to each variant in the
+represented enum. Each field contains the fields from the corresponding variant
+in the order defined in the variant. The valid field in the union is the one
+that corresponds to the same variant that the discriminant enum's value
+corresponds with.
+
+```rust
+// This enum has the same layout as
+#[repr(C)]
+enum RepresentedEnum {
+    A(u32),
+    B(f32, u64),
+    C { x: u32, y: u8 },
+    D,
+}
+
+// this struct.
+#[repr(C)]
+struct TaggedUnion {
+    tag: DiscriminantEnum,
+    payload: FieldsUnion,
+}
+
+// This is the discriminant enum.
+#[repr(C)]
+enum DiscriminantEnum { A, B, C, D }
+
+// This is the fields union.
+#[repr(C)]
+union FieldsUnion {
+    A: FieldsA,
+    B: FieldsB,
+    C: FieldsC,
+    D: FieldsD,
+}
+
+#[repr(C)]
+struct FieldsA(u32);
+
+#[repr(C)]
+struct FieldsB(f32, u64);
+
+#[repr(C)]
+struct FieldsC { x: u32, y: u8 }
+
+#[repr(C)]
+struct FieldsD;
+```
+
+Combining the `C` representation and a
+primitive representation is only defined for enums with fields. The primitive
+representation modifies the `C` representation by changing the representation of
+the discriminant enum to have the representation of the chosen primitive
+representation. So, if you choose the `u8` representation, then the discriminant
+enum would have a size and alignment of 1 byte.
+
+> Note: This representation was designed primarily for interfacing with C code
+> that already exists and matches a common way that Rust-style enums are
+> implemented in C. If you have control over both the Rust and C code, such as
+> using C as FFI glue between Rust and some third language, then you should use a
+> [primitive representation](#primitive-representation-of-enums-with-fields)
+> instead.
+
+### Primitive Representations
 
 The *primitive representations* are the representations with the same names as
 the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `usize`, `i8`,
 `i16`, `i32`, `i64`, and `isize`.
 
-Primitive representations can only be applied to enumerations.
+Primitive representations can only be applied to enumerations, and behave
+differently depending on whether the enum has fields. It is an error
+for [zero-variant enumerations] to have a primitive representation.
+
+Combining two primitive representations together is unspecified.
+
+Combining the `C` representation and a primitive representation is described
+[above](#c-primitive-representation).
+
+#### Primitive Representation of Field-less Enums
+
+For [field-less enums], primitive representations set the size and alignment to
+those of the primitive type of the same name. For example, a field-less enum with
+a `u8` representation can only have discriminants between 0 and 255 inclusive.
 
-For [C-like enumerations], they set the size and alignment to be the same as the
-primitive type of the same name. For example, a C-like enumeration with a `u8`
-representation can only have discriminants between 0 and 255 inclusive.
+#### Primitive Representation of Enums With Fields
 
-It is an error for [zero-variant enumerations] to have a primitive
-representation.
+For enums with fields, the enum will have the same type layout as a union with
+the `C` representation whose fields are structs with the `C` representation,
+one for each variant in the enum. The first field of each struct is a
+field-less enum with the same primitive representation: the original enum
+with all fields in its variants removed. The remaining fields are the fields
+of the corresponding variant, in the order defined in the original
+enumeration.
 
-For all other enumerations, the layout is unspecified.
+Because unions with non-`Copy` fields aren't allowed, this representation can
+only be used if every field is [`Copy`].
 
-Likewise, combining two primitive representations together is unspecified.
+> Note: This often differs from what is done in C and C++. Projects in
+> those languages often use a tuple of `(enum, payload)`. To have your enum
+> represented like that, use the `C` representation.
+
+```rust
+// This custom enum
+#[repr(u8)]
+enum MyEnum {
+    A(u32),
+    B(f32, u64),
+    C { x: u32, y: u8 },
+    D,
+}
+
+// has the same type layout as this union
+#[repr(C)]
+#[derive(Clone, Copy)]
+union MyEnumRepr {
+    A: MyEnumVariantA,
+    B: MyEnumVariantB,
+    C: MyEnumVariantC,
+    D: MyEnumVariantD,
+}
+
+#[repr(u8)]
+#[derive(Clone, Copy)]
+enum MyEnumDiscriminant { A, B, C, D }
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct MyEnumVariantA(MyEnumDiscriminant, u32);
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct MyEnumVariantB(MyEnumDiscriminant, f32, u64);
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct MyEnumVariantC { tag: MyEnumDiscriminant, x: u32, y: u8 }
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct MyEnumVariantD(MyEnumDiscriminant);
+```
 
 ### The `align` Representation
 
@@ -288,7 +419,7 @@ padding bytes and forcing the alignment of the type to `1`.
 The `align` and `packed` representations cannot be applied on the same type and
 a `packed` type cannot transitively contain another `align`ed type.
 
-> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and it is
+> Warning: Dereferencing an unaligned pointer is [undefined behavior] and it is
 > possible to [safely create unaligned pointers to `packed` fields][27060].
 > Like all ways to create undefined behavior in safe Rust, this is a bug.
 
@@ -298,7 +429,9 @@ a `packed` type cannot transitively contain another `align`ed type.
 [`size_of`]: ../std/mem/fn.size_of.html
 [`Sized`]: ../std/marker/trait.Sized.html
 [dynamically sized types]: dynamically-sized-types.html
-[C-like enumerations]: items/enumerations.html#custom-discriminant-values-for-field-less-enumerations
+[field-less enums]: items/enumerations.html#custom-discriminant-values-for-field-less-enumerations
 [zero-variant enumerations]: items/enumerations.html#zero-variant-enums
 [undefined behavior]: behavior-considered-undefined.html
 [27060]: https://github.com/rust-lang/rust/issues/27060
+[primitive representation]: #primitive-representations
+[`Copy`]: special-types-and-traits.html#copy
\ No newline at end of file
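
Review note: the two layout equivalences this patch documents can be sanity-checked with a small standalone program. The sketch below is not part of the patch. It reuses the type names from the patch's examples, but the lowercase union field names, the extra `Clone, Copy` derives, and the helpers `c_repr_layouts_match`, `u8_repr_layouts_match`, and `tag_byte` are assumptions added for illustration. It compares only size and alignment, which is a necessary condition for identical layout rather than a proof of it.

```rust
use std::mem::{align_of, size_of};

// `#[repr(C)]` enum with fields, as in the patch's first example.
#[repr(C)]
enum RepresentedEnum { A(u32), B(f32, u64), C { x: u32, y: u8 }, D }

// Hand-written equivalent: a tag followed by a union of per-variant structs.
#[repr(C)]
struct TaggedUnion { tag: DiscriminantEnum, payload: FieldsUnion }

#[repr(C)]
#[derive(Clone, Copy)]
enum DiscriminantEnum { A, B, C, D }

#[repr(C)]
#[derive(Clone, Copy)]
union FieldsUnion { a: FieldsA, b: FieldsB, c: FieldsC, d: FieldsD }

#[repr(C)]
#[derive(Clone, Copy)]
struct FieldsA(u32);
#[repr(C)]
#[derive(Clone, Copy)]
struct FieldsB(f32, u64);
#[repr(C)]
#[derive(Clone, Copy)]
struct FieldsC { x: u32, y: u8 }
#[repr(C)]
#[derive(Clone, Copy)]
struct FieldsD;

// `#[repr(u8)]` enum with fields, as in the patch's second example.
#[repr(u8)]
enum MyEnum { A(u32), B(f32, u64), C { x: u32, y: u8 }, D }

// Hand-written equivalent: a union of structs that each start with the tag.
#[repr(C)]
#[derive(Clone, Copy)]
union MyEnumRepr { a: MyEnumVariantA, b: MyEnumVariantB, c: MyEnumVariantC, d: MyEnumVariantD }

#[repr(u8)]
#[derive(Clone, Copy)]
enum MyEnumDiscriminant { A, B, C, D }

#[repr(C)]
#[derive(Clone, Copy)]
struct MyEnumVariantA(MyEnumDiscriminant, u32);
#[repr(C)]
#[derive(Clone, Copy)]
struct MyEnumVariantB(MyEnumDiscriminant, f32, u64);
#[repr(C)]
#[derive(Clone, Copy)]
struct MyEnumVariantC { tag: MyEnumDiscriminant, x: u32, y: u8 }
#[repr(C)]
#[derive(Clone, Copy)]
struct MyEnumVariantD(MyEnumDiscriminant);

/// Hypothetical helper: size and alignment agree for the `C` representation.
fn c_repr_layouts_match() -> bool {
    size_of::<RepresentedEnum>() == size_of::<TaggedUnion>()
        && align_of::<RepresentedEnum>() == align_of::<TaggedUnion>()
}

/// Hypothetical helper: size and alignment agree for the `u8` representation.
fn u8_repr_layouts_match() -> bool {
    size_of::<MyEnum>() == size_of::<MyEnumRepr>()
        && align_of::<MyEnum>() == align_of::<MyEnumRepr>()
}

/// Hypothetical helper: reads the leading tag byte of a `#[repr(u8)]` enum.
fn tag_byte(e: &MyEnum) -> u8 {
    // SAFETY: `MyEnum` is `#[repr(u8)]`, so it starts with its `u8` tag.
    unsafe { *(e as *const MyEnum).cast::<u8>() }
}

fn main() {
    assert!(c_repr_layouts_match());
    assert!(u8_repr_layouts_match());
    assert_eq!(tag_byte(&MyEnum::A(7)), 0); // first variant
    assert_eq!(tag_byte(&MyEnum::D), 3); // fourth variant
    println!("repr(C): {} bytes, repr(u8): {} bytes",
             size_of::<RepresentedEnum>(), size_of::<MyEnum>());
}
```

The program prints target-dependent sizes, so no fixed numbers are asserted; only the equalities and the tag values are checked. The compiler may warn about unused fields and variants, which is expected for a layout-only sketch.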