From ad4342b0e2c9efe853b4d412f24f50fd84f95618 Mon Sep 17 00:00:00 2001 From: Havvy Date: Sat, 17 Feb 2018 12:18:59 -0800 Subject: [PATCH 1/3] Document new type representations --- src/type-layout.md | 119 +++++++++++++++++++++++++++++++++++++++------ 1 file changed, 105 insertions(+), 14 deletions(-) diff --git a/src/type-layout.md b/src/type-layout.md index 718bb71d5..acd3d5af0 100644 --- a/src/type-layout.md +++ b/src/type-layout.md @@ -224,7 +224,7 @@ assert_eq!(std::mem::align_of::(), 4); // From a #### \#[repr(C)] Enums -For [C-like enumerations], the `C` representation has the size and alignment of +For [field-less enums], the `C` representation has the size and alignment of the default `enum` size and alignment for the target platform's C ABI. > Note: The enum representation in C is implementation defined, so this is @@ -232,19 +232,60 @@ the default `enum` size and alignment for the target platform's C ABI. > of interest is compiled with certain flags. > Warning: There are crucial differences between an `enum` in the C language and -> Rust's C-like enumerations with this representation. An `enum` in C is +> Rust's field-less enumerations with this representation. An `enum` in C is > mostly a `typedef` plus some named constants; in other words, an object of an > `enum` type can hold any integer value. For example, this is often used for -> bitflags in `C`. In contrast, Rust’s C-like enumerations can only legally hold +> bitflags in `C`. In contrast, Rust’s field-less enums can only legally hold > the discrimnant values, everything else is undefined behaviour. Therefore, -> using a C-like enumeration in FFI to model a C `enum` is often wrong. +> using a field-less enum in FFI to model a C `enum` is often wrong. -It is an error for [zero-variant enumerations] to have the `C` representation. 
+For enums with fields, the `C` representation has the same representation as +it would with the [primitive representation] with the field-less enum in its +description having the `C` representation. -For all other enumerations, the layout is unspecified. +```rust +// This Enum has the same layout as +#[repr(C)] +enum MyEnum { + A(u32), + B(f32, u64), + C { x: u32, y: u8 }, + D, +} -Likewise, combining the `C` representation with a primitive representation, the -layout is unspecified. +// this struct. +#[repr(C)] +struct MyEnumRepr { + tag: MyEnumTag, + payload: MyEnumPayload, +} + +#[repr(C)] +enum MyEnumTag { A, B, C, D } + +#[repr(C)] +union MyEnumPayload { + A: u32, + B: MyEnumPayloadB, + C: MyEnumPayloadC, + D: (), +} + +#[repr(C)] +struct MyEnumPayloadB(f32, u64); + +#[repr(C)] +struct MyEnumPayloadC { x: u32, y: u8 } +``` + +It is an error for [zero-variant enumerations] to have the `C` representation. + +Combining the `C` representation and a +primitive representation is only defined for enums with fields and it changes +the representation of the tag, e.g. `MyEnumTag` in the previous example, to have +the representation of the chosen primitive representation. So, if you chose the +`u8` representation, then the tag would have a size and alignment of 1 byte. + ### Primitive representations @@ -254,16 +295,65 @@ the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `usize`, `i8`, Primitive representations can only be applied to enumerations. -For [C-like enumerations], they set the size and alignment to be the same as the -primitive type of the same name. For example, a C-like enumeration with a `u8` -representation can only have discriminants between 0 and 255 inclusive. +For [field-less enums], they set the size and alignment to be the same as +the primitive type of the same name. For example, a field-less enum with +a `u8` representation can only have discriminants between 0 and 255 inclusive. 
+ +For enums with fields, the enum will have the same type layout a union with the +`C` representation that's fields consist of structs with the `C` representation +corresponding to each variant in the enum. The first field in each struct is +the same field-less enum with the same primitive representation that is +the enum with all fields in its variants removed and the rest of the fields +consisting of the fields of the corresponding variant in the order defined in +original enumeration. + +> Note: This is commonly different than what is done in C and C++. Projects in +> those languages often use a tuple of `(enum, payload)`. For making your enum +> represented like that, see [the tagged union representation] below. + +```rust +// This custom enum +#[repr(u8)] +enum MyEnum { + A(u32), + B(f32, u64), + C { x: u32, y: u8 }, + D, +} + +// has the same type layout as this union +#[repr(C)] +union MyEnumRepr { + A: MyEnumVariantA, + B: MyEnumVariantB, + C: MyEnumVariantC, + D: MyEnumVariantD, +} + +#[repr(u8)] +enum MyEnumDiscriminant { A, B, C, D } + +#[repr(C)] +struct MyEnumVariantA(MyEnumDiscriminant, u32); + +#[repr(C)] +struct MyEnumVariantB(MyEnumDiscriminant, f32, u64); + +#[repr(C)] +struct MyEnumVariantC { tag: MyEnumDiscriminant, x: u32, y: u8 } + +#[repr(C)] +struct MyEnumVariantD(MyEnumDiscriminant); +``` It is an error for [zero-variant enumerations] to have a primitive representation. -For all other enumerations, the layout is unspecified. +Combining two primitive representations together is unspecified. + +Combining the `C` representation and a primitive representation is described +[above][#c-primitive-representation]. -Likewise, combining two primitive representations together is unspecified. ### The `align` Representation @@ -298,7 +388,8 @@ a `packed` type cannot transitively contain another `align`ed type. 
[`size_of`]: ../std/mem/fn.size_of.html [`Sized`]: ../std/marker/trait.Sized.html [dynamically sized types]: dynamically-sized-types.html -[C-like enumerations]: items/enumerations.html#custom-discriminant-values-for-field-less-enumerations +[field-less enums]: items/enumerations.html#custom-discriminant-values-for-field-less-enumerations [zero-variant enumerations]: items/enumerations.html#zero-variant-enums [undefined behavior]: behavior-considered-undefined.html [27060]: https://github.com/rust-lang/rust/issues/27060 +[primitive representation]: #primitive-representations \ No newline at end of file From e8aabe8d340390042d33bf28d6798213656ca991 Mon Sep 17 00:00:00 2001 From: Havvy Date: Sun, 18 Feb 2018 01:23:46 -0800 Subject: [PATCH 2/3] Organization of reprs --- src/type-layout.md | 83 +++++++++++++++++++++++++++++++--------------- 1 file changed, 56 insertions(+), 27 deletions(-) diff --git a/src/type-layout.md b/src/type-layout.md index acd3d5af0..5f7d817f9 100644 --- a/src/type-layout.md +++ b/src/type-layout.md @@ -149,7 +149,8 @@ layout such as reinterpreting values as a different type. Because of this dual purpose, it is possible to create types that are not useful for interfacing with the C programming language. -This representation can be applied to structs, unions, and enums. +This representation can be applied to structs, unions, and enums. The exception +is [zero-variant enumerations] for which the `C` representation is an error. #### \#[repr(C)] Structs @@ -222,7 +223,7 @@ assert_eq!(std::mem::size_of::(), 8); // Size of 6 from b, assert_eq!(std::mem::align_of::(), 4); // From a ``` -#### \#[repr(C)] Enums +#### \#[repr(C)] Field-less Enums For [field-less enums], the `C` representation has the size and alignment of the default `enum` size and alignment for the target platform's C ABI. @@ -236,12 +237,21 @@ the default `enum` size and alignment for the target platform's C ABI. 
> mostly a `typedef` plus some named constants; in other words, an object of an > `enum` type can hold any integer value. For example, this is often used for > bitflags in `C`. In contrast, Rust’s field-less enums can only legally hold -> the discrimnant values, everything else is undefined behaviour. Therefore, +> the discriminant values, everything else is [undefined behavior]. Therefore, > using a field-less enum in FFI to model a C `enum` is often wrong. -For enums with fields, the `C` representation has the same representation as -it would with the [primitive representation] with the field-less enum in its -description having the `C` representation. +#### \#[repr(C)] Enums With Fields + +For enums with fields, the `C` representation is a struct with representation +`C` of two fields where the first field is a field-less enum with the `C` +representation that has one variant for each variant in the enum with fields +and the second field a union with the `C` representation that's fields consist +of structs with the `C` representation corresponding to each variant in the +enum. Each struct consists of the fields from the corresponding variant in the +order defined in the enum with fields. + +Because unions with non-copy fields aren't allowed, this representation can only +be used if every field is also [`Copy`]. ```rust // This Enum has the same layout as @@ -272,33 +282,51 @@ union MyEnumPayload { } #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumPayloadB(f32, u64); #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumPayloadC { x: u32, y: u8 } ``` -It is an error for [zero-variant enumerations] to have the `C` representation. - Combining the `C` representation and a -primitive representation is only defined for enums with fields and it changes -the representation of the tag, e.g. `MyEnumTag` in the previous example, to have -the representation of the chosen primitive representation. 
So, if you chose the -`u8` representation, then the tag would have a size and alignment of 1 byte. - +primitive representation is only defined for enums with fields. The primitive +representation modifies the `C` representation by changing the representation of +the tag, e.g. `MyEnumTag` in the previous example, to have the representation of +the chosen primitive representation. So, if you chose the `u8` representation, +then the tag would have a size and alignment of 1 byte. -### Primitive representations +> Note: This representation was designed for primarily interfacing with C code +> that already exists matching a common way Rust's enums are implemented in +> C. If you have control over both the Rust and C code, such as using C as FFI +> glue between Rust and some third language, then you should use a +> [primitive representation](#primitive-representation-of-enums-with-fields) +> instead. + +### Primitive Representations The *primitive representations* are the representations with the same names as the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `usize`, `i8`, `i16`, `i32`, `i64`, and `isize`. -Primitive representations can only be applied to enumerations. +Primitive representations can only be applied to enumerations, and have +different behavior whether the enum has fields or no fields. It is an error +for [zero-variant enumerations] to have a primitive representation. + +Combining two primitive representations together is unspecified. + +Combining the `C` representation and a primitive representation is described +[above](#c-primitive-representation). + +#### Primitive Fepresentation of Field-less Enums For [field-less enums], they set the size and alignment to be the same as the primitive type of the same name. For example, a field-less enum with a `u8` representation can only have discriminants between 0 and 255 inclusive. 
+#### Primitive Representation of Enums With Fields + For enums with fields, the enum will have the same type layout a union with the `C` representation that's fields consist of structs with the `C` representation corresponding to each variant in the enum. The first field in each struct is @@ -307,9 +335,12 @@ the enum with all fields in its variants removed and the rest of the fields consisting of the fields of the corresponding variant in the order defined in original enumeration. +Because unions with non-copy fields aren't allowed, this representation can only +be used if every field is also [`Copy`]. + > Note: This is commonly different than what is done in C and C++. Projects in > those languages often use a tuple of `(enum, payload)`. For making your enum -> represented like that, see [the tagged union representation] below. +> represented like that, use the `C` representation. ```rust // This custom enum @@ -323,6 +354,7 @@ enum MyEnum { // has the same type layout as this union #[repr(C)] +#[derive(Clone, Copy)] union MyEnumRepr { A: MyEnumVariantA, B: MyEnumVariantB, @@ -331,30 +363,26 @@ union MyEnumRepr { } #[repr(u8)] +#[derive(Clone, Copy)] enum MyEnumDiscriminant { A, B, C, D } #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumVariantA(MyEnumDiscriminant, u32); #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumVariantB(MyEnumDiscriminant, f32, u64); #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumVariantC { tag: MyEnumDiscriminant, x: u32, y: u8 } #[repr(C)] +#[derive(Clone, Copy)] struct MyEnumVariantD(MyEnumDiscriminant); ``` -It is an error for [zero-variant enumerations] to have a primitive -representation. - -Combining two primitive representations together is unspecified. - -Combining the `C` representation and a primitive representation is described -[above][#c-primitive-representation]. 
- - ### The `align` Representation The `align` representation can be used on `struct`s and `union`s to raise the @@ -378,7 +406,7 @@ padding bytes and forcing the alignment of the type to `1`. The `align` and `packed` representations cannot be applied on the same type and a `packed` type cannot transitively contain another `align`ed type. -> Warning: Dereferencing an unaligned pointer is [undefined behaviour] and it is +> Warning: Dereferencing an unaligned pointer is [undefined behavior] and it is > possible to [safely create unaligned pointers to `packed` fields][27060]. > Like all ways to create undefined behavior in safe Rust, this is a bug. @@ -392,4 +420,5 @@ a `packed` type cannot transitively contain another `align`ed type. [zero-variant enumerations]: items/enumerations.html#zero-variant-enums [undefined behavior]: behavior-considered-undefined.html [27060]: https://github.com/rust-lang/rust/issues/27060 -[primitive representation]: #primitive-representations \ No newline at end of file +[primitive representation]: #primitive-representations +[`Copy`]: special-types-and-traits.html#copy \ No newline at end of file From c883ab21a75cf85ff3206a76ed11abcd806fa566 Mon Sep 17 00:00:00 2001 From: Havvy Date: Mon, 19 Feb 2018 13:14:55 -0800 Subject: [PATCH 3/3] WIP Address alercah's concerns --- src/type-layout.md | 67 +++++++++++++++++++++++++++------------------- 1 file changed, 40 insertions(+), 27 deletions(-) diff --git a/src/type-layout.md b/src/type-layout.md index 5f7d817f9..89b691575 100644 --- a/src/type-layout.md +++ b/src/type-layout.md @@ -242,21 +242,28 @@ the default `enum` size and alignment for the target platform's C ABI. 
#### \#[repr(C)] Enums With Fields -For enums with fields, the `C` representation is a struct with representation -`C` of two fields where the first field is a field-less enum with the `C` -representation that has one variant for each variant in the enum with fields -and the second field a union with the `C` representation that's fields consist -of structs with the `C` representation corresponding to each variant in the -enum. Each struct consists of the fields from the corresponding variant in the -order defined in the enum with fields. +For enums with fields, the `C` representation is defined to be the same as the +following types. These types don't actually exist, so the names are only here to +help describe relationships. All of these types have the `C` representation. -Because unions with non-copy fields aren't allowed, this representation can only -be used if every field is also [`Copy`]. +An enum with fields and the `C` representation, the represented enum, has +the same representation as a struct of two fields, the tagged union. The first +field of the tagged union is a field-less enum, the discriminant enum. The +second field of the tagged union is a union, the fields union. + +The discriminant enum has one variant for each variant in the represented enum, +ordered in the same way as in the represented enum. + +The fields union consists of fields corresponding to each variant in the +represented enum. Each field contains the fields from the corresponding variant +in the order defined in the variant. The valid field in the union is the one +that corresponds to the same variant that the discriminant enum's value +corresponds to. ```rust // This Enum has the same layout as #[repr(C)] -enum MyEnum { +enum RepresentedEnum { A(u32), B(f32, u64), C { x: u32, y: u8 }, @@ -265,37 +272,43 @@ enum MyEnum { // this struct. 
#[repr(C)] -struct MyEnumRepr { - tag: MyEnumTag, - payload: MyEnumPayload, +struct TaggedUnion { + tag: DiscriminantEnum, + payload: FieldsUnion, } +// This is the discriminant enum. #[repr(C)] -enum MyEnumTag { A, B, C, D } +enum DiscriminantEnum { A, B, C, D } +// This is the fields union. #[repr(C)] -union MyEnumPayload { - A: u32, - B: MyEnumPayloadB, - C: MyEnumPayloadC, - D: (), +union FieldsUnion { + A: FieldsA, + B: FieldsB, + C: FieldsC, + D: FieldsD, } #[repr(C)] -#[derive(Clone, Copy)] -struct MyEnumPayloadB(f32, u64); +struct FieldsA(u32); #[repr(C)] -#[derive(Clone, Copy)] -struct MyEnumPayloadC { x: u32, y: u8 } +struct FieldsB(f32, u64); + +#[repr(C)] +struct FieldsC { x: u32, y: u8 } + +#[repr(C)] +struct FieldsD; ``` Combining the `C` representation and a primitive representation is only defined for enums with fields. The primitive representation modifies the `C` representation by changing the representation of -the tag, e.g. `MyEnumTag` in the previous example, to have the representation of -the chosen primitive representation. So, if you chose the `u8` representation, -then the tag would have a size and alignment of 1 byte. +the discriminant enum to have the representation of the chosen primitive +representation. So, if you chose the `u8` representation, then the discriminant +enum would have a size and alignment of 1 byte. > Note: This representation was designed for primarily interfacing with C code > that already exists matching a common way Rust's enums are implemented in @@ -319,7 +332,7 @@ Combining two primitive representations together is unspecified. Combining the `C` representation and a primitive representation is described [above](#c-primitive-representation). -#### Primitive Fepresentation of Field-less Enums +#### Primitive Representation of Field-less Enums For [field-less enums], they set the size and alignment to be the same as the primitive type of the same name. For example, a field-less enum with