Calculate sizes, alignment and offsets for both wasm32 and wasm64 #1711

cpetig · 2024-08-03T11:09:08Z

This enables wit-bindgen to emit code like let l2 = *ptr0.add(core::mem::size_of::<*const u8>()).cast::<usize>(); which automatically calculates the right offset of 4 for wasm32 and 8 for wasm64 e.g. for string or list lowering.

The more complex cases combine fixed byte offsets and pointer sized offsets.

~~This also adds a compatibility layer which falls back to 32bit for the classical SizeAlign usage.~~

cpetig · 2024-08-03T11:16:30Z

This is a generalization of #1446

alexcrichton · 2024-08-12T18:49:35Z

Thanks for working on this! I've gotten a chance to review this now (thanks for the patience). I don't really understand how this works at a low-level and it feels like this form of generically handling size/align for 32/64 seems somewhat error-prone. It does seem like a nice property though to have where it would make code generation easier in the wit-bindgen and source languages.

To help evaluate this I whipped up a small differential fuzzer which generates arbitrary WITs and then asserts that the sizes reported by this branch are the same as the sizes reported on the main branch. That currently fails relatively quickly where ArchitectureSize looks like it's holding negative values as unsigned numbers which means that overflow is panicking on addition. Is that intended? If not that might be a bug?

Once that's sorted out I'd also like to extend the fuzzer a bit to call field_offsets and payload_offset for types (if it matches) to assert that those return values are the same as well.

For running the fuzzer I was locally doing:

FUZZER=wit64 cargo +nightly fuzz run -s none run

cpetig · 2024-08-13T18:52:57Z

Thank you for taking the time to fuzz it. I wouldn't fully trust code of this complexity either.

I found the difference: Resources are considered very large, but the new code considers them (very large)*2. Honestly, I had no idea about the correct way to handle this type, but perhaps it should use add_for_64bit: 0.

The code giving the problem:

TypeDefKind::Resource => ElementInfo::new(
                ArchitectureSize::new(usize::MAX, usize::MAX),
                Alignment::Bytes(NonZeroUsize::new(usize::MAX).unwrap()),
            ),

alexcrichton · 2024-08-13T18:58:51Z

I think resources should be unconditionally 32-bit values (always 4 bytes) regardless of wasm32 vs wasm64?

cpetig · 2024-08-13T19:04:23Z

I agree that resource handles should be 4+0, but this code calculates the size of a resource itself, which doesn't make much sense, see here https://github.com/bytecodealliance/wasm-tools/blob/main/crates/wit-parser/src/sizealign.rs#L61

I was simply at loss on which number to return in this unrealistic case.

alexcrichton · 2024-08-13T19:07:39Z

Aha I see! That looks like it's just something to update in the fuzzer then and skip TypeDef::Resource declarations within interfaces (makes sense in retrospect). Would you be up for developing the fuzzer a bit further to see if this PR's approach is viable? If the differential fuzzer is clean that would make me more confident in this approach personally and I can do a deeper review.

cpetig · 2024-08-13T19:11:47Z

I planned to use intmax+0 which would make the old and new values identical again, then add the field and payload offsets, hiding size calculation of resources would be unnecessary with the first change.

PS: I will for sure take inspiration from your fuzzer in testing other projects. Now I know the real purpose of wit-smith ;-)

alexcrichton · 2024-08-13T19:15:42Z

Ah it's ok to do whatever for TypeDefKind::Resource, it's an edge case needed for completeness but shouldn't show up practically in theory. This is just a fuzzer issue really and I think the way you map it right now is fine.

cpetig · 2024-08-13T19:50:58Z

Oh, after some time the fuzzer found another one: fuzz/artifacts/run/crash-84dc68e18872aad7a98cfcf676848973ef8aa1af

name: func(name: result<list<u16>, result<result<bool, result<result, result<s16>>>>>) -> result;

fuzz/src/wit64.rs:45:9:
assertion `left == right` failed
  left: 24
 right: 30

cpetig · 2024-08-13T21:16:21Z

See https://github.com/cpetig/wasm-tools/tree/fuzz64 for an updated variant of the fuzzer checking field offsets and payload offset. I assume that the types contained within other types are already part of the all types list, so no need to check the individual fields.

Now I will debug the failed case, it found another one fuzz/artifacts/run/crash-e4149b42d86d41bf7a4230735e54a801f5be70ef

xx: func(name: option<option<option<option<option<option<result<result<result<list<s32>, bool>, result<result<result<result<list<s32>, option<option<option<option<option<option<option<option<option<bool>>>>>>>>>>, bool>, bool>, bool>>, bool>>>>>>>, name1: bool, name2: bool, name3: bool, name4: bool);

cpetig · 2024-08-13T23:14:05Z

I found the problem (in my 64bit combination reasoning), pushed and fuzzer running again

cpetig · 2024-08-14T07:24:03Z

I stopped the fuzzer after several hours (3.4M tests, if I read correctly) without any difference detected.

alexcrichton

Ok thanks for fuzzing! That makes me more confident in this approach so I've done a bit of a deeper dive into this.

Do you think it would be possible to preserve the fuzzer? Basically having a separate "relatively simple" algorithm for calculating sizes with no frills and ensuring that this result continues to be the same?

crates/wit-parser/src/sizealign.rs

alexcrichton · 2024-08-15T15:22:52Z

crates/wit-parser/src/sizealign.rs

+/// Architecture specific measurement of position,
+/// the combined amount in bytes is
+/// `bytes + if 4 < core::mem::size_of::<*const u8>() { add_for_64bit } else { 0 }`
+#[derive(Default, Clone, Copy, Eq, PartialEq, Debug)]
+pub struct ArchitectureSize {
+    /// exact value for 32-bit pointers
+    pub bytes: usize,
+    /// amount of bytes to add for 64-bit architecture
+    pub add_for_64bit: usize,
+}


Would it be possible to represent this as something like bytes and pointers? That way the size could be bytes + pointers * sizeof(pointer). That I think may also make methods like pointers_to_add a bit more clear how to be implemented and/or unnecessary since it's just a field then.

I tried this once, encountered some difficulties and changed it to the current form. I will try again … I guess the current representation avoids multiplications and divisions and makes alignment more straightforward.

In theory it should be possible to skip mul/div by only having shifts left-and-right by storing the log2 of the pointer size (or something like that), but I'm mostly curious about this from an understandability point of view.

Oh, while implementing a more obscure test case I realized that I made a conceptual oversight:

I you take the max of 10 bytes and 2 pointers you get 4 bytes + 1.5 pointers - and I didn't take fractional pointer amounts into account, yet. It still can work if I multiply the number of pointers by 4 and handle fractional amounts properly. This wasn't visible in the fuzzing because of the additional alignment applied to the result before outputting the size.

Previously this example resulted in 10 bytes + 6 bytes_on_64_bit - if you start with 9 bytes you even get a quarter pointer amount (9 + 7) and (12+4) after alignment to pointer size.

Probably you don't want to express/represent partial pointer sizes for objects containing pointers, so you would want to align in the max function and return (8+1).

I was able to make it work, but align is a bit more complex than I would like it to be and max now uses the easy way out (calculate 32bit and 64bit then max and then separate again). Max can probably be made more efficient by expanding the 32bit/64bit calculation and then simplifying terms, but as it is mostly used for variants it should be less performance sensitive.

Fuzzer ran again without finding issues.

Thanks for pushing on this! I think at least from a code generator perspective it'll be easier to reason about size + pointers so I'm glad this was able to work out.

crates/wit-parser/src/sizealign.rs

cpetig · 2024-08-15T22:34:08Z

Do you think it would be possible to preserve the fuzzer? Basically having a separate "relatively simple" algorithm for calculating sizes with no frills and ensuring that this result continues to be the same?

I think the fuzzer could simply live in the new code and compare the new version with an older released version (flipping the crate alias roles).

alexcrichton · 2024-08-16T14:12:48Z

Oh excellent idea on fuzzing, I like that!

cpetig · 2024-08-17T17:39:12Z

Oh excellent idea on fuzzing, I like that!

I feel we might want to replace wit-smith in the fuzzer with the old version soon ... agreed?

Update: Already included in the latest patchset.

…nt logic

cpetig · 2024-08-18T09:16:40Z

@alexcrichton Do you want me to mark these threads as resolved or do you want to check first and mark then?

alexcrichton

Looks great to me, thanks again for working on this! I'll admit I still don't really understand how this works but I'm willing to trust the combo of you and the fuzzer, so let's merge and see how it goes.

alexcrichton · 2024-08-19T15:14:45Z

Oh and either way works for me on threads, thanks!

This notably updates to handle bytecodealliance/wasm-tools#1711 but this is done in a way that is not intended to replace bytecodealliance#1035 but instead makes the update "as easy as possible" by just adding calls to `.{size,align}_wasm32()` where needed. Full support is expected to land in bytecodealliance#1035.

* Update wasm-tools dependencies This notably updates to handle bytecodealliance/wasm-tools#1711 but this is done in a way that is not intended to replace #1035 but instead makes the update "as easy as possible" by just adding calls to `.{size,align}_wasm32()` where needed. Full support is expected to land in #1035. * More sizing updates

cpetig added 13 commits August 3, 2024 12:24

implement architecture adaptible data layout

f1cd098

make add_assign and align_to_arch available

84ad1e2

conversion function from alignment

2c9065b

new method to quickly add some bytes to it

f7fc0fb

implementing Display takes care of many errors in wit-bindgen

8a2ff05

more compatibility tweaks

e822feb

simplify code generation

9ac103f

more nice formatting, minor rename

a71cf41

shorten code by new()

286453f

fix logic error

cea9da8

document and introduce compatibility to previous versions

32c7181

remove merge artifact

6801320

make code compatible with 1.76.0

51d90f3

fix the fuzzer (resource returned different MAX value)

653130f

correct addition, simplify align, new test

c2ea06a

alexcrichton approved these changes Aug 15, 2024

View reviewed changes

cpetig added 3 commits August 17, 2024 19:11

add documentation

379db14

remove compatibility type alias

233794a

Add SizeAlign comparison fuzzer

20e4515

cpetig added 4 commits August 17, 2024 19:46

run rustfmt, use old wit-smith version

7800467

use pointers and bytes as separate entities

1d3ceba

add new testcase (found in fuzzing) and correct ptr to 8 byte alignme…

3218a8d

…nt logic

correct empty logic after pointer separation

1df7823

alexcrichton approved these changes Aug 19, 2024

View reviewed changes

alexcrichton added this pull request to the merge queue Aug 19, 2024

Merged via the queue into bytecodealliance:main with commit 5068b3a Aug 19, 2024
28 checks passed

alexcrichton mentioned this pull request Aug 22, 2024

Update wasm-tools dependencies bytecodealliance/wit-bindgen#1036

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Calculate sizes, alignment and offsets for both wasm32 and wasm64 #1711

Calculate sizes, alignment and offsets for both wasm32 and wasm64 #1711

cpetig commented Aug 3, 2024 •

edited

Loading

cpetig commented Aug 3, 2024

alexcrichton commented Aug 12, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024 •

edited

Loading

cpetig commented Aug 13, 2024

cpetig commented Aug 13, 2024

cpetig commented Aug 14, 2024

alexcrichton left a comment

alexcrichton Aug 15, 2024

cpetig Aug 15, 2024

alexcrichton Aug 16, 2024

cpetig Aug 17, 2024

cpetig Aug 17, 2024 •

edited

Loading

cpetig Aug 17, 2024

cpetig Aug 17, 2024 •

edited

Loading

alexcrichton Aug 19, 2024

cpetig commented Aug 15, 2024

alexcrichton commented Aug 16, 2024

cpetig commented Aug 17, 2024 •

edited

Loading

cpetig commented Aug 18, 2024

alexcrichton left a comment

alexcrichton commented Aug 19, 2024

Calculate sizes, alignment and offsets for both wasm32 and wasm64 #1711

Calculate sizes, alignment and offsets for both wasm32 and wasm64 #1711

Conversation

cpetig commented Aug 3, 2024 • edited Loading

cpetig commented Aug 3, 2024

alexcrichton commented Aug 12, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024

alexcrichton commented Aug 13, 2024

cpetig commented Aug 13, 2024 • edited Loading

cpetig commented Aug 13, 2024

cpetig commented Aug 13, 2024

cpetig commented Aug 14, 2024

alexcrichton left a comment

Choose a reason for hiding this comment

alexcrichton Aug 15, 2024

Choose a reason for hiding this comment

cpetig Aug 15, 2024

Choose a reason for hiding this comment

alexcrichton Aug 16, 2024

Choose a reason for hiding this comment

cpetig Aug 17, 2024

Choose a reason for hiding this comment

cpetig Aug 17, 2024 • edited Loading

Choose a reason for hiding this comment

cpetig Aug 17, 2024

Choose a reason for hiding this comment

cpetig Aug 17, 2024 • edited Loading

Choose a reason for hiding this comment

alexcrichton Aug 19, 2024

Choose a reason for hiding this comment

cpetig commented Aug 15, 2024

alexcrichton commented Aug 16, 2024

cpetig commented Aug 17, 2024 • edited Loading

cpetig commented Aug 18, 2024

alexcrichton left a comment

Choose a reason for hiding this comment

alexcrichton commented Aug 19, 2024

cpetig commented Aug 3, 2024 •

edited

Loading

cpetig commented Aug 13, 2024 •

edited

Loading

cpetig Aug 17, 2024 •

edited

Loading

cpetig Aug 17, 2024 •

edited

Loading

cpetig commented Aug 17, 2024 •

edited

Loading