Set body with byte reading support #1593

saschanaz · 2023-01-13T07:41:09Z

The switch is quite simple it seems 👀

- At least two implementers are interested (and none opposed):
  - Mozilla (already shipped)
  - Google @ricea
  - Apple? @annevk
- Tests are written and can be reviewed and commented upon at:
  - [Gecko Bug 1809673] Part 2: Add tests for BYOB readers from Response web-platform-tests/wpt#37910
- Implementation bugs are filed:
  - Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=1243329
  - Gecko: (Shipped)
  - WebKit: https://bugs.webkit.org/show_bug.cgi?id=250549
  - Deno (not for CORS changes): Use readable byte stream for Blob.stream() and Response.body denoland/deno#17386
- MDN issue is filed: Spec update: response.body now returns readable byte stream mdn/content#24453

(See WHATWG Working Mode: Changes for more details.)

domenic · 2023-01-13T07:53:45Z

You should probably have something similar to https://wicg.github.io/serial/#readable-attribute , which uses https://streams.spec.whatwg.org/#readablestream-current-byob-request-view to require that the implementation only pull the requested number of bytes from the network, and then write into the current BYOB request view when applicable.

saschanaz · 2023-01-13T08:19:45Z

Hmm, so currently Fetch has "If stream doesn’t need more data ask the user agent to suspend the ongoing fetch.", while Web Serial has:

> Let desiredSize be the desired size to fill up to the high water mark for this.[[readable]].
>
> If this.[[readable]]'s current BYOB request view is non-null, then set desiredSize to this.[[readable]]'s current BYOB request view's byte length.

I guess I should revise the sentence to also check whether the current BYOB request view is null and suspend the fetch if it is. Am I understanding this correctly?

And if so, should "need more data" be revised for byte streams?
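
For reference, the two Web Serial steps quoted in this comment can be sketched in JavaScript roughly as follows. This is only an illustration: `desiredReadSize` is a hypothetical helper name, and `controller` is assumed to be a `ReadableByteStreamController` (or anything with the same `desiredSize` and `byobRequest` properties).

```javascript
// Hypothetical helper mirroring the two quoted Web Serial steps.
function desiredReadSize(controller) {
  // Step 1: the number of bytes needed to fill the stream's queue up to its
  // high water mark.
  let desiredSize = controller.desiredSize;

  // Step 2: a pending BYOB read overrides that with the size of the buffer
  // the caller supplied.
  const request = controller.byobRequest;
  if (request) {
    desiredSize = request.view.byteLength;
  }
  return desiredSize;
}
```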

domenic · 2023-01-13T08:41:50Z

> I guess I should revise the sentence to also check whether the current BYOB request view is null and suspend the fetch if it is. Am I understanding this correctly?

Not really.

I guess there are two incompatible models in play here, and we have to pick a path.

1. Rely on buffering somewhere else in the stack. In this model, we'd:
   - Set highWaterMark = 0
   - Assume that as data comes in from the network, it's getting stored in a buffer/queue somewhere else in the stack, e.g. kernel, networking library, etc.
   - Wait for pulls. When we get a pull, either:
     - BYOB request is null, and so we create a new Uint8Array, grab whatever's in the other-part-of-the-stack buffer, put it into that Uint8Array, and enqueue it. (If nothing's there now, wait until something is.)
     - BYOB request is not null, and so we grab byobRequestView.byteLength bytes from the other-part-of-the-stack buffer, and write it into byobRequestView.
   - Write out some vague spec text about how we're assuming that stuff accumulating in the elsewhere-in-the-stack buffer causes backpressure, and thus will automatically suspend/resume bytes accumulating in that buffer, probably in a gradual fashion due to how TCP works.
2. Rely on the stream's internal queue. In this model, we'd basically do what your current draft of the PR does, or the current spec does. We only use enqueue, ignoring the current BYOB request. We'd use "suspend" and "resume" primitives to interface with the network layer, based on when the stream's internal queue is too full (i.e., when "need more data" is false).
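
As a rough illustration of the first model (buffering elsewhere in the stack), here is a minimal JavaScript sketch. It is not the spec algorithm: `NetworkBuffer`, `fulfillPull`, and `makeBodyStream` are hypothetical names, and the in-memory buffer merely stands in for whatever the kernel or networking library would hold. Only the Streams API calls (`byobRequest`, `respond`, `enqueue`, `highWaterMark`) are real.

```javascript
// Hypothetical stand-in for the "elsewhere in the stack" buffer.
class NetworkBuffer {
  constructor() {
    this.bytes = new Uint8Array(0);
    this.waiters = [];
  }
  get size() {
    return this.bytes.byteLength;
  }
  // Called when bytes arrive from the network.
  append(chunk) {
    const merged = new Uint8Array(this.size + chunk.byteLength);
    merged.set(this.bytes, 0);
    merged.set(chunk, this.size);
    this.bytes = merged;
    if (this.size > 0) {
      for (const wake of this.waiters.splice(0)) wake();
    }
  }
  // Remove and return up to `n` bytes from the front of the buffer.
  extract(n) {
    const count = Math.min(n, this.size);
    const head = this.bytes.slice(0, count);
    this.bytes = this.bytes.slice(count);
    return head;
  }
  // Resolves once at least one byte is available.
  whenNonEmpty() {
    if (this.size > 0) return Promise.resolve();
    return new Promise((resolve) => this.waiters.push(resolve));
  }
}

// Serve one pull. The caller must ensure the buffer is non-empty, because
// respond(0) and zero-length enqueues are rejected on a readable stream.
function fulfillPull(controller, buffer) {
  const request = controller.byobRequest;
  if (request) {
    // BYOB read: write straight into the caller-supplied view.
    const view = request.view;
    const bytes = buffer.extract(view.byteLength);
    new Uint8Array(view.buffer, view.byteOffset, view.byteLength).set(bytes);
    request.respond(bytes.byteLength);
  } else {
    // Default read: hand over whatever has accumulated so far.
    controller.enqueue(buffer.extract(buffer.size));
  }
}

// With highWaterMark 0, pull() only runs while a read is pending.
function makeBodyStream(buffer) {
  return new ReadableStream(
    {
      type: "bytes",
      async pull(controller) {
        await buffer.whenNonEmpty();
        fulfillPull(controller, buffer);
      },
    },
    { highWaterMark: 0 }
  );
}
```

In this sketch nothing is copied into the stream's internal queue on the BYOB path, which is the implementation strategy the first model is meant to communicate.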

There is no real observable difference between these for JavaScript developers. So maybe (2) is fine, perhaps supplemented with a note. But in terms of how they communicate strategies to implementers, (1) seems better. With (2), if you implemented it verbatim, you'd be doing (unobservable) copies all over the place, from the Uint8Arrays that have accumulated in the stream's internal queue via enqueue, into the BYOB request buffers the web developer supplies.

This is a bit more complicated because, from what I understand, (1) is not really implementable directly on top of low-level OS socket APIs. This is because in that framing, the "buffer somewhere else in the stack" is the kernel buffer, and that has a limit, so you could eventually lose data if backpressure isn't respected fast enough. So really there's a hidden middle layer which wraps the OS socket APIs with an unlimited buffer, I think. But it's been many years since I investigated this and I'm not sure I ever fully understood it, so I might not have that part correct...

((2) is definitely not implementable directly on top of low-level APIs, because there is no "suspend" or "resume" socket API. In some sense that is bad, but in some sense it's at least honest about how high-level it is.)

Anyway, I think thoughts from implementers on what would be the most helpful would be appreciated. /cc @ricea

saschanaz · 2023-01-13T08:45:38Z

cc @evilpie and @mgaudet (and perhaps @jesup?)

ricea · 2023-01-13T11:40:59Z

@domenic (1) suits me, because it matches the way we'll actually implement it.

If you're reading directly from a kernel buffer and we're using TCP then the kernel will ensure that you don't lose data, applying backpressure if you stop reading for a while.

However, in practice we're almost always using TLS, so there's a TLS library also buffering between us and the kernel. It will also be careful not to drop data and will apply backpressure when its finite buffer is full, so no problem there.

If we're using UDP, then since this is Fetch I assume we're using QUIC. QUIC provides in-order reliable streams, so you won't lose data. I'm pretty sure backpressure works similarly to TCP within a stream.

In Chrome we have a bunch of extra abstraction layers, but I assume you're thinking about server software that runs closer to the OS.

saschanaz · 2023-01-19T16:46:57Z

Okay, so I asked around, checked the implementation, and I'm now confident that option 1 matches our behavior. I tried to spec that, except the backpressure thing. I think that one is more of the description for suspend/resume (and thus kinda out of scope of this PR).

Edit: The PR somehow copied "Let buffer an empty buffer that can have ... appended to it" from infra, and used "append" and "extract" against that without the definitions. Infra doesn't really define "append" against buffer either, so maybe it's okay?

domenic · 2023-01-23T01:54:01Z

fetch.bs

+   <li>[=ReadableStream/Enqueue=] |view| into |stream|.
+
+   <li><p>If |stream| is [=ReadableStream/errored=], then [=fetch controller/terminate=]
+   |fetchParams|'s [=fetch params/controller=].


I don't think this can happen inside pull. I think it needs to be separate, something like, if the network blows up, terminate the controller and error the stream.

I guess below there is already a line that does that, but it doesn't error the stream? Is there some way in which terminating the controller errors the stream, which I haven't seen?

saschanaz · 2023-01-23

I'm not quite sure either, and honestly I just copied this from the previous steps, where it also came right after the enqueueing step.

Looking at the Gecko call diagram, it seems ReadableByteStreamControllerRespond can lead to ReadableByteStreamControllerEnqueueClonedChunkToQueue, which can error the stream on a buffer clone failure. It's complex enough that I'd be happy if the spec could list the possible error reasons.

> I think it needs to be separate, something like, if the network blows up, terminate the controller and error the stream.

Yes, but I think that's a little bit out of scope here, as I see no relevant existing step, right?

domenic · 2023-01-27

I guess if the lack of a proper erroring step is a preexisting problem then we don't need to fix it here.

I do think this step makes very little sense, so I'd prefer to remove it.

saschanaz · 2023-01-27

Copy-pasting my question from Matrix: https://matrix.to/#/!AGetWbsMpFPdSgUrbs:matrix.org/$1y0cHr7B913RupwXNW93uLA06GT5Uls6nV8Ds8khB80?via=matrix.org&via=mozilla.org&via=igalia.com

    const r = new ReadableStream({
      type: "bytes",
      async pull(c) {
        // Respond only after the reader has already released its lock.
        await new Promise(resolve => setTimeout(resolve, 100));
        c.byobRequest.respond(512);
      },
    });
    const reader = r.getReader({ mode: "byob" });
    reader.read(new Uint16Array(1024));
    setTimeout(() => reader.releaseLock(), 5);

This eventually hits https://streams.spec.whatwg.org/#abstract-opdef-readablebytestreamcontrollerenqueueclonedchunktoqueue, which theoretically can error the stream. Can Fetch really ignore this?

domenic · 2023-01-23T01:54:21Z

fetch.bs

+   |fetchParams|'s [=fetch params/controller=].
+
+   <li><p>If |stream| doesn't [=ReadableStream/need more data=], ask the user agent to
+   [=fetch/suspend=] the ongoing fetch.


In this model, we don't consult "need more data" at all. We should instead have something vague near the definition of buffer about how we expect that the buffer getting too full / too empty will suspend/resume.

saschanaz · 2023-01-23

Hmm, I was lazy and hoping I could just reuse the existing things... but you're right, since the HWM is now zero the desired size cannot be a positive number. I'll try adding some notes below buffer.

saschanaz · 2023-01-23

Actually, I instead moved this step back to the fetching algorithm and made it sleep if the buffer becomes larger than a user-agent-defined limit. I think that works too?

(It's just me not wanting to describe the network layer, which is not exactly my area.)

domenic

I'm happy with this approach to the network-stack buffer, modulo my comment about a nonzero lower limit

domenic · 2023-01-27T03:10:59Z

fetch.bs

+
+   <li>Return |promise| and run the remaining steps [=in parallel=].
+
+   <li>If |buffer| is empty and the ongoing fetch is [=fetch/suspend|suspended=],


It shouldn't be required to be empty before you resume. So you should add a similar user-agent-defined lower limit at which it can resume.
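
The upper/lower limit idea discussed in this review (suspend once the buffer grows past a user-agent-defined limit, resume at a nonzero lower limit rather than only when empty) amounts to a small hysteresis gate. A minimal sketch, assuming hypothetical names throughout (`GatedBuffer`, `upperLimit`, `lowerLimit`, and the `suspend`/`resume` callbacks); the actual limits are left to the user agent:

```javascript
// Hypothetical model of the network-side buffer with two limits.
// Only byte counts are tracked; real code would also hold the bytes.
class GatedBuffer {
  constructor({ upperLimit, lowerLimit, suspend, resume }) {
    this.upperLimit = upperLimit;
    this.lowerLimit = lowerLimit;
    this.suspend = suspend;
    this.resume = resume;
    this.size = 0;
    this.suspended = false;
  }

  // Called when `byteCount` bytes arrive from the network.
  append(byteCount) {
    this.size += byteCount;
    if (!this.suspended && this.size > this.upperLimit) {
      this.suspended = true;
      this.suspend();
    }
  }

  // Called when a pull hands up to `byteCount` bytes to the stream.
  // Returns how many bytes were actually taken.
  extract(byteCount) {
    const taken = Math.min(byteCount, this.size);
    this.size -= taken;
    // Resume as soon as the buffer drains to the lower limit; it does not
    // have to be empty first.
    if (this.suspended && this.size <= this.lowerLimit) {
      this.suspended = false;
      this.resume();
    }
    return taken;
  }
}
```

Keeping the lower limit above zero lets the fetch resume before the reader runs dry, which is the point of the comment above.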

domenic · 2023-01-27T03:11:51Z

fetch.bs

+  <ol>
+   <li>Let |promise| be [=a new promise=].
+
+   <li>Return |promise| and run the remaining steps [=in parallel=].


We've been avoiding this pattern these days, and instead using proper nesting.

In particular you want "Run these steps in parallel:", then the nested steps, and then "Return promise.".

domenic · 2023-01-27T03:13:44Z

fetch.bs

+
+   <li>Let |desiredSize| be |available|.
+
+   <li>If |stream|'s [=ReadableStream/current BYOB request view=] is non-null, then set


The current BYOB request view should not be accessed in parallel; nor should you create Uint8Arrays. You could probably fix this by posting a fetch task back after the waiting is done.

However, I'm not sure this is worth fixing, given the general problems with Streams being very JSey but being used even for no-JS-involved fetches (e.g. those who use a parallel queue). Thoughts from @annevk appreciated.

saschanaz · 2023-01-27

For now I fixed it to queue a task, as that was simple enough; I can revert it if Anne disagrees.

annevk · 2023-02-06

I think it makes sense to do the right thing where we can, but I also don't mind if we take shortcuts when it cannot be observed. Although at some point we'll have to clean it all up.

domenic · 2023-01-27T03:14:57Z

fetch.bs

+   <li>[=ReadableStream/Enqueue=] |view| into |stream|.
+
+   <li><p>If |stream| is [=ReadableStream/errored=], then [=fetch controller/terminate=]
+   |fetchParams|'s [=fetch params/controller=].


I guess if the lack of proper erroring step is a preexisting problem then we don't need to fix it here.

I do think this step makes very little sense, so I'd prefer to remove it.

domenic

LGTM with nit

annevk

I'll trust @domenic got the logic right. I have a few editorial comments and then it seems this can go in.

Will you file an MDN issue as well if needed?

annevk · 2023-02-15T10:01:49Z

fetch.bs

+
+       <li><p>If |stream|'s [=ReadableStream/current BYOB request view=] is non-null, then set
+       |desiredSize| to |stream|'s [=ReadableStream/current BYOB request view=]'s [=BufferSource/byte
+       length=].


No newlines inside terms.

saschanaz · 2023-02-15

Hmm, that's not what the Great Rewrapper does, but okay.

annevk · 2023-02-15

If that's Domenic's tool, that's a known issue I think. HTML has a different convention that cares less about inline search. I want inline search to be as straightforward as possible.

saschanaz · 2023-02-15T10:38:36Z

The build failure doesn't seem relevant 🤔

saschanaz · 2023-02-15T10:53:43Z

Filed mdn/content#24453

annevk · 2023-02-15T13:29:40Z

This is blocked on speced/bikeshed#2471, but otherwise ready to go.

annevk · 2023-02-20T09:07:01Z

@saschanaz can you file an MDN issue? Also, thanks for making this happen!

saschanaz · 2023-02-20T09:59:05Z

Did it already! #1593 (comment)

annevk · 2023-02-20T10:32:16Z

Ah! So the reason I asked again is because I go by the checklist in OP as I easily forget. I updated that now to include that reference.

ReadableByteStream is a variant of ReadableStream specialized for bytes[1]. Given the performance benefits, this CL adds BYOB support for Fetch by making Response.body a byte stream to allow for reading with a bring-your-own-buffer(BYOB) reader. The corresponding spec PR for this was landed at whatwg/fetch#1593. Tests for reading from Blob with a BYOB reader were factored out, as support for that will be implemented in follow-up CLs. [1] https://streams.spec.whatwg.org/#readable-byte-stream Bug: 1243329 Change-Id: I381b9f2272a7f1202fa748ae5c039ca0a998de00

ReadableByteStream is a variant of ReadableStream specialized for bytes[1]. Given the performance benefits, this CL adds BYOB support for Fetch by making Response.body a byte stream to allow for reading with a bring-your-own-buffer(BYOB) reader. The corresponding spec PR for this was landed at whatwg/fetch#1593. Tests for reading from Blob with a BYOB reader were factored out, as support for that will be implemented in follow-up CLs. [1] https://streams.spec.whatwg.org/#readable-byte-stream Low-Coverage-Reason: Behavior changes are covered by WPTs (i.e. response-consume-stream.any.js). Bug: 1243329 Change-Id: I381b9f2272a7f1202fa748ae5c039ca0a998de00

ReadableByteStream is a variant of ReadableStream specialized for bytes[1]. Given the performance benefits, this CL adds BYOB support for Fetch by making Response.body a byte stream to allow for reading with a bring-your-own-buffer(BYOB) reader. The corresponding spec PR for this was landed at whatwg/fetch#1593. [1] https://streams.spec.whatwg.org/#readable-byte-stream Low-Coverage-Reason: Behavior changes are covered by WPTs (i.e. response-consume-stream.any.js and Blob-stream.any.js). Bug: 1243329, 1189621 Change-Id: I381b9f2272a7f1202fa748ae5c039ca0a998de00 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4573009 Commit-Queue: Nidhi Jaju <[email protected]> Reviewed-by: Kent Tamura <[email protected]> Reviewed-by: Adam Rice <[email protected]> Cr-Commit-Position: refs/heads/main@{#1152290}
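
For readers unfamiliar with BYOB reads, here is a minimal sketch of what a byte-stream `Response.body` makes possible. `readIntoBuffer` is a hypothetical helper, not part of any API; it assumes `stream` is a readable byte stream, which is what `response.body` is in runtimes that implement this change.

```javascript
// Hypothetical helper: read from a readable byte stream into one
// caller-owned ArrayBuffer using a BYOB reader, stopping when the buffer is
// full or the stream closes. Returns a view over the bytes that were read.
async function readIntoBuffer(stream, buffer) {
  const reader = stream.getReader({ mode: "byob" });
  let offset = 0;
  try {
    while (offset < buffer.byteLength) {
      const view = new Uint8Array(buffer, offset, buffer.byteLength - offset);
      const { value, done } = await reader.read(view);
      if (value === undefined) {
        // Only expected if the stream was cancelled; the buffer is gone.
        return new Uint8Array(0);
      }
      // read() transfers the buffer, so continue with the one handed back.
      buffer = value.buffer;
      offset += value.byteLength;
      if (done) break;
    }
  } finally {
    reader.releaseLock();
  }
  return new Uint8Array(buffer, 0, offset);
}
```

Assuming such a runtime, usage would look like `const bytes = await readIntoBuffer(response.body, new ArrayBuffer(1024));`, reusing one allocation instead of receiving a fresh `Uint8Array` per chunk.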

Set body with byte reading support

ea9d056

Fixes whatwg#267

saschanaz mentioned this pull request Jan 13, 2023

Return a byte stream from "get stream" algorithm w3c/FileAPI#188

Merged

4 tasks

saschanaz mentioned this pull request Jan 13, 2023

Use readable byte stream for Blob.stream() and Response.body denoland/deno#17386

Open

Implement proper BYOB enqueue

8f98591

domenic reviewed Jan 23, 2023

saschanaz added 3 commits January 23, 2023 14:59

Sleep on enough buffer and wake up on empty buffer

e07ef8d

Prevent reentrance

c6130c0

Link "a new promise"

8441245

domenic reviewed Jan 27, 2023

saschanaz added 3 commits January 27, 2023 08:20

nested steps, a lower limit, and fetch task

09ce040

Merge branch 'main' into byte-stream

4f22afb

Run in parallel and then return promise

3c85e1f

annevk requested a review from domenic February 6, 2023 12:12

domenic approved these changes Feb 15, 2023

Update fetch.bs

23f9912

annevk reviewed Feb 15, 2023

Reformat

053d28b

saschanaz mentioned this pull request Feb 15, 2023

Spec update: response.body now returns readable byte stream mdn/content#24453

Closed

Reformat 2

5473365

annevk approved these changes Feb 15, 2023

annevk added the do not merge yet Pull request must not be merged per rationale in comment label Feb 15, 2023

saschanaz closed this Feb 16, 2023

saschanaz reopened this Feb 16, 2023

saschanaz mentioned this pull request Feb 16, 2023

CI test #1605

Closed

saschanaz closed this Feb 16, 2023

saschanaz reopened this Feb 16, 2023

saschanaz closed this Feb 17, 2023

saschanaz reopened this Feb 17, 2023

annevk removed the do not merge yet Pull request must not be merged per rationale in comment label Feb 20, 2023

annevk merged commit 67d4cde into whatwg:main Feb 20, 2023

saschanaz deleted the byte-stream branch February 20, 2023 09:58

annevk added topic: streams addition/proposal New features or enhancements labels Feb 20, 2023

saschanaz mentioned this pull request Feb 21, 2023

The fetch pull algorithm does not pass proper bytesWritten to ReadableByteStreamControllerRespond #1610

Closed

MattiasBuelens mentioned this pull request Feb 27, 2023

calling into_async_read panics whereas into_stream works well MattiasBuelens/wasm-streams#19

Closed

domenic mentioned this pull request Mar 24, 2023

Make response body streams readable byte streams #1246

Closed

3 tasks

KhafraDev mentioned this pull request May 14, 2023

fetch: set body with byte reading support nodejs/undici#2122

Closed

nidhijaju mentioned this pull request May 31, 2023

BYOB support for Fetch WebKit/standards-positions#194

Closed

chromium-wpt-export-bot mentioned this pull request Jun 1, 2023

Fetch: Support BYOB reading for Response.body web-platform-tests/wpt#40281

Closed

jimmywarting mentioned this pull request Oct 21, 2023

Use readable byte stream for Blob.stream() and Response.body oven-sh/bun#6643

Open
