Figure out if anything is needed for better HTML integration #128

annevk · 2017-12-12T13:07:02Z

@hsivonen in whatwg/html#1077 (comment) raised a number of issues with the current integration points. They are insufficient for CSS, HTML, and presumably XML.

This might require some substantive changes to the hooks and perhaps other parts of the Encoding Standard, as well as standards that depend on the Encoding Standard (of which there are quite a few, so tread carefully).

Belatedly filing this to keep better track of it.

annevk · 2018-01-13T13:07:42Z

I think I'd personally be okay if the standard just said that you had to wait for 1024 bytes before decoding; if you could optimize around that, that would be okay too. The difference should only be observable performance-wise, which seems acceptable. And we can encourage implementations to do the fast thing.

I think that remains true if we add encoding sniffing.
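To make the "wait for 1024 bytes" idea concrete, here is a minimal sketch, not text from any standard. The names `buffer_prefix` and `PRESCAN_BYTES` are hypothetical, and the sketch assumes the input arrives as an iterable of byte chunks. It holds back output until at least 1024 bytes have arrived (or the input ends), then passes the remaining chunks through unchanged.

```python
from typing import Iterable, Iterator

# Assumed threshold, taken from the 1024 bytes mentioned in the comment above.
PRESCAN_BYTES = 1024


def buffer_prefix(chunks: Iterable[bytes], limit: int = PRESCAN_BYTES) -> Iterator[bytes]:
    """Yield one initial chunk of at least `limit` bytes (or everything, if the
    input is shorter), then yield the remaining chunks as they arrive."""
    pending = bytearray()
    it = iter(chunks)
    for chunk in it:
        pending += chunk
        if len(pending) >= limit:
            break  # enough bytes buffered for sniffing
    if pending:
        yield bytes(pending)
    # Whatever is left in the iterator is passed through untouched.
    yield from it
```

An implementation that can decide on the encoding earlier would simply skip the wait; under the assumption in the comment above, the decoded result would be the same either way and only timing would differ.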

Rewriting the specifications to have the proper abstractions would be somewhat nicer obviously, but seems like a lot more effort.

Note that we still have to change "decode" to also return the chosen encoding to the caller (and adjust any callers as appropriate).
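As a rough illustration of what "also return the chosen encoding" could look like for a caller, here is a sketch under stated assumptions. The function name `decode_with_encoding` and the tuple return shape are hypothetical, and the encoding labels are Python codec names rather than Encoding Standard names. It lets a BOM override the fallback encoding and reports which encoding was actually used.

```python
import codecs
from typing import Tuple

# BOM byte sequences paired with Python codec names (not Encoding Standard labels).
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def decode_with_encoding(data: bytes, fallback: str) -> Tuple[str, str]:
    """Decode `data`, returning (text, encoding actually used).

    A leading BOM takes precedence over `fallback` and is stripped from the output.
    """
    encoding = fallback
    for bom, name in _BOMS:
        if data.startswith(bom):
            encoding = name
            data = data[len(bom):]
            break
    return data.decode(encoding, errors="replace"), encoding
```

A caller could then write `text, used = decode_with_encoding(body, "windows-1252")` and record `used` as the document's encoding instead of assuming the fallback was applied.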

andreubotella · 2020-03-09T15:17:14Z

I'm reopening the discussion about this feature in whatwg/html#1077 (comment)

This change moves the BOM splitting part of the decode hook into a separate hook which does not consume any bytes of the token stream. This will allow fixing a long-standing issue in the HTML encoding sniffing algorithm, in which the document's character encoding is set to the wrong result when a BOM is present: whatwg/html#1077. Closes #128.
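The key property of such a hook is that it only peeks at the input. A minimal sketch of that idea follows; it is not the specification text. The function name `bom_sniff` and the use of `io.BufferedReader` as the stream type are assumptions for illustration. Note that `peek` performs at most one raw read, so on a slow source it may see fewer than three bytes; a real implementation would have to wait for more input.

```python
import io
from typing import Optional


def bom_sniff(stream: io.BufferedReader) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none.

    Uses peek(), so the stream position is left unchanged.
    """
    head = stream.peek(3)[:3]  # peek may return more than requested; trim it
    if head.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"
    if head.startswith(b"\xfe\xff"):
        return "UTF-16BE"
    if head.startswith(b"\xff\xfe"):
        return "UTF-16LE"
    return None
```

Because nothing is consumed, the caller can sniff first, set the document's encoding from the result, and still hand the full byte stream (BOM included) to the decoder afterwards.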

annevk mentioned this issue Dec 12, 2017

Two "streams" confusion ricea/encoding-streams#1

Open

annevk mentioned this issue Jan 13, 2018

Getting all bytes in a body whatwg/fetch#661

Closed

andreubotella mentioned this issue Mar 16, 2020

Add a BOM sniffing hook for better integration with HTML #203

Merged

annevk closed this as completed in #203 Mar 24, 2020
