Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process subresource link headers #1409

Closed
wants to merge 1 commit into from

Conversation

noamr
Copy link
Contributor

@noamr noamr commented Mar 9, 2022

In conjunction with whatwg/html#7691
(see there for implementers/tests etc)

  • At least two implementers are interested (and none opposed):
  • Tests are written and can be reviewed and commented upon at:
  • Implementation bugs are filed:
    • Chrome: …
    • Firefox: …
    • Safari: …
    • Deno (not for CORS changes): …

(See WHATWG Working Mode: Changes for more details.)


💥 Error: 500 Internal Server Error 💥

PR Preview failed to build. (Last tried on Mar 9, 2022, 3:32 PM UTC).

More

PR Preview relies on a number of web services to run. There seems to be an issue with the following one:

🚨 CSS Spec Preprocessor - CSS Spec Preprocessor is the web service used to build Bikeshed specs.

🔗 Related URL

If you don't have enough information above to solve the error by yourself (or to understand to which web service the error is related to, if any), please file an issue.

@noamr noamr closed this Mar 9, 2022
@noamr noamr reopened this Mar 9, 2022
@annevk
Copy link
Member

annevk commented Mar 10, 2022

Do we only test this for font resources currently? I saw https://github.com/web-platform-tests/wpt/blob/master/preload/link-header-on-subresource.html linked from the HTML PR.

This would also mean that pretty much any request you make can result in further requests, including a simple fetch(). Is that really what is implemented?

Also, it seems this should be processed as part of the same task that will run "process response". I don't think we want to introduce another moment in time for this.

@noamr
Copy link
Contributor Author

noamr commented Mar 10, 2022

Do we only test this for font resources currently? I saw https://github.com/web-platform-tests/wpt/blob/master/preload/link-header-on-subresource.html linked from the HTML PR.

I will add more, you are right.

This would also mean that pretty much any request you make can result in further requests, including a simple fetch(). Is that really what is implemented?

I need to double check. I believe so.

Also, it seems this should be processed as part of the same task that will run "process response". I don't think we want to introduce another moment in time for this.

OK

@annevk
Copy link
Member

annevk commented Mar 11, 2022

I see, I raised this before in w3c/preload#148 but that got closed by whatwg/html#7622 which doesn't quite address this. (E.g., I think as defined a Link header on a style sheet creates different kind of fetches (with respect to Referer for instance) than subresources the style sheet might link.)

I'm rather concerned about a subresource fetch resulting in further fetches. That's simply not a side effect you'd expect when pulling a resource from elsewhere. @yoavweiss do you know where this feature got discussed before including the security implications, how this should relate to CSP, Referer headers, etc?

cc @domenic

@noamr
Copy link
Contributor Author

noamr commented Mar 11, 2022

I see, I raised this before in w3c/preload#148 but that got closed by whatwg/html#7622 which doesn't quite address this. (E.g., I think as defined a Link header on a style sheet creates different kind of fetches (with respect to Referer for instance) than subresources the style sheet might link.)

I'm rather concerned about a subresource fetch resulting in further fetches. That's simply not a side effect you'd expect when pulling a resource from elsewhere. @yoavweiss do you know where this feature got discussed before including the security implications, how this should relate to CSP, Referer headers, etc?

I would expect a stylesheet I fetch to import other stylesheets and continue to do so recursively, same with scripts... I don't see what makes link headers different in that regard.

@annevk
Copy link
Member

annevk commented Mar 11, 2022

Right, for style sheets and scripts the main concern I have is around fetching logic (do all the bits and in particular Referer end up being set correctly). For fetch() (and also <img>) the concern is more about application security. In that there are now side effects where previously there were none.

@noamr
Copy link
Contributor Author

noamr commented Mar 11, 2022

Right, for style sheets and scripts the main concern I have is around fetching logic (do all the bits and in particular Referer end up being set correctly). For fetch() (and also <img>) the concern is more about application security. In that there are now side effects where previously there were none.

Though that's less specific to the recursive aspect, isn't it? An image with a link header would have a side-effect regardless of it being recursive.
Perhaps resources should only be allowed to trigger preloads to the kinds of resources it might end up fetching? (maybe this means that only styles & scripts can have preload headers)

@noamr
Copy link
Contributor Author

noamr commented Mar 13, 2022

I've added tests.

The current situation:

  • Both Safari & Chrome support recursive subresource link headers
  • Safari doesn't support subresource link headers on fonts
  • Firefox doesn't support subresource link headers (the current relevant test fails)

I'd love to be able to reach some consensus on how to proceed with this.

@yoavweiss
Copy link
Collaborator

@yoavweiss do you know where this feature got discussed before including the security implications, how this should relate to CSP, Referer headers, etc?

@annevk - this was discussed at the time (~2015, IIRC), but no particular concerns were raised. (some concerns were raised later)
If there are security/privacy issues with this, we can rediscuss. May be interesting to see how often this is used in Chromium, but in any case, breaking this is unlikely to result in compat issues, as Link headers can't define load/error event handlers.

With regards to why this is supported, I can see a clear use case for active content preloading depedent subresources (e.g. a script loading a dependent script it knows it'll need, or a CSS preloading a dependent BG image of font). I see less of a use case for passive content (e.g. images), so would be more open to disabling preloads there.

@noamr
Copy link
Contributor Author

noamr commented Mar 14, 2022

@yoavweiss do you know where this feature got discussed before including the security implications, how this should relate to CSP, Referer headers, etc?

@annevk - this was discussed at the time (~2015, IIRC), but no particular concerns were raised. (some concerns were raised later) If there are security/privacy issues with this, we can rediscuss. May be interesting to see how often this is used in Chromium, but in any case, breaking this is unlikely to result in compat issues, as Link headers can't define load/error event handlers.

With regards to why this is supported, I can see a clear use case for active content preloading depedent subresources (e.g. a script loading a dependent script it knows it'll need, or a CSS preloading a dependent BG image of font). I see less of a use case for passive content (e.g. images), so would be more open to disabling preloads there.

I can see how this would be a security/privacy concern. Thinking of a document with no-cors images only and no CSP, images might generate link headers to CORS resources, causing unexpected fetches and perhaps exfiltration.

@annevk annevk added security/privacy There are security or privacy implications addition/proposal New features or enhancements labels Mar 15, 2022
@yoavweiss
Copy link
Collaborator

@annevk - do you agree with my argument that this is fine for active resources (e.g. scripts, styles)? If so, that could be a path forward and I can see if there's implementation interest on the Chromium side to remove that support for passive resources.

@noamr
Copy link
Contributor Author

noamr commented Apr 9, 2022

@annevk - do you agree with my argument that this is fine for active resources (e.g. scripts, styles)? If so, that could be a path forward and I can see if there's implementation interest on the Chromium side to remove that support for passive resources.

I can see how we would specify it as a map of which request destination can load which other request destinations.
Something like:
script / '' (fetch/XHR etc)-> any
style -> style / font / image

@annevk
Copy link
Member

annevk commented Apr 20, 2022

@yoavweiss I think it might be possible to make that split, yeah. See #1409 (comment) for the remaining issues.

Perhaps that also argues for making the processing happen on the endpoints and not in Fetch directly?

@noamr
Copy link
Contributor Author

noamr commented Apr 20, 2022

@yoavweiss I think it might be possible to make that split, yeah. See #1409 (comment) for the remaining issues.

Perhaps that also argues for making the processing happen on the endpoints and not in Fetch directly?

Yes, maybe fetch doesn't need to be involved at all. Will play around with doing this at the caller sites

@yoavweiss
Copy link
Collaborator

@annevk - good point in the comment above about setting the CSS resource as the Referer in case of resources preloaded as Link headers on style resources. I don't think that's how this is currently implemented, but it probably should.

@noamr
Copy link
Contributor Author

noamr commented Apr 20, 2022

So it's mainly the call sites for script/style. But I wonder about the call site at fetch / xhr - it's plausible that a fetch() call would have a side effect of fetching other resources - but do we consider it a subresource or should we keep it simple with only scripts / styles? Maybe this can be done iteratively

@annevk
Copy link
Member

annevk commented Apr 20, 2022

I don't think fetch() (or XMLHttpRequest) should end up fetching other resources besides the one requested (except perhaps as part of a protocol feature, such as H/2 Push). That seems rather unexpected.

@yoavweiss
Copy link
Collaborator

Yeah, fetch() is a theoretical use case, but I'm not sure I have clear examples of it that can't be resolved by having the script that triggers the first fetch simply trigger other fetches in parallel.

@noamr
Copy link
Contributor Author

noamr commented Apr 20, 2022

Yeah, fetch() is a theoretical use case, but I'm not sure I have clear examples of it that can't be resolved by having the script that triggers the first fetch simply trigger other fetches in parallel.

I was thinking of scenarios where you fetch() the script before executing it, fetch a JSON file that's going to eventually have side-effects that would have you load other resources, have a binary response for WASM that's going to end up downloading more resources in the future etc.

But I agree that starting with scripts & style is good (and having module scripts serve subresource modulepreload)

@annevk
Copy link
Member

annevk commented Apr 20, 2022

To be clear, I think fetch() acting on Link headers in such a way would be a rather egregious layering violation. Doing it for a select set of endpoints that already can end up resulting in multiple fetches is quite reasonable however.

@noamr
Copy link
Contributor Author

noamr commented Apr 20, 2022

To be clear, I think fetch() acting on Link headers in such a way would be a rather egregious layering violation. Doing it for a select set of endpoints that already can end up resulting in multiple fetches is quite reasonable however.

I'm not sure it's a layering violation but I also don't think it's a strong enough use case, so the end result is a consensus AFAIC :)

@@ -4272,6 +4273,11 @@ steps:
</ol>
</li>

<li><p>If <var>request</var> is a <a>subresource request</a> and <var>request</var>'s
<a for=request>window</a> is an <a>environment settings object</a>, then
<a>queue a fetch task</a> to run <span>process subresource link headers</span> given
Copy link
Member

@domenic domenic Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"process subresource link headers" is not defined anywhere. I see, it is in the HTML PR. Relatedly, span does not cause cross-linking in Bikeshed specs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to do an overhaul of this PR, ignore for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea when this will be overhauled?

I'm planning to do a follow-up change: I'd like to pass to the "process subresource link headers" algorithm whether the initiator subresource is render-blocking or not, so that we can create a render-blocking chain. It will depend on how this PR proceeds, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will get to it once whatwg/html#7866 is in.
Also, I'm not sure how well tested the current support for blocking is in link headers, and I'm sure it's not spec'ed for early hints. I'm not sure if you wanted your follow up to include early hints, but if you do, first early hints have to support blocking. I would suggest to have the WPTs ready for those use-cases before we get to subresources.

Copy link
Contributor Author

@noamr noamr May 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaochengh do/can you have an open issue for the render-blocking chain?
I have some questions/doubts about it, it would save us time later if we could discuss them now :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll file an issue for render-blocking chain, and also see how it should work with early hints. Thanks for the note!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed whatwg/html#7899

Regarding early hints: I think blocking should be ignored on early hints, because there's no document to block when early hints are processed. It suffices as long as the actual response makes the document block on the early-hint-preloaded resources.

@noamr
Copy link
Contributor Author

noamr commented Apr 24, 2022

Thinking about this again, I believe this should be done inside fetch and not at the call sites, as this should also work with preloads - <link rel=preload as=style> where the response has link headers should work the same as <link rel=stylesheet> and @import in CSS, and putting this in each of the call sites would require something like "fetch stylesheet" and above fetch that preload knows about.

Also, we need to be careful not to allow preload semantics that styles are not capable of creating themselves - e.g. the blocking attribute or loading fonts with a crossorigin attribute that's not anonymous.

@annevk
Copy link
Member

annevk commented Apr 26, 2022

Sigh.

So that means preload is not a generic fetch, but would be a highly-specific fetch based on as. That also argues for reverting whatwg/html#7799 I think.

I think it would actually be better if we had "fetch a style sheet" and "fetch a script" wrappers if we decided to go down that road. Offloading all the complexity into Fetch isn't great long term.

cc @domenic @hiroshige-g

@domenic
Copy link
Member

domenic commented Apr 26, 2022

Yeah, I am really surprised at the claims like

should work the same as <link rel=stylesheet> and @import in CSS,

and

we need to be careful not to allow preload semantics that styles are not capable of creating themselves

Why can't we make Link headers a generic fetch primitive? Why do they need these constraints?

@annevk
Copy link
Member

annevk commented Apr 26, 2022

@domenic see above, all fetches acting on Link headers would be a layering violation. That's not what you want for a low-level primitive. Link headers are an application concern.

@domenic
Copy link
Member

domenic commented Apr 26, 2022

I don't think I really bought that argument, but even if we go that direction, I don't understand why it has to be so type-specific. Basically I would not expect the details of CSS to leak into rel=preload; I would expect them to be in something like rel=stylesheetpreload, but not rel=preload.

@noamr
Copy link
Contributor Author

noamr commented Apr 26, 2022

I don't think I really bought that argument, but even if we go that direction, I don't understand why it has to be so type-specific.

I agree that you wouldn't expect that <img src=""> would have the side effect of sending additional fetches.

Basically I would not expect the details of CSS to leak into rel=preload; I would expect them to be in something like > rel=stylesheetpreload, but not rel=preload.

Yes I can see that. Perhaps we could start by allowing this in modulepreload and defer the stylesheet issue. @yoavweiss?

@domenic
Copy link
Member

domenic commented Apr 26, 2022

To be clear, my proposal (not sure if it's helpful) is that rel=preload not have any restrictions. blocking="" always works, crossorigin="" can be any value, etc. If this allows you to create fetches you could not create with @import or @font-face, that's fine. Maybe that'll cause some cache misses, oh well.

@noamr
Copy link
Contributor Author

noamr commented Apr 26, 2022

To be clear, my proposal (not sure if it's helpful) is that rel=preload not have any restrictions. blocking="" always works, crossorigin="" can be any value, etc. If this allows you to create fetches you could not create with @import or @font-face, that's fine. Maybe that'll cause some cache misses, oh well.

I see, I'm Ok with this. But are you proposing to also allow that for things that are not styles/scripts?

@domenic
Copy link
Member

domenic commented Apr 26, 2022

But are you proposing to also allow that for things that are not styles/scripts?

My preference would be to allow it for all fetches, but I understand @annevk is against it, and I don't feel strongly. (Especially since it's always easier to start conservative.)

So the question is which fetches allow it. I can think of a few options:

  • Everything but fetch()
  • Navigations + <link> + <script>
  • Navigations + <link rel=preload as=stylesheet> + <link rel=stylesheet>
  • ... various other permutations ...

I'm not sure which option the various participants in this discussion want to go for. Or what is most useful. Or how that plays out in terms of spec layering. But maybe nailing that down is the next step? I guess you made an initial proposal at #1409 (comment) but I'm not sure if everyone got on board with that... apologies if they did and I'm just confusing matters.

@noamr
Copy link
Contributor Author

noamr commented Apr 26, 2022

But are you proposing to also allow that for things that are not styles/scripts?

My preference would be to allow it for all fetches, but I understand @annevk is against it, and I don't feel strongly. (Especially since it's always easier to start conservative.)

So the question is which fetches allow it. I can think of a few options:

  • Everything but fetch()
  • Navigations + <link> + <script>
  • Navigations + <link rel=preload as=stylesheet> + <link rel=stylesheet>
  • ... various other permutations ...

I'm not sure which option the various participants in this discussion want to go for. Or what is most useful. Or how that plays out in terms of spec layering. But maybe nailing that down is the next step? I guess you made an initial proposal at #1409 (comment) but I'm not sure if everyone got on board with that... apologies if they did and I'm just confusing matters.

I can go with an option where if the destination of the request is script it can process any link header, and if it's style it can process any as=font/as=img/as=style link header, allowing all the link semantics. It's not more layer-violating than CSP as it only deals with request destinations.

@xiaochengh
Copy link
Contributor

Just a side note that this will be very useful for blocking=render: If a stylesheet can render-blockingly preload its font faces, then we can fully eliminate font-caused FOUT and layout shifts; And this works particularly well for 3rd party font providers, in which case developers only have urls of the font stylesheets but not the exact font files.

So I would love to see this getting landed soon.

@domenic
Copy link
Member

domenic commented Apr 28, 2022

I can go with an option where if the destination of the request is script it can process any link header, and if it's style it can process any as=font/as=img/as=style link header, allowing all the link semantics. It's not more layer-violating than CSP as it only deals with request destinations.

OK, so concretely, Fetch would contain this logic, which dispatches to HTML's "process link headers for subresources" which just assumes that if it's called it's allowed to do full Link processing. (Maybe it doesn't even need to be subresource-specific.)

That sounds pretty elegant to me. @annevk are you on board?

@noamr
Copy link
Contributor Author

noamr commented Apr 29, 2022

I can go with an option where if the destination of the request is script it can process any link header, and if it's style it can process any as=font/as=img/as=style link header, allowing all the link semantics. It's not more layer-violating than CSP as it only deals with request destinations.

OK, so concretely, Fetch would contain this logic, which dispatches to HTML's "process link headers for subresources" which just assumes that if it's called it's allowed to do full Link processing. (Maybe it doesn't even need to be subresource-specific.)

process link headers for subresources would need a list of allowed destinations, but otherwise that's the idea.
Perhaps this can still be totally inside HTML, make this check on style/script/preload-as-script/style response (with a suitable as). It does create some exception to the rule that preload is network only, but maybe that's OK.

@annevk
Copy link
Member

annevk commented Apr 29, 2022

Can the processing of these Link elements have any other side effects that need to be dealt with by the original caller of fetch? E.g., for documents Link: rel=stylesheet affects document.styleSheets. Based on that, it seems better if the caller of fetch is responsible for processing these, as they are not really a pure networking feature.

@noamr
Copy link
Contributor Author

noamr commented Apr 29, 2022

Can the processing of these Link elements have any other side effects that need to be dealt with by the original caller of fetch? E.g., for documents Link: rel=stylesheet affects document.styleSheets. Based on that, it seems better if the caller of fetch is responsible for processing these, as they are not really a pure networking feature.

Right, they feel more like an HTML feature. I'll prepare an HTML patch to handle these at a few of the call sites.

@noamr
Copy link
Contributor Author

noamr commented Apr 29, 2022

I want whatwg/html#7866 to go in first, and to start relying on the "pending document" concept where all link header processing has a "document promise" of sorts which it awaits at the moment where it really needs a document.

Otherwise I need to keep special-casing early hints and it's getting a bit out of hand

@annevk
Copy link
Member

annevk commented Feb 8, 2023

@noamr am I correct in assuming that this can now be closed?

@noamr
Copy link
Contributor Author

noamr commented Feb 8, 2023

@noamr am I correct in assuming that this can now be closed?

No. The spec still doesn't deal with subresource link headers while implementations do something with them.
I am not currently pursuing this though, not sure how I feel about this feature and whether it's actually helpful.

@noamr
Copy link
Contributor Author

noamr commented Feb 8, 2023

Closing for now, if someone wants to pursue spec'ing this feature it would probably require a new PR anyway.

@noamr noamr closed this Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements security/privacy There are security or privacy implications
Development

Successfully merging this pull request may close these issues.

5 participants