Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matching hint responses with requests #5

Closed
willchan opened this issue Jul 9, 2014 · 27 comments
Closed

Matching hint responses with requests #5

willchan opened this issue Jul 9, 2014 · 27 comments

Comments

@willchan
Copy link

willchan commented Jul 9, 2014

Caching is somewhat of an implementation detail that might not belong directly in the spec, since there aren't any cache directives here beyond those that already exist in HTTP. Rather, we may need to specify more cleanly the separation between downloading and "processing" a resource.

The way this is written assumes a lot of characteristics of a user agent implementation that aren't specified. For example, the Blink uses an in memory cache that sits on top of the browser HTTP cache. 300 seconds may make sense for now, but if this feature gets heavily used, not to mention people start prefetching huge resources that a user agent doesn't want to keep all in memory (what if you prefetch a video?), then obviously the user agent will just evict the entry. Or actually, we'd probably apply some sort of flow control to only prefetch up to a certain number of bytes per resource.

I think specifying it less and just saying that the user agent should try to keep downloaded resources around until they are actually processed, within resource constraints amongst other constraints, is probably enough. And maybe offer a suggested minimum time period to try to maintain.

@igrigorik
Copy link
Owner

Great points, consistent with feedback from @mnot as well. Took a first run:
https://igrigorik.github.io/resource-hints/#matching-request

How does that look? I'm sure the terminology and wording could use some massaging.

@willchan
Copy link
Author

I'm not enough of an expert in the relevant specs to know whether "navigation context" is sufficient. But I'm sure that eventually some expert will chime in and go all spec lawyer on you here.

My only other nit is to explain the 300 seconds. Basically, you should say that the user agent should try to keep the response alive until processed, and that discarding after 300 seconds might be a valid heuristic. Maybe my suggestion isn't necessary because in your new wording, you explain at the beginning of the section already. I dunno, I'm not a good editor :) Up to you.

If you want to close this issue now, I'd be fine with that.

@igrigorik
Copy link
Owner

I'm fairly certain that navigation context is, in fact, not the right term.. But I'm not sure what the right one is either. In Blink-speak, I want to say "renderer", but I'm not sure that will fly either.

@willchan
Copy link
Author

Yeah, "renderer" will be wrong, because of cross process navigation, where we move browsing context or whatever into a new process. Actually, I think we're moving more and more of that browsing context into the browser process instead, due to these cross process navigation issues and site isolation sandboxing.

In any case, some standardsy / speccy person will come up with the right term eventually :)

@mnot
Copy link

mnot commented Jul 11, 2014

Better, but I agree with willchan that 300 secs is just a suggestion for the heuristic, not a hard requirement.

@annevk, help on the precise language?

@igrigorik
Copy link
Owner

Relevant bits from public-webperf: http://lists.w3.org/Archives/Public/public-web-perf/2014Jul/0089.html

@willchan @annevk @mcmanus any suggestions for wording? This is what I have so far:
https://igrigorik.github.io/resource-hints/#matching-request

@annevk
Copy link

annevk commented Jul 27, 2014

It seems to properly define this you need to define a Document-wide cache of sorts. We also need to start defining that for other purposes, to keep track of what fetches a Document (or a Worker for that matter) does and such.

It seems the scope would be the global environment (request's client), though all that is still a bit unstable.

Also, in non-normative text such as a Note you do not want to use normative terms such as SHOULD.

@mnot
Copy link

mnot commented Jul 28, 2014

@annevk
Copy link

annevk commented Jul 28, 2014

The image URL cache is a bit different and more sad. It's also not limited to a single environment as I understand it.

@igrigorik
Copy link
Owner

(thinking out loud...)

Its the prefetch/prerender that's tricky to define...

  • "Resources fetched via preload are retained by the user agent until they are processed by a matching request from the same navigation context." - preloads are initiated for current page and occur in the same renderer/process/context/etc. I think this works...
  • "Resources fetched via prefetch/prerender are retained by the user agent until they are matched with a request." - the matching request can originate in a different renderer/process/context/etc.

What if I split the definitions along the above lines? The benefit is that it clarifies when the retained preload requests can be purged, but still provides enough flexibility for prerender/prefetch.

Alternatively, just drop the 'from the same navigation context' part entirely and keep it same for all three?

@willchan
Copy link
Author

@annevk Can you explain in further detail why it's necessary to spec out a Document wide cache? I'd ideally hope to avoid speccing that, and handwave over it. Basically, it's a lot of work to spec, and I don't want to unnecessarily constrain implementations.

@annevk
Copy link

annevk commented Jul 29, 2014

Well, various drafts from the performance WG track all the requests issued by a document. A document also passes all its requests through a service worker. Once a document is destroyed in some way all those fetches are terminated. That indicates there is some tracking going on already which we need to define. It seems this feature simply plugs in that model.

@willchan
Copy link
Author

OK, I think I misunderstood. You are more specifically talking about how requests are tracked. Sure, I agree that it's worthwhile to specify them more precisely. The HTML spec already hints at this in some locations. I'm mostly concerned about specifying another level of cache semantics.

@igrigorik
Copy link
Owner

@willchan @annevk given that this stuff is still all a bit hazy.. What's reasonable language in the meantime? Does my earlier suggestion make any sense: #5 (comment) ?

@annevk
Copy link

annevk commented Jul 29, 2014

"Resources fetched via preload/prefetch/prerender are retained by the user agent until they are [fetched] with a matching request" seems okayish. I think what you want to define here is what you consider matching. Does the method need to be the same? Just the URL? Headers? There's a ton of variables fetches have and more and more will be exposed to developers.

@igrigorik
Copy link
Owner

@annevk I was hoping I wouldn't have to define "matching" here.. we don't have anything in Fetch spec or elsewhere? Seems like HTTP spec should have a definition? /cc @mnot

@annevk
Copy link

annevk commented Jul 30, 2014

Perhaps HTTP has something. Not sure if that's interoperable, but maybe one day it could be.

@willchan
Copy link
Author

So this whole concept of "matching" is complicated, and especially poorly matched across user agents. Here's some discussion on Chromium's net-dev mailing list.

@igrigorik
Copy link
Owner

Update: 7768cb2, live: https://igrigorik.github.io/resource-hints/#matching-request

WDYT? I'm intentionally keeping "matching" vague since it seems to be UA specific.

@annevk
Copy link

annevk commented Jul 31, 2014

I'd like you to specifically call that out, via a note or some such and perhaps an XXX comment in the source that it needs fixing at some point. We should strive to get rid of UA-specific things. Calling them out as problematic is a good first step.

@igrigorik
Copy link
Owner

@annevk makes sense. how does that look?

@annevk
Copy link

annevk commented Jul 31, 2014

E.g. "Note: matching is not currently defined or very interoperable across user agents. [Link] hopes to improve that." with [Link] pointing to some open issue somewhere.

@igrigorik
Copy link
Owner

@annevk any recommendations for where to kick this off, or who would be a good person to drive it? Perhaps this is something that could live in Fetch? Intuitively, seems like a good place for it?

Also, any thoughts on using internal? (see my earlier comment)

@annevk
Copy link

annevk commented Aug 1, 2014

Isn't internal a different bug?

I'm not sure where matching should live. @jakearchibald is defining something for service worker's storage thingie. It seems the lowest level thing that requires it is either that service worker thing or the HTTP cache.

@igrigorik
Copy link
Owner

Not exactly what we're after, but a useful reference nonetheless (courtesy of @mnot):
http://httpwg.github.io/specs/rfc7234.html#constructing.responses.from.caches

@mnot
Copy link

mnot commented Aug 6, 2014

Better hit in fetch step 11:

If the resource is identified by an absolute URL, and the resource is to be obtained using an idempotent action (such as an HTTP GET or equivalent), and it is already being downloaded for other reasons (e.g. another invocation of this algorithm), and this request would be identical to the previous one (e.g. same Accept and Origin headers), and the user agent is configured such that it is to reuse the data from the existing download instead of initiating a new one, then use the results of the existing download instead of starting a new one.

as per Simon in the bug I mentioned before.

@mnot
Copy link

mnot commented Aug 6, 2014

BTW, I don't think that's a particularly good definition of "identical request"; other headers that might change things include Accept-Encoding, Accept-Language, Prefer, If-Modified-Since, If-None-Match, Range, etc. etc.

@igrigorik igrigorik changed the title Caching grace period limitations Matching hint responses with requests Aug 14, 2014
@igrigorik igrigorik removed the bug label Aug 14, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants