implement access control lists for gateway #1551
Conversation
force-pushed: 21e5e32 to 6d4b7b1
```go
if i.config.DenyList != nil && i.config.DenyList.Has(k) {
	w.WriteHeader(451)
	w.Write([]byte("451 - Unavailable For Legal Reasons"))
}
```
not always. we might have to block things that are not illegal (code of conduct on the service we run on our own gateway infra). probably should depend on the blocked thing... ok to merge, but this is not correct in the long run
this PR changes the behavior. previously, allowing "$foo" on the webui would allow "$foo" and all its children, BUT the children would not be able to be viewed individually, only through the parent.

we may want the old behavior, or we may want the new one. i'm not sure. anyone else have opinions on this?
@mappum @diasdavid o/ thoughts re
the problem with the old behaviour is that if you block X, all i have to do is make Y that has X as a child, and I can access it again. It didn't really provide any sort of real 'blocking'

no-- on the webui it was indeed blocked, because nothing except those things rooted at the allowed roots would work. the reason i bring it up, too, is that there are implications to how HTTP works. for example, i may want to allow root X, which has Y at

I'm leaning towards: "allowing/denying roots" as the right idea. you have to look through and make sure everything is ok to be used that way, but there's no questions otherwise.

are you talking about whitelists or blacklists? there are different implications depending on that context. In the case of a whitelist, what you say is true. In the case of a blocklist, what I'm saying is true.

i mean the webui "allowlist".

okay, the more i think of this, the more i think we're going to have to treat each list differently. the "allowlist" is going to have to use the root checking like you say, but the "do-not-allowlist" is going to need to use the technique i'm using.
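The two techniques being contrasted here could be sketched as follows. `allowedByRoot` and `deniedBySegment` are hypothetical helpers, not code from the PR; the point is that an allowlist admits any path whose root is listed, while a denylist rejects a path if any segment is listed:

```go
package main

import (
	"fmt"
	"strings"
)

// allowedByRoot admits a path when its first (root) segment is on the
// allowlist; children of an allowed root pass automatically.
func allowedByRoot(allow map[string]bool, path string) bool {
	segs := strings.Split(strings.Trim(path, "/"), "/")
	return len(segs) > 0 && allow[segs[0]]
}

// deniedBySegment rejects a path when ANY segment is on the denylist,
// so wrapping a blocked key in a new parent does not un-block it.
func deniedBySegment(deny map[string]bool, path string) bool {
	for _, s := range strings.Split(strings.Trim(path, "/"), "/") {
		if deny[s] {
			return true
		}
	}
	return false
}

func main() {
	allow := map[string]bool{"QmRootX": true}
	deny := map[string]bool{"QmBadY": true}
	fmt.Println(allowedByRoot(allow, "QmRootX/child"))   // true: root is allowed
	fmt.Println(deniedBySegment(deny, "QmOther/QmBadY")) // true: a child is denied
}
```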
```go
func loadKeySetFromURLs(urls []string) (key.KeySet, error) {
	ks := key.NewKeySet()
	for _, url := range urls {
		resp, err := http.Get(url)
```
woah, no. we should be caching this locally. the http thing should be only to update the list. if list host is unreachable, then what? no gateway? no blocking?
where should we cache it to? just inside $IPFS_PATH? should we name them after the url used to fetch them? or just blocklist.json and allowlist.json?
i think they should work kind of like branches in git:

- there's a remote endpoint with a bunch of named lists (a tree of lists, if you will)
- fetching them locally tends to use the same name, but can rebind them
not suggesting to change it to this, this is only for discussion: right now, the allowlist and denylist only stack once. other filter systems allow users to compose a stack of filters, allowing expressions like:

and so on. (this makes sense for IP cidr filters too)
License: MIT Signed-off-by: Jeromy <[email protected]>
@jbenet question about caching the lists. how should they be stored? as one list for allow and one list for deny on disk? and how often/when should they be updated?
force-pushed: 6d4b7b1 to 13374c3
@whyrusleeping re "how they should be stored": They should be ipfs objects. we can import them from JSON.
@jbenet we already talked about this... that's the reason we aren't fetching them from ipfs in the first place. the gateway is set up during node construction; fetching objects before the node is completed is a hassle.
We can fetch ipfs objects over HTTP just fine. (to clarify, i'm suggesting hosting the lists as ipfs objects (either in json or cbor (or both!)))
Can we make sure that we don't have to comply with a DMCA that shuts down the denylist?

What I'm getting at is, it'd make the denylist subject to the same absurd theater that it's supposed to make transparent.

@jbenet but if we fetch ipfs objects over http, what exactly are we caching?
the object. cache it locally because you're not guaranteed to be able to reach the host every time you launch IPFS. Basically:

```go
func start() {
	tryDownloadingNewDenylist()
	loadDenylist()
}

func tryDownloadingNewDenylist() {
	// do http request.
	// if successful, store as an ipfs object locally.
}

func loadDenylist() {
	// load local denylist ipfs object
}
```
@lgierth i think we'll be fine with distributing a list of hashes. we generate the denylist, so it's technically our copyright?

@lgierth what's in the
@whyrusleeping The response is an array of notice objects, each consisting of a URI for more information on that notice, and a list of keys to block:

```json
[
  {
    "uri": "http://dmca.ipfs.io/2015-08-03-foobar",
    "keys": ["Qmfoobar", "Qmsomething", "Qmbarbaz"]
  }
]
```

As per our chat yesterday, we can just flatten that, because the concept of a notice just isn't needed on the gateway's end:

```json
["Qmfoobar", "Qmsomething", "Qmbarbaz"]
```

Would it be a good idea to respond with a URI template for the for-more-information-click-here link?
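The flattening described above could look like this in Go. The `Notice` struct mirrors the JSON format shown; the `flatten` helper itself is a hypothetical sketch, not code from the gateway-dmca-denylist tool:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Notice mirrors one entry of the denylist endpoint's response:
// a URI with more information, plus the keys to block.
type Notice struct {
	URI  string   `json:"uri"`
	Keys []string `json:"keys"`
}

// flatten reduces the notice array to a bare list of keys, since the
// gateway only needs to know what to block.
func flatten(raw []byte) ([]string, error) {
	var notices []Notice
	if err := json.Unmarshal(raw, &notices); err != nil {
		return nil, err
	}
	var keys []string
	for _, n := range notices {
		keys = append(keys, n.Keys...)
	}
	return keys, nil
}

func main() {
	raw := []byte(`[{"uri":"http://dmca.ipfs.io/2015-08-03-foobar",
	                 "keys":["Qmfoobar","Qmsomething","Qmbarbaz"]}]`)
	keys, _ := flatten(raw)
	fmt.Println(keys) // [Qmfoobar Qmsomething Qmbarbaz]
}
```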
@jbenet so the flow i think i'm getting is:

question: where does the

also, when should we check for updates to the lists?
this o/ won't work because after getting a list you would never update again. i think we need:

```go
if updateCondition() {
	if list, err := getListViaHTTP(url); err == nil {
		dag.Add(list)
	}
}
list, err := dag.Get(listKey)
if err != nil {
	list = emptyList
}
```
I think we shouldn't go over an hour with a stale one. another point is that long-running daemons should poll over time to remain up to date. this is way easier if we make the lists just return ipfs refs. see below.
(could use the main api endpoint to return a head (i.e. a hash of the head) and then grab the head from the global http gateway itself. so

could be just hashes, which we can then get with:

that way, the json / cbor / whatever encoding question is handled by the API. (yay!) we want these changes to propagate fast through the network
Should we go with something like this?

```
> GET https://gateway.ipfs.io/refs/lists/denylists/dmca HTTP/1.1
< HTTP/1.1 302 Found
< Location: /ipfs/$hash
<
< $hash
```

```go
type DenyList struct {
	Objects []Link // all DenyObjects
}

type DenyObject struct {
	Object Link // the object that's supposed to be blocked
	Notice Link // the dmca notice object
}

type DenyNotice struct {
	Value Data // opaque value
}
```

Get all

```
> GET https://gateway.ipfs.io/ipfs/$deniedHash
< HTTP/1.1 451 Unavailable For Legal Reasons
<
< <a href="/dmca/$deniedHash">Show DenyNotice object,
< and the DenyObjects linking to it</a>
```
@lgierth that LGreatTM. One minor thing, for the
@jbenet the gateway would have to keep the reverse-links from Object to DenyObject in order to get the notice. The idea above was that the little gateway-dmca-denylist daemon would take care of that, so that we don't have to introduce any logic to the gateway apart from basic denying. Or is there an API function for getting objects linking to a given hash?
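One way to avoid reverse-link queries is to build the index once when loading the list. The sketch below uses plain hash strings in place of `Link`s and mirrors the `DenyObject` schema from earlier in the thread; names are illustrative:

```go
package main

import "fmt"

// DenyObject mirrors the schema sketched above, with hashes standing
// in for Links. (Illustrative types, not the PR's code.)
type DenyObject struct {
	Object string // hash of the blocked object
	Notice string // hash of the dmca notice object
}

// buildNoticeIndex maps each blocked object's hash to its notice, so
// the gateway can render the DenyNotice for a 451 page without asking
// "what links to this hash?" at request time.
func buildNoticeIndex(list []DenyObject) map[string]string {
	idx := make(map[string]string, len(list))
	for _, d := range list {
		idx[d.Object] = d.Notice
	}
	return idx
}

func main() {
	idx := buildNoticeIndex([]DenyObject{
		{"Qmfoobar", "QmNotice1"},
		{"Qmbarbaz", "QmNotice1"}, // one notice can cover several keys
	})
	fmt.Println(idx["Qmfoobar"]) // QmNotice1
}
```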
@lgierth we could probably prebuild all the pages statically every time we add to the list. and maybe we could link to a server we run then. don't want to put tons of
gateway-dmca-denylist now builds this kind of list: https://ipfs.io/ipfs/QmRER7erZxU63huYgSBryGhKrfHdkDkVjwQTd8RD4RdSW5 ... which is a unixfs structure, which I figured is the simplest to implement and look at (thanks to the dir-listing). It fans out based on the keys associated with each notice, so that e.g. a notice with 3 keys will result in 3 objects linked in the denylist. Each of these objects links to the rendered notice, and the object to be blocked as identified by the key. The rendered notice is the same for all these.
@lgierth feedback. Looking good! what if instead of:

instead do:

this could:

Or is there a reason to have the hashes accessible from the root? is it about finding them easily? (could still do find over levels, just a bit harder. but if find is a problem, then we probably have to land unixfs dir sharding) there's also something to be said for:

but this index can be built later anyway
Regarding the hashes being present at all, I'm only appending them to the link name so that I don't end up with name clashes. These suffixes could as well be a counter (

Regarding this being a flat list, I thought that'd be easier to parse on the client side. I'm happy to make it

Sure
@whyrusleeping wdyt?

@lgierth i can probably handle it.
closing; rebasing and making this work with today's codebase would require more work than rewriting it from scratch.
License: MIT
Signed-off-by: Jeromy [email protected]