Precise specification of the DID Core Vocabulary #404
To put my money where my mouth is... I have created an initial draft for a DID Core Vocabulary and for a DID Core SHACL Shape. Both are based on my understanding of the vocabulary from reading the spec; I may have also misunderstood it here and there. (Based on my comment earlier today, the vocabulary separates a …) This is obviously not final, but maybe it will help us to come up with a really precise specification of the vocabulary. Also, my experience/knowledge of SHACL is fairly shallow, although I could make sense of the spec based on my background in RDF. But if it becomes a "production" version we may want to talk to someone who is more of an expert than I am. If you want to test all this (and somebody should...): I used a Python-based SHACL implementation called pySHACL, based on RDFLib. I installed it easily and it worked out of the box. I used the following command line:
which means that the tool performed RDFS inferencing on the data, using the vocabulary, before checking the shape (that is important to get the subclassing worked out). Note that I did not try to incorporate the JWK key structures into the vocabulary or the shapes; JWK is just an unknown class for now. Note also that this may be a good tool to create tests on DID documents... |
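(For reference, a roughly equivalent check can be run through pySHACL's Python API rather than the command line. The sketch below is illustrative only; the file names are placeholders, not the actual draft files.)

```python
# Illustrative sketch only: validate a DID document (as RDF) against a draft
# SHACL shape, mixing in a draft vocabulary with RDFS inferencing enabled so
# that subclassing is resolved before the shape is checked.
from rdflib import Graph
from pyshacl import validate

data = Graph().parse("did-document.ttl", format="turtle")          # DID document under test (placeholder name)
shapes = Graph().parse("did-core-shape.ttl", format="turtle")      # draft DID Core SHACL shape (placeholder name)
vocab = Graph().parse("did-core-vocabulary.ttl", format="turtle")  # draft DID Core vocabulary (placeholder name)

conforms, _report_graph, report_text = validate(
    data,
    shacl_graph=shapes,
    ont_graph=vocab,     # the vocabulary is mixed into the data graph...
    inference="rdfs",    # ...and RDFS inferencing is applied before checking
)
print(conforms)
print(report_text)
```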
Is DID Core SHACL compatible with the JSON-only representation?... I think they are fundamentally incompatible, because the JSON-only representation supports arbitrary properties that are not included in the registries. |
I have to think it over, but I am not sure. SHACL is a bit like JSON Schema (actually, it would be interesting to create a JSON Schema from the SHACL). It puts constraints on properties that are in use, but it is silent on other properties. Isn't that similar to JSON? Putting it another way: if we have a JSON file for a DID Document, we can turn it into JSON-LD in a trivial manner, check it with SHACL and, I believe, the possible validation errors would be valid for the original source, too. |
this is not true today, and a number of open PRs are seeking to make it even less true tomorrow. one of the major features for JSON-only is that it has no reliance on "schemas / registries"... I suspect a lot of the JSON-only representations would be incompatible with RDF triples, because if that's a thing people cared about, they would probably have just used JSON-LD.... I think members of this WG, including @dhh1128 @jricher @peacekeeper @talltree and @selfissued, have a strong desire for the JSON representation to be capable of being incompatible with RDF.... and the flexibility that a separate JSON representation enables is viewed as a benefit by some and a security / interop failure by others. @dhh1128 has some great comments advocating against an abstract data model for DIDComm here: decentralized-identity/didcomm-messaging#112 (comment) (which I agree with).
I've not seen a lot of work on JSON-only, but I have seen a lot of work trying to make it incompatible with JSON-LD.... this leads me to believe that a primary design goal of the JSON-only representation is to support DID document representations that are NOT compatible with RDF / linked data.... this incompatibility (flexibility) is viewed as a feature which is only possible to achieve by creating a new representation for JSON-only.... In short, the argument of some goes like this:
I don't agree with these arguments, for a number of reasons, but it's clear that many in this working group also don't agree with me :) |
@OR13, I try to keep away from the general, slightly ideological discussions... At the moment the DID Core document is based on (1) an abstract data model and (2) a manifestation of that data model in JSON-LD and JSON. (And possibly CBOR; let me put that aside for now, not being a CBOR expert at all.) Also, at the moment, the DID Core document is referring to terms like …

As I said in #404 (comment), we should have all the constraints on the model described in clear terms, using INFRA terminology, because that should be part of the abstract model. I.e., that the …

For the JSON-LD version, because, in effect, we define an LD vocabulary, we are supposed to provide a vocabulary file in RDFS and, because RDFS cannot express constraints, SHACL comes into play. This is just being good LD citizens, but it also gives us a proper, unambiguous specification of the vocabulary. For JSON, ideally, JSON Schemas could be used to express those same constraints, except for the problems around JSON Schemas mentioned in my aforementioned comments.

My off-hand remark about turning the JSON into JSON-LD was not meant as a general, fundamental step, just a very down-to-earth, practical one: if we do not have a JSON Schema, then one way of checking a JSON serialization of the model (which, at this moment, is still based on using the same property names as the JSON-LD one) is to do that. That being said, I agree that it is not really clean and stirs up too many questions, so I would be fine having a (non-normative) JSON Schema for the same purpose. I am not very good at JSON Schema but, based on my experience, it should be easy to translate the SHACL constraints into JSON Schema. Both the SHACL and the JSON Schema can express the open-ended requirement, namely the fact that there may be additional properties added to the DID document, not condoned by this specification. I do not see that as a problem. |
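(To illustrate that last point: a minimal, hypothetical sketch of such a non-normative JSON Schema check, using the Python jsonschema library. The property constraints here are simplified placeholders, not the spec's normative rules; the only point being made is that unknown properties are not rejected.)

```python
# Hypothetical, simplified schema fragment: it constrains a couple of core
# properties but leaves the DID document open-ended, i.e. properties not
# "condoned by this specification" are still allowed -- mirroring what a
# SHACL shape does on the RDF side.
from jsonschema import validate  # pip install jsonschema

did_document_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string", "pattern": "^did:"},
        "alsoKnownAs": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["id"],
    "additionalProperties": True,  # the open-ended requirement
}

doc = {
    "id": "did:example:123456789abcdefghi",
    "alsoKnownAs": ["https://example.com/profile"],
    "someUnregisteredProperty": {"anything": "goes"},  # not rejected by the schema
}

validate(instance=doc, schema=did_document_schema)  # raises ValidationError only on real violations
```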
Just jumping in to say whaaat?!? I thought the point of the DID spec registries was to support interop between JSON implementations, because they have no other way to talk to each other about the terms they use except through a centralized registry. My recollection is that people pushing for JSON-only support were the ones pushing for the registries in the first place. Sorry for that slight tangent, Ivan. Agreed we need a cleaner layout of the core vocab. I thought this was something that might happen in the Registries, but then everything got reshuffled so there was less differentiation between properties from DID Core and other places. I made a half-hearted attempt at a handy listing at the start of the Core Properties section, but that obviously pales in comparison to what you're working on. I hadn't thought that whether we'd use a different namespace from security was in question; https://www.w3.org/ns/did/ has been mentioned in various places for a while. It's more that DID Core uses some terms from security v1 (in a way that only matters to JSON-LD implementations, as it is only apparent in the context) than that security v1 was intended to cover everything for DID Core. |
I do not know. At this moment there is no trace of anything else but the security vocabulary. |
I know... surprisingly, some of the proponents of creating the registries and an abstract data model desire to enable DID representations that do not use the registries or the abstract data model, and that are not RDF-esque... I am supportive of this, as long as we don't destroy JSON and JSON-LD common sense to achieve it. |
I mostly don't have a strong opinion about the question implied by the issue title, but I feel the need to comment on Orie's initial characterization of my thinking, since he mentioned me by github handle. He's not entirely wrong, but I want to add nuance. :-)

I am not opposed to a DID document being interpretable as JSON-LD, if I don't have to lift a finger to make this true. I am opposed to going out of my way to make it so. The typical assertion that I hear from JSON-LD proponents at this point is: "Good news. You don't have to lift a finger. Just include …"

I have never believed that JSON-LD-style extensibility is a good idea for DID docs; I think it complicates something that should be dirt simple. Less extensibility = more traction for the core. I was never in favor of creating registries; I accepted them as a tolerable but undesirable compromise. I would be happier to just extend JSON by putting new fields into it, documenting the update, and letting the chips fall where they may. That's how the JSON or XML of early RESTful web services used to work: duck typing FTW. I viewed registries as something insisted upon by JSON-LD proponents. I'm surprised at the implication in this thread that ADM proponents were behind the registries movement. Maybe I'm just misinformed... I'm not trying to argue a point here. I'm reconciled to the compromise we worked out in the F2F.

I will make some of my DID method work JSON-LD compatible to broaden its appeal, and I will probably register some extensions (and corresponding JSON-LD contexts) to broaden their appeal, too. But I don't feel a need to register every extension, or to register at the beginning of an extension's lifecycle; I might register an extension only when it has enough gravitas to make interop important. That doesn't mean I don't care about interop, or that I'm actively trying to sabotage JSON-LD. It just means JSON-LD extensibility has little value to me WRT DIDs, so I'm minimizing the attention I pay to it. (And lest you think I'm down on JSON-LD, I feel quite differently about its relevance to VCs.) |
@dhh1128 wrote:
Haha, no way! The JSON-LD proponents were very much opposed to the registries -- they are unnecessary for JSON-LD. The registries were put in place by the JSON-only folks -- because they needed a place to specify terms so they could interop. JSON-LD doesn't need such a mechanism because it already has @context. You may also be surprised to know that the "define the registry in JSON-LD" approach exists because the JSON-only folks then didn't show up (zero PRs submitted) to define JSON Schema and other stuff for the JSON-only representation.
Yep, that's the JSON-LD way -- lazy registration and progressive interop -- none of this heavyweight "registry in the beginning" stuff... but we made the compromise we did to placate the JSON-only people that wanted absolutely nothing to do with JSON-LD. Now, to be fair... the registry has grown on me a bit -- we were able to build a bigger tent because of it and the abstract data model. It does come in handy for representations that do not have a mechanism like @context... but don't think for a second that the JSON-LD folks argued for a registry -- it was the opposite. The registry was a compromise that was made, that has ended up not being perfect, but brought more people to the table, with the downside of additional complexity. |
Oh no! :(((( The DID Spec Registries is the DID Core Vocabulary... if it's not, we're duplicating a ton of work. We probably need to get a resolution on this quickly... don't want to generate a ton of unnecessary work for everyone... we need to converge on the concept above. |
@msporny the Spec Registries document is
Hence, in my view, the need for a formal vocabulary; that is what I did in Turtle. I have no intention of re-writing the Registries' text, that is for sure! |
We've come full circle! I fear I have wasted a tremendous amount of time trying to get the registries to work for multiple representations with mostly no help.... If I seem frustrated with the abstract data model, this is why. At this point, I don't understand why the registries were even created: they are not required for JSON or JSON-LD, nobody contributes to them, AND we appear to need to define all the core terms in DID Core anyway, in an abstract data model, with support for special types like URIs, Dates and Cryptographic Keys..... perhaps we should just start from scratch?
I need to see an INFRA definition for every normatively defined did document property, resolution option, and resolver metadata property, to believe that INFRA is a good idea.... In the absence of that, I believe it must be removed from the spec.... if we are going to have an abstract data model, we must do it right... if we are not, we should not pretend we have an ADM.... strong consensus can lead to a lot of compromises that are confusing for spec readers / implementers.... Are the folks who proposed the ADM ready to do the PRs necessary to define every term in INFRA? If we can't get consensus on defining every term in INFRA, what happens to the ADM? |
This was done a while ago for every property in the DID Document. If there is a DID Core property that exists today that isn't defined in INFRA, that's a bug and should be fixed. Can you please point to a DID Core property that is not defined in terms of INFRA? It is true that the resolution/dereferencing options aren't described in INFRA, but that's fairly easy to do since they're all strings at a minimum, with these more complex types (URL, DateTime) used if the group comes to consensus on including them. Otherwise, we'll just fall back to strings for everything resolution/dereferencing related. |
@msporny |
Not true; that keeps being asserted, but the DID Registries exist partly because of the desire to interop in JSON-only ecosystems. Granted, that's mostly theoretical at this point since everyone seems to be using JSON-LD... but the theory holds. You don't know if you can interop in a JSON-only world w/o a registry -- that's the path the JSON-only folks chose to follow for JSON-only interop. It also helps cross-representation interop because folks know what properties are used in a global sense. |
It's arguable whether DID Core should specify those in INFRA, as those properties are really LD Security properties and should be defined over there. That said, if we want to define them in DID Core, it's easy enough to do in INFRA:

publicKeyJwk - If specified, the value of the property MUST be a single map (INFRA) containing allowed properties according to RFC 7517 (JWK). The properties MUST NOT include those that are marked as "Private" in the "Parameter Information Class" column in the JOSE Registry.

publicKeyBase58 - If specified, the value of the property MUST be a string (INFRA). The string is expected to conform to the Base58 Bitcoin encoding rules (BASE58-I-D).

So, easy to do in INFRA. Also, I suggest that we don't do that and just push the definitions off to the Security Vocabulary or a particular Linked Data Security suite. What else do you need to see to convince you that we can use INFRA as the base for everything, Orie? I can continue to write definitions in here until we've exhausted all of them. |
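(As a sanity check on those two definitions, here is a small, non-normative Python sketch of the same constraints applied to a parsed verification method. The set of "Private" JWK parameters is my reading of the JOSE registry, not text quoted from it.)

```python
# Non-normative sketch: the two INFRA-style constraints above, applied to a
# verification method parsed from a DID document into a Python dict.
PRIVATE_JWK_PARAMS = {"d", "p", "q", "dp", "dq", "qi", "oth", "k"}  # parameters typically marked "Private" in the JOSE registry

def check_verification_method(vm: dict) -> list[str]:
    errors = []
    if "publicKeyJwk" in vm:
        jwk = vm["publicKeyJwk"]
        if not isinstance(jwk, dict):
            errors.append("publicKeyJwk MUST be a single map")
        elif PRIVATE_JWK_PARAMS & jwk.keys():
            errors.append("publicKeyJwk MUST NOT include Private-class JWK parameters")
    if "publicKeyBase58" in vm and not isinstance(vm["publicKeyBase58"], str):
        errors.append("publicKeyBase58 MUST be a string")
    return errors

# Example: a JWK that leaks the private exponent "d" would be flagged.
print(check_verification_method(
    {"publicKeyJwk": {"kty": "EC", "crv": "P-256", "x": "...", "y": "...", "d": "..."}}
))
```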
Just reacting to the distress call of @OR13 :-) I have raised this issue, i.e., I may be responsible for the mess. I guess it is up to me to try to settle the confusion. TL;DR: I do not really think we have as big a problem as it may seem :-) The way I see the various building blocks is as follows:
Some of my comments in #401 and #403 were editorial in this respect: some of the keys (properties) are not properly defined as being part of the Core set of terms (or it is not clear that they are part of it), and they should be made editorially much more visible. The constraints may not be fully clear. All editorial stuff, see my comments there. That is on the abstract level.

We then do have a separate section on the JSON-LD, JSON, and CBOR mapping to and from the ADM. It is there (modulo some editorial issues), and not really complicated; after all, INFRA already defines the mapping to and from JSON.

My original comment in this issue was on the JSON-LD conversion. Indeed, JSON-LD means connecting our ADM and the property definitions and constraints with the Linked Data world. My statement is that for this case, if we want to be good citizens, we need more than what we have, namely a proper RDF based vocabulary (and of course a separate …).

I also think that it is worth doing something similar for the LD-less JSON world, and the obvious candidate would be to express the terms and their constraints in JSON Schema. I don't think it would be complicated. (Doing SHACL, as well as JSON Schema, also forces us to reveal all details of our vocabulary spec, so it helps us! And it is a service to the community, too.) |
I think I disagree with you, @msporny, but only on the last sentence above. Having re-read the spec last week, it is fairly clear that both |
Well, if we start defining those things in DID Core, things become strange... because those are things that a Linked Data Security WG should be defining :). Our charter also, arguably, forbids us from working in this area: https://www.w3.org/2019/09/did-wg-charter.html
Given that Microsoft fought to have that language added to the Charter, I was surprised when @selfissued suggested that we should define those sorts of things. I'm fine either way, just pointing out that there be dragons. |
:-) But we are not defining a crypto mechanism. A crypto mechanism, in my book, would be if we defined a new crypto algorithm for the purposes of DID, for example, and that would be a big no-no. We do not even define how we would express JWK; we just say "this is the way we refer to it". I think this is a very kind baby dragon :-) |
I don't disagree with you on that point (in general)... alright, well... you get to feed that baby dragon... it's yours, man. Let's just hope it doesn't grow up too quickly. :P |
I'm trying to figure out if there is anything that needs to be done here. Yes, it would help to have formal definitions of everything, but that said, it doesn't seem like people are having that hard of a time implementing the specification as it stands today. I suggest we make precise definitions a stretch goal and if we don't meet that stretch goal, that's fine. At present, everything is defined in JSON-LD and INFRA. @jonnycrunch is working on CDDL. No one really contributed to the JSON Schema stuff. I suggest all of this stuff is non-normative helpful stuff that we don't need to put in the critical path for CR or Proposed Rec. |
I do not think we can jump to that conclusion. We do not have a comprehensive test suite that would include all the various constraints on the terms that we may or may not have, and we have no idea whether implementations arrived at the same conclusions based on a specification that does not make the term definitions, and their constraints, crystal clear. See my note at the end of my comment, b.t.w.
I do not think I agree. Where would you want to have a precise definition if not in the standard?
I would not agree with "is defined in JSON-LD". There is a generic narrative but no real formal definition. By using JSON-LD, this specification defines a vocabulary in RDF/Linked Data, and the proper way of doing that is to define the vocabulary (e.g., to make it very clear what we define in the spec and what is defined, possibly, elsewhere). At the minimum, something like the RDF vocabulary draft I created should be updated, checked, and published alongside the spec at the right namespace URL. I like CDDL and, if that works, I am happy to use that for a formal specification of all the constraints. In that case the draft SHACL shape may indeed be "non-normative helpful stuff". But I know that while I was writing the shape down I realized I was not sure about the constraints, in fact, because the spec was not clear (note that @jonnycrunch reported similar issues). From the spec's point of view I would actually consider both the SHACL and @jonnycrunch's CDDL drafts as very special implementations, and that contradicts your statement that "it doesn't seem like people are having that hard of a time implementing the specification as it stands today". |
What we need is the following:
|
Does this mean w3.org/ns/did/v1? The RDF definition file can be at w3.org/ns/did/v1 as well, with conneg. And I would propose the file itself lives in the DID spec registries repo (e.g., with the JSON-LD context). I don't think this needs to hold up CR. I think the rest of this issue is about making sure there are no terms mentioned in the spec prose which aren't actually defined in the spec. The rewrite of the Verification Methods section should sort that out once and for all (see #423 and #240). Then we need to highlight in any examples where non-core terms are used (for illustrative purposes), which has its own issue: #447. Finally, it sounds like a summary table of the core properties and their constraints (in INFRA, not RDF) would be helpful. At the start of the core properties section there is a bullet list of them to help navigate. This could be turned into a table with a little more information - would that help @iherman? I'm happy to put a PR in for that if so. All that to say: I think the critical part of this issue is being covered by other issues, and what remains is editorial. I re-read this thread to make sure I didn't miss something, but let me know if that doesn't sound right. |
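(On the conneg point, a quick illustrative probe, assuming hypothetically that the namespace is eventually set up to serve both a Turtle vocabulary and the JSON-LD context at https://www.w3.org/ns/did/v1 — which is exactly the arrangement being discussed here, not something already deployed.)

```python
# Illustrative probe: ask the (assumed) namespace URL for different
# representations and see what content negotiation returns.
import requests  # pip install requests

NAMESPACE = "https://www.w3.org/ns/did/v1"

for accept in ("text/turtle", "application/ld+json"):
    resp = requests.get(NAMESPACE, headers={"Accept": accept})
    print(f"{accept} -> {resp.status_code} {resp.headers.get('Content-Type')}")
```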
I am not sure that this navigational aid can be easily extended. To give you a specific example, the … But we may have to experiment with various approaches. Note, b.t.w., that we have agreed to have a topic call on this, so it is probably better to wait until that call before you propose a PR. |
Creating another parallel normative representation of the spec in RDF is an unnecessarily large body of work and will almost certainly result in inconsistencies between the actual spec and the RDF. Therefore, we should decide not to do any RDF work. If the spec text is unclear in particular ways, we should absolutely make clarifications in the normative text. Trying to do this work by creating a parallel description in RDF is unhelpful and unnecessary. |
@selfissued should we use any kind of schema tooling, like CDDL, JSON Schema, SHACL, or should we go out of our way not to define the shape of the ADM in anything other than INFRA and hand-crafted normative statements? I don't care about RDF... I do care about relying on English-language interpretation for security. |
As stated in https://w3c.github.io/did-core/#conformance , "As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative." Therefore, there are many many normative statements in the spec that do not use RFC 2119 language. Our tests should therefore test important normative statements not using RFC 2119 language as well as those that do. |
@selfissued sure, my point is that writing tests for English-language statements sucks, and writing tests for JSON Schema, CDDL, or SHACL is a much better solution! :) |
We define the DID document serialization in, among others, JSON-LD. The "LD" in JSON-LD stands for Linked Data, and JSON-LD is, in fact, a JSON-based syntax for RDF. We are already there. What we are discussing (beyond the issue of properly representing constraints, which is to be done regardless of RDF) is to do the work properly, i.e., if we use RDF then the vocabulary should be described in a format that the Semantic Web community expects to have. Otherwise, we would do a sloppy job. The real issue is not RDF. The real issue is to express all the terms and their constraints in an unequivocal way. Creating the RDF-formatted vocabulary is not a real issue. |
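(To make "JSON-LD is a JSON-based syntax for RDF" concrete, a small illustrative sketch: a JSON-LD DID document can be parsed into RDF triples with an off-the-shelf library, and those triples are what an RDFS vocabulary or a SHACL shape would then describe. The document content below is a toy example.)

```python
# Illustrative only: parse a toy JSON-LD DID document into RDF triples.
# rdflib (version 6 and later) ships JSON-LD support; note that the remote
# @context is fetched over the network when the document is parsed.
import json
from rdflib import Graph

doc = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:123456789abcdefghi",
    "alsoKnownAs": ["https://example.com/profile"],
}

g = Graph()
g.parse(data=json.dumps(doc), format="json-ld")

for s, p, o in g:
    print(s, p, o)  # the triples a vocabulary/SHACL shape would be checked against
```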
Oops - my last comment was meant to be in #384. See #384 (comment) . |
The issue was discussed in a meeting on 2020-12-15
View the transcript: 2. DID Core vocab. See github issue #404. Brent Zundel: this was raised by ivan, folks have commented. Manu Sporny: ivan, this is mostly a question to you - what are the expectations here?
See github pull request did-spec-registries#170. Orie Steele: I linked a PR. I had a conversation with amy related to this on the did spec registries. Ivan Herman: I was probably not clear in what I wrote, but manu, unfortunately, this is not what I meant. Manu Sporny: that is helpful, thanks ivan.
Ivan Herman: as I said, I am perfectly fine to use cddl for that. If that's the way we do it, that's fine. Manu Sporny: no. Ivan Herman: it should be. Jonathan Holt: the cddl spec itself is normative and it extends the abnf rules; it is a normative description of the constraints and type definitions.
Jonathan Holt: I think cddl tries to not boil the ocean in that way, it really is a constraint satisfaction which is mathematically proven. Michael Jones: I push back on the idea that RDF is necessary or even useful. Brent Zundel: if we could have those comments occur in the issues I think that would be best. |
As discussed on the call on 2021-01-05, current actions are:
|
@msporny I do not think this needs a special call any more. Should we remove the label? |
Admin comment on the issue status. With the merge of #536 (and closure of #528) I would think there is no reason to keep the label. However, I would prefer not to close this issue, because w3c/did-extensions#182 and w3c/did-extensions#183 are also related to it; in other words, we should close the current issue only if those two PRs are, in one way or another, handled and settled. |
I'd rather not remove the flag as that's how we're keeping track of all open issues for this spec... we could downgrade to p3, or more preferably, get 182 and 183 merged so we can close this issue. |
I let the higher authorities decide :-) |
With the merge of w3c/did-extensions#183 I believe this issue can be closed. @msporny ? |
Agreed, closing. |
(I decided to spin off this issue from #403, because it has become a bit intricate and having it as a separate issue may be helpful.)

At the moment the Core spec contains property specifications (authentication, alsoKnownAs, etc.) in English prose. These are scattered all over the place, and that is confusing. There is no one place where all the terms, classes, and constraints are properly defined, are referencable, and possibly easily usable for testing or by implementations. I think it is necessary to provide a clean, crisp, and unambiguous specification as part of DID-Core. This specification should account for both the JSON-LD (i.e., Linked Data) and pure JSON worlds, as well as CBOR.

The only reference I found at the moment is The Security Vocabulary, which contains the DID Core terms (although it is incomplete; alsoKnownAs is, for example, missing). This is the vocabulary referred to (via the https://w3id.org/security# namespace) from the JSON-LD @context, i.e., this is where one gets by "following one's nose". (Note that this set of references is only meaningful for JSON-LD.) But there are problems with the Security Vocabulary:

I believe what we may want to do:

… https://w3id.org/security#; it should use a namespace URI that would return the normative RDFS file of DID-Core if dereferenced as an RDF graph. That may create some hiccups vis-à-vis the current security vocabulary, though.

I believe (1) above is the most important for the integrity of the DID Core specification. The other items may be non-normative, but should definitely be considered as stable and be part of the Working Group's output.