Skip to content
This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Clarify that ipld.get(cid) returns an object with keys that are resolvable #141

Closed
Mr0grog opened this issue Aug 7, 2018 · 7 comments
Closed

Comments

@Mr0grog
Copy link

Mr0grog commented Aug 7, 2018

In ipld/js-ipld-ethereum#25 @olizilla ran into some issues with the node representation that is returned from ipld.get(cid) for Ethereum objects. After I looked into it, it turns out the same issue exists for the Bitcoin and Zcash resolvers.

Specifically, when calling ipld.get(cid), some resolvers (dag-pb, dag-cbor, git) return an object where the keys represent resolvable paths. Other resolvers (bitcoin, ethereum, zcash) return an object where the keys don’t represent resolvable paths (it’s just the parsed representation from whatever underlying library is being used). For example:

// dag-cbor
const cborCid = new CID('zdpuAnhvECCigEM92DCZVwFguAHTSg1yJcPLyPdsJsVTYRuXB')
ipld.get(cborCid, (error, results) => {
  console.log(results)
})

// outputs:
{
  child: { '/': Buffer },
  title: 'A simple example of an IPLD node'
}

// and each key in that object represents a resolvable path:
ipld.get(cborCid, 'title', {localResolve: true}, (error, results) => {
  console.log(results)
})
// outputs:
'A simple example of an IPLD node'

// or
ipld.get(cborCid, 'child', {localResolve: true}, (error, results) => {
  console.log(results)
})
// outputs:
{ '/': Buffer }

// But for ethereum:
const etherumBlock = new CID('z43AaGEvwdfzjrCZ3Sq7DKxdDHrwoaPQDtqF4jfdkNEVTiqGVFW')
ipld.get(etherumBlock, (error, results) => {
  console.log(results)
})

// outputs:
{
  raw: [/* array of buffers */],
  _fields: [
    'parentHash',
    'uncleHash',
    'coinbase',
    'stateRoot',
    'transactionsTrie',
    'receiptTrie',
    'bloom',
    'difficulty',
    'number',
    'gasLimit',
    'gasUsed',
    'timestamp',
    'extraData',
    'mixHash',
    'nonce'
  ],
  toJSON: function() {},
  serialize: function() {},
  parentHash: Buffer,
  uncleHash: Buffer,
  coinbase: Buffer,
  stateRoot: Buffer,
  transactionsTrie: Buffer,
  receiptTrie: Buffer,
  bloom: Buffer,
  difficulty: Buffer,
  number: Buffer,
  gasLimit: Buffer,
  gasUsed: Buffer,
  timestamp: Buffer,
  extraData: Buffer,
  mixHash: Buffer,
  nonce: Buffer
}

// But many of these keys are not resolvable. This fails:
ipld.get(etherumBlock, '_fields', {localResolve: true}, (error, results) => {
  console.log(results)
})

// And this is *missing* valid, resolvable paths. This succeeds, even though `parent` is not in the above object:
ipld.get(etherumBlock, 'parent', {localResolve: true}, (error, results) => {
  console.log(results)
})

After talking with @vmx on Friday, we decided it was important that the object you get back is a reasonably accurate representation of the IPLD conception of the node — that is, the keys in the on the object should also represent paths you can resolve.

We need to make this requirement clear in the README or some other documentation.

Note: this is somewhat related to ipld/ipld#44, but after discussing with @vmx and @mikeal on Friday, I think we can treat it separately.


One other open question is that it might be important to have a way to get the underlying representation (either the raw bytes or the parsed representation from which we created the IPLD-focused representation [like what the Ethereum resolver currently returns]). What would be the best way to do that? We had three ideas:

  1. Create a separate API, like ipld.getRawNode(cid, callback).

  2. Return the raw bytes (and maybe the parsed representation) as special symbol keys so they don’t get serialized, but are still available. e.g:

    ipld.get(etherumBlock, (error, results) => {
      console.log(results)
    })
    // outputs:
    {
      [ipld.rawBytes]: Buffer,
      [ipld.parsedData]: {},  // The object we currently get from ipld.get()
      bloom: Buffer,
      coinbase: Buffer,
      difficulty: 11966502474733,  // A number or BigInt, not just bytes
      parent: CID,  // or {'/': 'hash'} -- see https://github.com/ipld/ipld/issues/44
      // etc.
    }
  3. Return the underlying object (like today) if just getting the CID, but return an IPLD-focused representation if getting the path /. e.g:

    // Returns the same thing as today
    ipld.get(etherumBlock, (error, results) => { console.log(results) })
    
    // Returns a nicer, IPLD-focused representation where the keys are resolvable paths:
    ipld.get(etherumBlock, '/', (error, results) => { console.log(results) })

    This is probably the least desirable because it’s not super clear and very easy to get wrong.

@vmx
Copy link
Member

vmx commented Aug 7, 2018

Thanks @Mr0grog for the detailed analysis. I'd like to add where things come from. The underlying library for js-ipld are the IPLD Formats. There we have a util.deserialize() call, which returns the object with some internal representation (this is what you see with Ethereum above). Then there's also the resolve.resolve() call. That one returns a structured object (like CBOR above) when you pass in / as path (please note that this is in the spec, but not all formats might have been updated yet).

So we could add to the IPLD Format spec that passing in an empty path (null/undefined/'') will return the same as / (although I prefer having a minimal API, so I might just disallow passing an empty path in). I originally thought returning the internal representation when passing in an empty path is a good idea, but I can't recall why. If you want that, you can always use util.deserialize() anyway.

@Mr0grog
Copy link
Author

Mr0grog commented Aug 7, 2018

I guess I might have misunderstood your points from our discussion on Friday. I’m sorry if I wrote things wrong here (it was meant to be a summarization of what we had agreed on).

Then there's also the resolve.resolve() call. That one returns a structured object (like CBOR above) when you pass in / as path (please note that this is in the spec, but not all formats might have been updated yet).

OK, so from that perspective, a big part of the problem in ipld/js-ipld-ethereum#25 is that it violates this part of the spec, right? (I’m not sure about bitcoin and zcash, but they look like they might fail if the path is /.)

I had thought you were talking about the js-ipld interface, not the interface for a resolver when you mentioned the / on Friday. Maybe we were talking about totally different things.

The two concerns I was really trying to get at here are:

  1. What does ipld.get(cid) and ipld.get(cid, path) return? It seemed like we were saying on Friday that it should return an object where all the keys are the resolvable paths. (I’m not worried here about whether we format links as {"/": "<cid string>"} or {"/": "<cid instance>"} or <cid instance> or whatever.)

  2. Are we documenting the IPLD resolver interface correctly to make sure the above happens?

    For example, if ipld.get(cid) should return an object where the keys represent resolvable paths, then is it a bug here in js-ipld that it simply returns the result of resolverForCid.util.deserialize(rawBlock) or should we change the requirements of util.deserialize() to say it should return something else?

We could add to the IPLD Format spec that passing in an empty path (null/undefined/'') will return the same as / (although I prefer having a minimal API, so I might just disallow passing an empty path in).

I do feel like it would be an improvement in the API if the value of the path being resolved didn’t affect the sort of value you expect to be returned, so I think this would be good. That said, I’m mostly just concerned that the interface requirements are clear, so that we can say exactly what is wrong (or if something even is wrong) in a case like ipld/js-ipld-ethereum#25. Right now, it’s not clear whether what’s happening is correct, whether js-ipld has a bug, or whether js-ipld-ethereum has a bug (or both!).

@vmx
Copy link
Member

vmx commented Aug 7, 2018

I had thought you were talking about the js-ipld interface, not the interface for a resolver when you mentioned the / on Friday. Maybe we were talking about totally different things.

I think we talked about the same thing. I just brought it up as I'd like to have this issue solved properly on all levels, not only on the highest level (js-ipld) API.

OK, so from that perspective, a big part of the problem in ipld/js-ipld-ethereum#25 is that it violates this part of the spec, right? (I’m not sure about bitcoin and zcash, but [they look like they might fail if the path is /](https://github.com/ipld/js-ipld-bitcoin/blob/b75bb649a930d2149bb722b46a6eb6bf0fbdd4f8

That is correct. The implementations currently don't follow the spec. This is tracked under ipld/interface-ipld-format#34.

@Mr0grog
Copy link
Author

Mr0grog commented Aug 8, 2018

I think we talked about the same thing. I just brought it up as I'd like to have this issue solved properly on all levels, not only on the highest level (js-ipld) API.

Got it 👍

So between this and what you wrote in ipld/js-ipld-zcash#20, I think this also means we have a bug in the code here — if ipld.get(cid) should return an object where the keys represent resolvable paths, js-ipld is doing the wrong thing when it simply returns the result of resolverForCid.util.deserialize(rawBlock) when there’s no path. Should I make an issue for that?

@vmx
Copy link
Member

vmx commented Aug 8, 2018

@Mr0grog Thanks for looking into the code, I did as well, but somehow misread it. You are right. So yes, please open an issue. To double-check we have the same understanding: The issue is that ipld.get() should always call resolver(), which then will always return a proper object with resolvable paths (once ipld/interface-ipld-format#34 is fixed).

@Mr0grog
Copy link
Author

Mr0grog commented Aug 8, 2018

The issue is that ipld.get() should always call resolver(), which then will always return a proper object with resolvable paths (once ipld/interface-ipld-format#34 is fixed).

👍 yep, we are thinking the same thing here. I opened an issue for that in #142.

@daviddias daviddias added the status/ready Ready to be worked label Aug 25, 2018
@vmx
Copy link
Member

vmx commented May 8, 2019

With version 0.23 get() returns an object with keys that match the resolvable paths. Though get() might have additional methods attached. This means that get() possible returns a superset of a resolve('/') call (or the same).

@vmx vmx closed this as completed May 8, 2019
@ghost ghost removed the status/ready Ready to be worked label May 8, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants