Design discussion #1

Closed
bigs opened this issue Sep 10, 2018 · 15 comments

@bigs
Contributor

bigs commented Sep 10, 2018

Opening a thread for the discussion of the control API for our daemon.

Control API

Responsibilities

  • Peerstore management
  • Stream creation (connect if connection does not exist)
  • General information retrieval
    • Open connections
    • Open streams
    • Peerstore size
  • Register protocol handlers

Implementation details

Two solid options:

  • HTTP/JSON
    • Requires polling for incoming connections
  • JSON over TCP
    • Bidirectional, so we can push incoming stream notices to users

Stream Proxy

Responsibilities

  • Proxy streams to end users, taking care of all security (secio, TLS) and multiplexing (yamux, etc)
  • Clients should be able to close streams (i.e. close a file handle)
  • Incoming streams on registered protocols should create new streams

Implementation details

Editor's note: I think we can get away with polling the filesystem/shared memory where our streams are created, as opposed to polling the control API, which would make things simpler.

  • Unix sockets
    • One socket per stream
    • Organized on filesystem by connection id
    • Seems Windows support has landed
  • shmem
    • Likely fastest implementation
    • Certainly most complex, least platform agnostic
      • There exists a Windows alternative, but it has a separate API. Perhaps something to be dealt with in a golang library
  • Proxy Filesystems (FUSE, WinFSP)
    • Heavier than unix sockets
@bigs
Contributor Author

bigs commented Sep 10, 2018

i'm feeling pretty good about an HTTP/JSON API with a unix-socket-based stream manager, where the client polls the configured directories in the filesystem for new incoming sockets

@vyzo
Collaborator

vyzo commented Sep 11, 2018

Per discussion with bigs on zoom:

  • We discussed a symmetric unix domain socket stream protocol
    • the daemon listens to a unix socket, and the client initiates streams by opening connection and issuing a protocol header
    • the client listens to a unix socket to provide stream handlers; the daemon connects back on stream open and issues a protocol header
  • The daemon exports the control interface through HTTP/JSON, as it is the simplest and most flexible approach to begin with. We can later implement a binary control protocol over a unix socket as well.

The protocol header must contain at minimum:

  • a disambiguator, to allow later implementing the control protocol multiplexed in the daemon
  • the peer ID and multiaddr
  • the protocol for the stream

We settled on using delimited protobuf for the protocol header, as this saves the need to write custom serializers.

@Stebalien
Member

Before going into design considerations too much, let's flesh out our motivations so we get on the same page: #3

Notes on the current discussion:

  • Serialization Format: I'd like to make a somewhat bold proposal: no JSON. JSON is lacking a "bytes" type and this has caused no end of trouble. How do you feel about mandating CBOR?
  • Multi-Tenant from day 1: We'd like much of this daemon to eventually move to the kernel (or, at least, a system daemon) and it turns out that adding multi-user support later is rather tricky.
  • 100% libp2p: I'd like to at least entertain the idea of going 100% libp2p; that is, no HTTP API. This would make everything we do network transparent (in theory). We may need to have some service expose an HTTP API for simplicity but we should at least consider a micro-kernel like architecture where that's a separate daemon.

@vyzo
Collaborator

vyzo commented Sep 12, 2018

If we are going JSON-less, let's not add another format -- we can use protobufs for the control protocol and multiplex in the UNIX socket.

@vyzo
Collaborator

vyzo commented Sep 12, 2018

For multi-tenant applications we might have the issue of who's handling the streams -- there can be only a single stream handler for each protocol.
I think it makes more sense for each application to run its own daemon; the application can also be composed of multiple processes.

@raulk
Member

raulk commented Sep 12, 2018

Couple of points here:

  1. Another way to view what we're doing is stream virtualisation.

  2. Supposing that the daemon is exposing streams over unix sockets and SHM, we should publish lightweight client bindings/protocol SDKs in different languages. We really don't want users (re)implementing the plumbing across the board to attach their apps to our local, virtualised libp2p transports.

  3. I echo @Stebalien's third point about keeping it 100% libp2p. I'd advocate for the daemon to be started with the listen multiaddr for the control plane, one that only accepts local transports, e.g. --listen /unixsocket/[/var/unix/...], along with an option to enable the --http layer, outputting a warning that the interface is only for development/testing/admin.

For the multi-tenant mode, the master control plane could accept only two commands: new, attach.

  • With new an app starts a new session, does an encryption handshake, and receives a private socket/shm assignment for its app control plane (which is encrypted for that app only), along with a token to re-attach in the future.
  • With attach, an existing app could reattach by providing the token.

Just some initial brainstorming, really.

@vyzo
Collaborator

vyzo commented Sep 12, 2018

I think that requiring clients to implement yamux/secio is a huge burden for bindings implementors (speaking as one :)

@raulk
Member

raulk commented Sep 12, 2018

Agree. Bindings should not perform multiplexing, that's precisely what the daemon does for them, tunnelling each stream onto a physical mapping atop a local transport. So there'd be a 1:1 mapping between a local resource (e.g. shm, socket) and a backing stream.

Regarding secio, if we want multitenancy and isolation in the future, I guess we'll need an encryption mechanism to avoid apps cross-reading streams. But yeah, that complicates binding implementations. Alternatively, we could leverage OS resources like cgroups to provide the isolation.

@Stebalien
Member

I'd advocate for the daemon to be started with the listen multiaddr for the control plane, that only accepts local transports

Technically, we don't even need to do that as long as we can whitelist (although we may want to anyways).

I think that requiring clients to implement yamux/secio is a huge burden for bindings implementors (speaking as one :)

My thinking here is that we'd use a super special local-only transport. That is, we build a SHM/unix domain socket transport that does 90% of the work on the server side. We could even have multiple: a simple one that uses a new file descriptor per stream and a fancy one that uses memory mapping and a single socket for control information. We also don't need to do any secio/encryption as it's all local and privacy/authentication can be enforced by the kernel. The real tricky part here would be key management because we currently expect all "peers" to be identified by a public key. We could do an authentication round using signatures but that feels like overkill.

The only tricky part here would be peer IDs but, in theory,

@vyzo
Collaborator

vyzo commented Sep 13, 2018

Let's not have so many words and no code!
Initial implementation: #4

@raulk raulk mentioned this issue Sep 25, 2018
@cheatfate

I'm sorry, but is an SHM transport really needed?
In a situation where

client -> shm -> libp2p-daemon -> network

the libp2p-daemon becomes a bottleneck. SHM would let clients send data much faster than the libp2p-daemon can handle and forward to the network, because the network is much slower than SHM.

In that case the libp2p-daemon would need to hold large buffers to keep all the incoming packets, or block clients until it is able to send the data over the network.

@bigs
Contributor Author

bigs commented Oct 5, 2018

@cheatfate the spec has actually been formalized (in its current state) in SPEC.md. shm was thrown around as a more direct, efficient method of IPC. we're not approaching it at the moment.

@raulk
Member

raulk commented Oct 5, 2018

Yeah, as @bigs says SHM is just on the radar, but not an immediate priority. We're aware of the complexity, and it'll warrant much experimentation.

I think your remark boils down to needing a mechanism for backpressure. Unix domain sockets inherently provide this (I believe). With SHM, it'll need to be part of the protocol agreed between both processes.

I have a lot of investigating to do before I can provide better answers, but for now SHM is just in the wishlist ;-)

@bigs
Contributor Author

bigs commented Oct 5, 2018 via email

@bigs
Contributor Author

bigs commented Oct 16, 2018

at this point, implementation details are starting to settle a bit. i'm going to close this conversation for now.
