This repository has been archived by the owner on Jan 7, 2023. It is now read-only.
Greg Burns edited this page Feb 14, 2017 · 37 revisions

Welcome to the preview of DPS for IoT

Note: DPS is currently released as a technical preview for evaluation. It is very much a work in progress and is not intended for commercial deployment at this time.

Background

Distributed Publish Subscribe for the Internet of Things, referred to as DPS in this document, is a new protocol that implements the pub/sub (publish/subscribe) communication pattern. The pub/sub pattern for device-to-device communication is simple and powerful. Several existing pub/sub protocols see heavy use in IoT applications, most notably MQTT and DDS, and there are numerous others. Two characteristics of pub/sub make it attractive for IoT use cases: support for loose coupling between publishers and subscribers, and inherent support for point-to-multipoint messaging. There are generally two implementation approaches: brokered (e.g. MQTT) or multicast (e.g. DDS). In a brokered pub/sub system, publishers and subscribers connect to a centralized server that routes publications to matching subscribers. In a multicast pub/sub system, subscribers receive messages from all publishers and selectively forward matching publications up to the application. The disadvantages of the brokered approach are that the broker is a single point of failure, must be 100% available, and caps the scale of the system at the bandwidth and processing capability of the broker. In addition, every message makes a round trip through the broker, which puts a lower bound on communication latency. Multicast pub/sub systems are hard to scale beyond a single subnet and work much better over wired than wireless networks.

Distributed Publish Subscribe, as the name implies, is a fully distributed pub/sub framework. There is no broker. Devices or applications running the DPS protocol (we will just call them nodes) form a dynamic, multiply-connected mesh in which each node functions as a message router. The DPS framework supports a topic string syntax that will be very familiar to MQTT users and also supports MQTT-like retained messages. The mesh is bootstrapped using IP multicast, a directory service, or an explicit URL. The DPS protocol is lightweight and amenable to implementation on very small devices such as sensors that primarily publish data. The DPS architecture is well suited for applications that leverage edge computing in combination with cloud-based analytics.

Overview

Superficially, DPS looks like a broker-based pub/sub protocol. Some of this is intentional, such as the use of MQTT’s topic string wildcard syntax, but the architecture is quite different. In a brokered pub/sub system, publishers and subscribers typically maintain a long-term connection to the broker. This is often necessary because the broker runs in the cloud while the subscribers and publishers typically run behind a firewall, possibly NAT’d, and must establish an outbound connection to the broker to be able to communicate. DPS does not maintain long-term connections. In fact, connections only last long enough to send a single subscription or publication message. DPS uses hop-by-hop routing to forward publications to subscribers in the network. A DPS node with multiple network interfaces can forward pub/sub messages from one interface to another, so there is no need for an end-to-end network route. In a conventional pub/sub system, publishers and subscribers send topic strings to the broker, and the broker can essentially see every topic that passes through it as clear text. In theory the individual elements in topic strings could be sent as hashes, but that is not done currently. In DPS all publications and subscriptions are implicitly hashed, and a node only routes publications to nodes that have matching subscribers, so there is typically no single point through which all messages pass.

Topic Strings

DPS, like other pub/sub protocols, expresses publications and subscriptions as structured text strings called topic strings. A topic string is a sequence of substrings delimited by standalone separator characters. In DPS almost any character or set of characters that the publisher and subscriber agree on can be used as a separator. A publication matches a subscription if the substrings and separators in the publication are the same as the substrings and separators in the subscription. Subscription topic strings can also include wildcard characters as described below. These are all valid publication topic strings:

    foo/bar
    x,y,z
    1.2.3
    a/b/c?val=5

In the last example “/”, “?”, and “=” are separators. Separators must stand alone: two or more consecutive separators are disallowed. Subscription topic strings have the same form as publication topic strings but can include wildcard characters. DPS uses the same wildcard characters as MQTT with the same meanings: the plus sign “+” matches any substring in the same position; the hash or pound sign “#” matches any number of trailing substrings. In DPS “+” and “#” are currently the only reserved characters. These are some valid wildcarded subscription topic strings:

    +/bar
    x,+,z
    1.#
    a/b/c?val=+
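The matching rules above can be sketched in a few lines of Python. This is a minimal illustration and not the DPS implementation: the function names `tokenize` and `matches` are hypothetical, the separator set is passed in explicitly, and the sketch assumes that “#” must be the last element and stands for one or more trailing substrings, and that empty substrings (which is what consecutive, leading, or trailing separators would produce) are rejected.

```python
import re


def tokenize(topic, separators="/"):
    """Split a topic string into alternating substrings and separators.

    Hypothetical helper: `separators` is the set of separator characters
    the publisher and subscriber have agreed on.
    """
    pattern = "([" + re.escape(separators) + "])"
    tokens = re.split(pattern, topic)
    # Even positions hold substrings, odd positions hold separators.
    # An empty substring means two separators were adjacent, or the
    # topic started or ended with a separator (rejected in this sketch).
    if any(tok == "" for tok in tokens[0::2]):
        raise ValueError(f"empty substring in topic string: {topic!r}")
    return tokens


def matches(subscription, publication, separators="/"):
    """Return True if the subscription topic matches the publication topic."""
    sub = tokenize(subscription, separators)
    pub = tokenize(publication, separators)
    for i, tok in enumerate(sub):
        if i >= len(pub):
            return False              # subscription is longer than publication
        if i % 2 == 1:
            if tok != pub[i]:
                return False          # separators must be identical
        elif tok == "#":
            # Assumption: "#" is only valid as the final element and
            # consumes everything that remains in the publication.
            return i == len(sub) - 1
        elif tok != "+" and tok != pub[i]:
            return False              # literal substrings must be identical
    return len(sub) == len(pub)


if __name__ == "__main__":
    print(matches("+/bar", "foo/bar"))
    print(matches("1.#", "1.2.3", separators="."))
    print(matches("a/b/c?val=+", "a/b/c?val=5", separators="/?="))
```

Passing the separators explicitly reflects the statement that they are an agreement between publisher and subscriber rather than a fixed part of the syntax.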

In MQTT and other pub/sub protocols a subscription or publication is a single topic string. A unique feature of DPS is that subscriptions and publications can have multiple topic strings. A subscription with more than one topic string will only match publications that have matching topic strings for all of the topic strings in the subscription. As an example of how this might be used, consider a set of devices that each publish one topic string describing the device type and another describing the physical location of the device. An application could subscribe to all devices at a specific location by specifying only the location topic string, to all devices of a specific type by specifying only the device type, or home in on a device of a specific type at a specific location by using both topic strings in the same subscription. Another unique feature of DPS is publisher control over the kinds of wildcard matches a subscriber is permitted to use. For example, a publisher can decide that wildcard subscriptions must fully specify at least the first N elements in order to match. This gives a publisher control over wide-open wildcard subscriptions such as “+/#”, the most generic form allowed by DPS, which will match any publication with two or more elements.
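A rough sketch of both features follows: the "all subscription topics must match" rule and a publisher-imposed minimum number of fully specified leading elements. Everything here is illustrative. The names `topic_matches`, `leading_specified`, `subscription_matches`, and the `min_specified` parameter are hypothetical, “/” is assumed to be the only separator, and “#” is assumed to stand for one or more trailing elements.

```python
def topic_matches(sub, pub):
    """Match one subscription topic against one publication topic ("/" only)."""
    s, p = sub.split("/"), pub.split("/")
    for i, elem in enumerate(s):
        if i >= len(p):
            return False
        if elem == "#":
            return i == len(s) - 1    # "#" must be last; takes the rest
        if elem != "+" and elem != p[i]:
            return False
    return len(s) == len(p)


def leading_specified(elements):
    """Count the leading elements that are not wildcards."""
    count = 0
    for elem in elements:
        if elem in ("+", "#"):
            break
        count += 1
    return count


def subscription_matches(sub_topics, pub_topics, min_specified=0):
    """True if every subscription topic matches some publication topic.

    `min_specified` models the publisher's rule: a wildcard subscription
    topic must spell out at least that many leading elements.
    """
    for sub in sub_topics:
        elements = sub.split("/")
        is_wildcard = "+" in elements or "#" in elements
        if is_wildcard and leading_specified(elements) < min_specified:
            return False
        if not any(topic_matches(sub, pub) for pub in pub_topics):
            return False
    return True


if __name__ == "__main__":
    publication = ["type/thermostat", "location/building1/floor2"]
    print(subscription_matches(["location/building1/#"], publication))
    print(subscription_matches(["type/thermostat", "location/building9/#"], publication))
    print(subscription_matches(["+/#"], publication, min_specified=1))
```

In this sketch the minimum applies only to subscription topics that contain a wildcard, which is one reading of the rule described above.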

DPS Mesh Network

A DPS network is a loosely connected collection of network nodes (devices) that function as subscribers, publishers, both, or neither. Loosely connected means that the nodes hold state information about other nodes but do not necessarily maintain network connectivity to them. There are no predesignated roles: any node can publish topics or subscribe to topics. DPS nodes build and manage routing tables based on the subscription topics and forward publications hop-by-hop from publishers to subscribers. Nodes with multiple network interfaces will automatically route publications between networks. Nodes can send publications on a local subnet using IP multicast, can be manually configured to connect to other nodes, or can use a directory service to locate other nodes. Publication routing is independent of the network-level connections: so long as there is at least one network path, DPS will route publications to all subscribing nodes.

The picture below shows a mesh of subscriber and publisher nodes.

[Figure: DPS Mesh]

Subscriptions flood throughout the network and can be forwarded in either direction over a link. There may also be multiple routes between two nodes. Publications only flow on routes that have matching subscriptions. Multicast publications are unsolicited and will be received by all nodes on the same subnet that are configured as multicast listeners. Unicast publications are only forwarded if the publication matches the subscriptions. In the steady state a publication reaching any node will be routed to all matching subscribers.
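The flow described above can be illustrated with a toy in-memory model. It is only a sketch under simplifying assumptions and does not reflect how DPS stores or exchanges routing state: the `Mesh` class and its methods are hypothetical, topics are matched by exact string equality, each subscription records only the first route found to each node, and all links must exist before a node subscribes.

```python
from collections import defaultdict, deque


class Mesh:
    """Toy model of subscription flooding and hop-by-hop publication routing."""

    def __init__(self):
        self.links = defaultdict(set)        # node -> neighbouring nodes
        self.local_subs = defaultdict(set)   # node -> topics subscribed locally
        # node -> neighbour -> topics that are wanted somewhere via that neighbour
        self.interest = defaultdict(lambda: defaultdict(set))

    def link(self, a, b):
        self.links[a].add(b)
        self.links[b].add(a)

    def subscribe(self, node, topic):
        """Flood the subscription outward so every node learns a next hop."""
        self.local_subs[node].add(topic)
        visited = {node}
        queue = deque([node])
        while queue:
            current = queue.popleft()
            for neighbour in self.links[current]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    # `neighbour` learns that `topic` is wanted via `current`.
                    self.interest[neighbour][current].add(topic)
                    queue.append(neighbour)

    def publish(self, node, topic):
        """Forward only along links with a matching subscription.

        Returns the set of nodes that received the publication as
        subscribers and the list of (from, to) hops that were used.
        """
        delivered, hops = set(), []
        seen = {node}                        # guards against loops in the mesh
        queue = deque([node])
        while queue:
            current = queue.popleft()
            if topic in self.local_subs[current]:
                delivered.add(current)
            for neighbour, topics in self.interest[current].items():
                if topic in topics and neighbour not in seen:
                    seen.add(neighbour)
                    hops.append((current, neighbour))
                    queue.append(neighbour)
        return delivered, hops


if __name__ == "__main__":
    mesh = Mesh()
    for a, b in [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]:
        mesh.link(a, b)
    mesh.subscribe("D", "temp")
    print(mesh.publish("A", "temp"))
    print(mesh.publish("A", "humidity"))
```

In the usage example node E has no subscriber behind it, so the publication from A never travels over the B to E link, and a publication with no matching subscription is dropped at the first hop.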

Message Types and Flow

DPS has three message types: subscriptions, publications, and acknowledgements. Subscriptions and publications are both inherently point-to-multipoint. An explicit assumption is that in IoT use cases there are many more publishers than subscribers, and that publications are sent fairly frequently while subscriptions are relatively stable, that is, they do not change frequently. To a large extent DPS has been designed around these assumptions: DPS would be good for implementing an IoT network with a large number of sensors but is not ideal for implementing a highly scalable peer-to-peer chat service. When a subscription does change, only deltas for the subscription propagate through the network, which is typically less than 100 bytes. Publications are routed to all subscribers that have subscription topics that match the publication topics as described above.

Publications and acknowledgements can be accompanied by a payload; subscriptions do not carry a payload. Acknowledgements are optional and must be explicitly requested by the publisher when sending a publication. A subscriber can send an acknowledgement to the publisher along with an optional payload. The acknowledgement reaches the publisher by hop-by-hop forwarding along the reverse path of the publication. The reverse path ages out fairly quickly, so acknowledgements should be sent as soon as possible after receipt of a publication. A publisher may receive multiple acknowledgements if there are multiple subscribers.

Data Series

DPS has built-in support for data series. In addition to topic strings, every publication has a UUID and a serial number. Publications with the same UUID (and topic strings) form a series, and the serial number is incremented each time the publication is sent to the network. The UUID and serial number are available to the receiving subscribers.
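As a sketch of how a series might look from both ends, the following models a publication that keeps its UUID and bumps its serial number on every send, and a subscriber-side tracker that uses the two values to notice missed or out-of-date messages. The `Publication` and `SeriesTracker` classes, the message dictionary layout, and the choice to start serial numbers at 1 are all assumptions made for illustration and are not taken from the DPS API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Publication:
    """A publication whose sends form a series (hypothetical model)."""
    topics: tuple
    pub_id: uuid.UUID = field(default_factory=uuid.uuid4)
    serial: int = 0

    def send(self, payload):
        # Same UUID and topics every time; only the serial number advances.
        self.serial += 1
        return {
            "uuid": self.pub_id,
            "serial": self.serial,
            "topics": self.topics,
            "payload": payload,
        }


class SeriesTracker:
    """Subscriber-side bookkeeping keyed by publication UUID."""

    def __init__(self):
        self._last_serial = {}

    def receive(self, message):
        previous = self._last_serial.get(message["uuid"])
        if previous is not None and message["serial"] <= previous:
            return "stale"            # already seen this or a newer message
        self._last_serial[message["uuid"]] = message["serial"]
        if previous is not None and message["serial"] > previous + 1:
            return "gap"              # at least one message was missed
        return "in-order"


if __name__ == "__main__":
    temperature = Publication(topics=("sensor/temperature",))
    tracker = SeriesTracker()
    first = temperature.send(b"20.5")
    second = temperature.send(b"20.7")
    third = temperature.send(b"20.9")
    print(tracker.receive(first), tracker.receive(third), tracker.receive(second))
```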

Retained Publications

If there are no subscribers for the topic strings in a publication, the publication will be discarded, typically at the first hop. There are use cases where it is desirable for a publication to be held for later delivery. MQTT has a feature that supports this use case: if an MQTT publication is flagged as “retained”, the MQTT broker holds onto the publication until a subscriber is available to consume the message. DPS implements a similar feature in which a TTL (time-to-live) can be set on a publication, which is then held for delivery until the TTL expires. For a publication to be retained it must match a subscription somewhere in the network. This allows specific nodes to take on responsibility for retaining matching publications. A retained publication can be replaced if the publisher sends a new publication having the same UUID but a later serial number. A retained publication can be explicitly expired by sending a new publication with the same UUID and a negative TTL.

Retained publications provide support for sleepy nodes, that is, nodes that are only periodically active on the network. For example, a wireless sensor node publishes a telemetry reading in the payload of a retained publication with a ten-minute TTL, then drops into low-power mode, waking within ten minutes to deliver the next reading. The most recent telemetry will be available to subscribers no matter when they join the network. Similarly, a sleepy subscriber can periodically connect to the network to receive publication updates.
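The retention rules in this section (hold until the TTL expires, replace on a later serial number, expire on a negative TTL) can be sketched as a small store that a retaining node might keep. This is an illustrative model only: `RetainedStore` and its method names are hypothetical, TTLs are taken to be in seconds, a TTL of zero is treated as an ordinary publication that is not retained, and the clock is injectable so the behaviour can be exercised without waiting.

```python
import time


class RetainedStore:
    """Holds at most one retained publication per UUID (hypothetical model)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._held = {}   # uuid -> (serial, expiry time, payload)

    def _current(self, pub_id):
        """Return the held entry, dropping it first if its TTL has run out."""
        held = self._held.get(pub_id)
        if held is not None and self._clock() >= held[1]:
            del self._held[pub_id]
            return None
        return held

    def publish(self, pub_id, serial, ttl, payload=b""):
        if ttl < 0:
            # A negative TTL explicitly expires the retained publication.
            self._held.pop(pub_id, None)
            return
        if ttl == 0:
            return        # assumption: not retained, nothing to store
        held = self._current(pub_id)
        if held is None or serial > held[0]:
            # Only a later serial number replaces what is already held.
            self._held[pub_id] = (serial, self._clock() + ttl, payload)

    def get(self, pub_id):
        """Return (serial, payload) for a late-joining subscriber, or None."""
        held = self._current(pub_id)
        return None if held is None else (held[0], held[2])


if __name__ == "__main__":
    store = RetainedStore()
    store.publish("sensor-1", serial=1, ttl=600, payload=b"20.5C")
    store.publish("sensor-1", serial=2, ttl=600, payload=b"20.7C")
    print(store.get("sensor-1"))
    store.publish("sensor-1", serial=3, ttl=-1)
    print(store.get("sensor-1"))
```

A sleepy sensor in this model would call `publish` with a 600 second TTL on each wake-up, and a subscriber joining at any point in between would still find the latest reading through `get`.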

Security

DPS has two security mechanisms: end-to-end encryption and link-layer encryption. Publication and acknowledgement messages can be encrypted end-to-end. The current implementation supports AES-CCM for message integrity checks over the body and payload and encryption of the payload. The format for encrypted messages is the COSE (CBOR Object Signing and Encryption) Encrypt0 format for authenticated and encrypted data as described in Internet draft cose-msg-24.

Note: Security is not fully implemented in the current preview release. Specifically, only pre-shared keys are supported for end-to-end encryption, and link-layer encryption is not yet implemented.

For privacy and to protect the hop-by-hop network links, DPS uses link-layer encryption where possible: DTLS for UDP, TLS for TCP, and transport-specific encryption for non-IP networks. IP multicast packets do not use link-layer encryption. Because DPS is fundamentally a multi-hop mesh protocol, payloads are secured end-to-end.

As noted above, publication and acknowledgement messages can be encrypted using COSE. The encryption in this case is end-to-end: the trust relationship is directly between the publishing node and the subscribing node. Intermediate nodes that simply route packets over the DPS mesh do not require a trust relationship with either the publishers or the subscribers and do not hold decryption keys. The publication or acknowledgement data payload is encrypted. Publications also carry the publication topic strings, which are included in the encrypted payload. Message fields required for routing are integrity checked but cannot be encrypted. The sending port number and TTL change on each hop, so they cannot be included in the end-to-end integrity check. The COSE format includes an optional KID (key identifier) field. Encrypted DPS messages always have a 16-byte KID, which allows different publications to use different encryption keys.

Subscriptions do not carry payload data, and the bit vectors they carry are recomputed at each hop. Unlike publications and acknowledgements, subscriptions do not use COSE and rely solely on encryption at the link layer. Even when data is encrypted there are attack vectors based on traffic analysis. Traffic analysis is harder on a multi-hop mesh, but it remains possible because publications contain routing information that cannot be encrypted end-to-end. Specifically, an attacker with access to a compromised node could use a dictionary attack to identify the topic strings in the publications and subscriptions flowing through that node. Where this is a concern, the publishing and subscribing endpoints can agree on a shared private encoding of topic strings, for example HMAC with a shared seed.
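One way such a private encoding could look is sketched below using Python's standard `hmac` and `hashlib` modules. It is an application-level illustration, not a DPS feature: the `encode_topic` function is hypothetical, HMAC-SHA256 truncated to 16 hex characters is an arbitrary choice made for brevity, “/” is assumed to be the only separator, and each element is encoded separately so that the “+” and “#” wildcards keep working on the encoded form.

```python
import hashlib
import hmac


def encode_topic(topic, key, separator="/"):
    """Replace each topic element with a keyed digest, keeping wildcards.

    `key` is the secret shared by the publishing and subscribing
    endpoints. Intermediate nodes see only the digests.
    """
    encoded = []
    for element in topic.split(separator):
        if element in ("+", "#"):
            encoded.append(element)   # wildcards must stay recognisable
        else:
            digest = hmac.new(key, element.encode("utf-8"), hashlib.sha256)
            # Hex output never contains the separator or a wildcard character.
            encoded.append(digest.hexdigest()[:16])
    return separator.join(encoded)


if __name__ == "__main__":
    shared_key = b"example shared secret"
    print(encode_topic("building1/floor2/temperature", shared_key))
    print(encode_topic("building1/+/temperature", shared_key))
```

Because both endpoints derive the same digest for the same element, an encoded subscription still matches an encoded publication, while a node without the key cannot confirm a dictionary guess. A design that also mixed the element position into the digest would hide repeated elements, at the cost of a slightly more involved encoding.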

Contents of Preview Release

The preview release is a fully functional implementation of the DPS framework as described above, implemented in C. There is relatively complete documentation of the C APIs. For portability the implementation uses the libuv asynchronous I/O library and currently builds and runs on Linux and Windows. Language bindings for Python and Node.js are generated using SWIG. There is sample code for each of the currently supported languages.

DPS Docs
