Substrate Runtime Interface #2334

pepyakin · 2019-04-20T11:21:10Z

The current substrate runtime API is not the easiest to work with:

We have a couple of different runtime environments (with_std/without_std, or rather, statically linked native code and the wasm runtime); however, the interface should be the same.
Error handling might differ between environments.
Implementation and interface are located in different places, but they are really tightly coupled, and I mean it: say, a resulting pointer from the runtime API must have a certain alignment, or we depend on that in unsafe code. TBH I won't be surprised if there is broken unsafe code out there, since it is so easy to break, so hard to check, and we don't really treat this code as unsafe (e.g. there are no proofs of safety).
These unsafe invariants are usually not stated anywhere; they are just in the code. There is also a lack of documentation, and the reason might be that it is just not clear where this documentation belongs.
There is this weird behavior of impl_function_executor, which allows you to declare parameters such as usize which, however, will have the u32 type. This might be baffling for people new to this code. There was a PR that introduced an honest Rust interface, but unfortunately it has a downside: there is a desire for signatures of host functions to be trivially copyable without any changes, to reduce error-proneness.
There are some differences and pitfalls to keep in mind when designing Substrate Runtime APIs. For example, multiple return values are not supported, so if there is a need to return multiple values, it has to be worked around somehow. And yes, there are some inconsistencies in the current implementation.
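As an illustration of the multiple-returns pitfall above, here is a minimal sketch of one possible workaround: packing a pointer and a length into a single 64-bit value. The function names and the bit layout (length in the high half) are assumptions made for this example, not a description of what the current implementation does.

```rust
/// Pack a (pointer, length) pair into one u64 so that it can be returned
/// through an interface that supports only a single return value.
/// Layout (an assumption of this sketch): high 32 bits = len, low 32 bits = ptr.
fn pack_ptr_len(ptr: u32, len: u32) -> u64 {
    (u64::from(len) << 32) | u64::from(ptr)
}

/// Inverse of `pack_ptr_len`: recover the (pointer, length) pair.
fn unpack_ptr_len(packed: u64) -> (u32, u32) {
    let ptr = (packed & 0xFFFF_FFFF) as u32;
    let len = (packed >> 32) as u32;
    (ptr, len)
}
```

Both sides of the boundary must agree on the layout, which is exactly the kind of implicit invariant that is easy to get wrong when it is written by hand in several places.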

Basically, this is very error-prone and boilerplate-heavy code. This sounds like a good use case for code generation. Here is my strawman proposal for such a code generator:

We could introduce a special AST that describes an interface between the substrate runtime and the substrate host. It would support rather high-level types, e.g.: bytes_vec/bytes, bool, (T, J) (a tuple), *T (a raw pointer), [T; N] for parameters and return values. We could also declare if a function traps, maybe with a type Result or a special annotation.

Here is an example of how it could look:

# Not a rust file

# usize is deliberately not supported, use u32.
fn malloc(size: u32) -> *const u8;
fn free(ptr: *const u8);

# `bytes` is converted to `Vec<u8>` on the substrate host side, and `&[u8]` on the rust side.
fn storage_exists(key: bytes) -> bool;

# `bytes?` - notation for an optional value (denoting the absence of a value under the given key)
# note that we don't deal with problems such as how to return a bytes vector (which has two components, `ptr` and `len`) via wasm: all such ABI details will be handled by code generation.
# note that we also don't need to care about details such as creating slices with `from_raw_parts` and upholding all its invariants.
fn child_storage(storage_key: bytes, key: bytes) -> bytes?;

etc.
(Note that a new language is not necessary for this. We only need a model definition, which could be a Rust expression creating the model struct, or it could be YAML.)
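To make the "Rust expression creating the model struct" option concrete, here is a minimal sketch of what such a model could look like for the four functions above. The type names `Ty` and `Function` and the function `model` are hypothetical and exist only for this illustration.

```rust
/// High-level types the interface model supports (hypothetical subset).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    U8,
    U32,
    Bool,
    /// A byte buffer; its ABI representation is left to the generator.
    Bytes,
    /// `T?` in the notation above.
    Optional(Box<Ty>),
    /// `*const T` in the notation above.
    ConstPtr(Box<Ty>),
}

/// One function of the runtime interface.
#[derive(Debug, Clone, PartialEq)]
struct Function {
    name: &'static str,
    params: Vec<(&'static str, Ty)>,
    ret: Option<Ty>,
}

/// The model itself: a plain Rust expression that a build.rs could consume.
fn model() -> Vec<Function> {
    vec![
        Function {
            name: "malloc",
            params: vec![("size", Ty::U32)],
            ret: Some(Ty::ConstPtr(Box::new(Ty::U8))),
        },
        Function {
            name: "free",
            params: vec![("ptr", Ty::ConstPtr(Box::new(Ty::U8)))],
            ret: None,
        },
        Function {
            name: "storage_exists",
            params: vec![("key", Ty::Bytes)],
            ret: Some(Ty::Bool),
        },
        Function {
            name: "child_storage",
            params: vec![("storage_key", Ty::Bytes), ("key", Ty::Bytes)],
            ret: Some(Ty::Optional(Box::new(Ty::Bytes))),
        },
    ]
}
```

Note that `usize` has no counterpart in `Ty`, which mirrors the deliberate restriction in the example above.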

Having this model, we can use some build.rs code generation for building definitions for several crates, such as:

sri-guest - declarations of every API function for use from the runtime side (hence guest). Probably has without_std and with_std versions generated. Supersedes most of sr-io, the allocator part of sr-std, and the externs part of srml-sandbox.
sri-host-wasmi - glue code that dispatches a call to Externals to the appropriate function in some trait. Supersedes the current impl_function_executor. This trait would be implemented in today's wasm_executor.rs.
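A rough sketch of what the generator for the guest side might do: lower each high-level parameter into the primitive values that actually cross the wasm boundary and emit a declaration as text. Everything here is an assumption made for illustration: the `ext_` prefix, the choice to lower `bytes` to a `ptr`/`len` pair of `u32`, the choice to return `bytes` as a packed `u64`, and the helper names.

```rust
/// Hypothetical subset of the model's types, enough for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    U32,
    Bool,
    Bytes,
}

/// Lower one high-level parameter into the low-level parameters that
/// cross the boundary. `bytes` becomes a (ptr, len) pair.
fn lower_param(name: &str, ty: Ty) -> Vec<String> {
    match ty {
        Ty::U32 | Ty::Bool => vec![format!("{}: u32", name)],
        Ty::Bytes => vec![
            format!("{}_ptr: u32", name),
            format!("{}_len: u32", name),
        ],
    }
}

/// Render the guest-side declaration for one function of the model.
fn guest_decl(name: &str, params: &[(&str, Ty)], ret: Option<Ty>) -> String {
    let lowered: Vec<String> = params
        .iter()
        .flat_map(|(n, t)| lower_param(n, *t))
        .collect();
    let ret = match ret {
        None => String::new(),
        Some(Ty::U32) | Some(Ty::Bool) => " -> u32".to_string(),
        // Assumption: a returned buffer is encoded as one packed 64-bit value.
        Some(Ty::Bytes) => " -> u64".to_string(),
    };
    format!("fn ext_{}({}){};", name, lowered.join(", "), ret)
}
```

For `storage_exists(key: bytes) -> bool` this yields `fn ext_storage_exists(key_ptr: u32, key_len: u32) -> u32;`. A real generator would more likely emit tokens than strings, but the lowering step is the same idea.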

Code generation gives a lot of benefits, here are some:

As was said, it removes duplication and decreases error-proneness.
The API is in one place, so it could be more consistent.
There is one single place for documentation of the Substrate Runtime Interface.
All quirks of the implementation are implemented only once per type, not for every case.
Definitions could potentially be used by other implementations of the Substrate Runtime Interface.
We would be able to experiment more easily. For example, we could benchmark different ways of passing values, see what works best, etc.
It would let us decouple from wasmi and integrate other engines much more easily. The thing is, we most likely want to have a couple of wasm engines at the same time (i.e. we want to have compilers for performance reasons, but we also want to have wasmi because it is super robust). So we could just generate different boilerplate for calling some trait methods and dispatch them differently depending on the engine (Externals for wasmi, potentially extern "C" fn for others).
We would be able to easily change our wasm ABI: there is a proposal for multi-value returns in wasm. Code generation would make it easier to migrate to this.
It might give us the ability to introduce versioning (akin to the impl_runtime_apis macro, but the other way around? : ) )
I might be wrong, but maybe it will provide us with an easier way to implement cumulus and maybe make this code a bit cleaner.
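The engine-decoupling point can be sketched as follows: the host implements one trait, and each engine gets its own generated glue that decodes arguments and then calls into that trait. The names `HostApi`, `Call`, `dispatch`, and `InMemoryHost` are hypothetical, and the `Call` enum stands in for whatever already-decoded arguments an engine-specific layer would hand over.

```rust
/// Hypothetical trait that all generated glue code would call into.
trait HostApi {
    fn storage_exists(&mut self, key: &[u8]) -> bool;
}

/// An already-decoded host call. Engine-specific glue (for wasmi, or an
/// extern "C" shim for a compiler-based engine) would construct this.
enum Call<'a> {
    StorageExists { key: &'a [u8] },
}

/// Engine-independent dispatch: map a decoded call onto the trait and
/// encode the result as the single u32 that goes back to the guest.
fn dispatch<H: HostApi>(host: &mut H, call: Call<'_>) -> u32 {
    match call {
        Call::StorageExists { key } => host.storage_exists(key) as u32,
    }
}

/// Toy host used only to exercise the dispatcher.
struct InMemoryHost {
    keys: Vec<Vec<u8>>,
}

impl HostApi for InMemoryHost {
    fn storage_exists(&mut self, key: &[u8]) -> bool {
        self.keys.iter().any(|k| k.as_slice() == key)
    }
}
```

With this split, swapping the engine changes only how a `Call` is produced, not how the host functions are implemented.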


pepyakin · 2019-05-20T11:53:56Z

Related PR: #2381. It can be seen as a first step towards that direction.

Demi-Marie · 2019-05-24T12:24:56Z

I think we should go farther and avoid raw pointers in our APIs.

pepyakin · 2019-05-24T15:59:29Z

Then how would you implement low-level APIs that depend on pointers, e.g. fn malloc(size: u32) -> *const u8;? TBH, I think I am OK with having pointers in the API as long as the bytes type is used whenever possible.

Demi-Marie · 2019-05-27T16:33:07Z

@pepyakin I do not believe that the host needs to provide malloc. When compiled for wasm, the runtime will use its own allocator, and when compiled natively, the runtime will use the system malloc routines.

Demi-Marie · 2019-05-27T16:33:33Z

More specifically, I want to avoid needing unsafe code on the host side, which is a potential security risk.

bkchr · 2019-05-27T18:07:28Z

@demimarie-parity this does not work... We provide our own allocator, but even the allocator interface uses raw pointers....

Pointers are not bad.

pepyakin · 2019-05-27T19:20:08Z

Well, technically the code on the host side won't be unsafe, since it deals only with wasm memory (for reference, take a look at wasm_executor.rs). That said, it is still practically unsafe, since the host can corrupt the wasm memory if due care is not taken. And this is actually what this task proposes to solve (instead of using raw pointers, we would have the bytes type).

For wasm we decided to provide the allocator. See the discussion in #300.
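To illustrate why the host side needs no unsafe for this, here is a minimal sketch that models wasm linear memory as a plain byte vector and reads a `bytes` argument with bounds checks. `LinearMemory`, `MemoryError`, and `read_bytes` are hypothetical names for this example; a real engine exposes its own memory type.

```rust
/// Stand-in for a wasm instance's linear memory (an assumption of this sketch).
struct LinearMemory {
    data: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum MemoryError {
    OutOfBounds,
}

impl LinearMemory {
    /// Copy `len` bytes starting at guest address `ptr` into a host-owned Vec.
    /// A bad (ptr, len) pair from the guest yields an error instead of
    /// touching memory outside the sandbox.
    fn read_bytes(&self, ptr: u32, len: u32) -> Result<Vec<u8>, MemoryError> {
        let start = ptr as usize;
        let end = start
            .checked_add(len as usize)
            .ok_or(MemoryError::OutOfBounds)?;
        self.data
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or(MemoryError::OutOfBounds)
    }
}
```

Generated glue would perform this conversion once per `bytes` parameter, so individual host functions would only ever see a `Vec<u8>` or a slice.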

Demi-Marie · 2019-05-29T01:37:40Z

This is a prerequisite for my work on sandboxing.

pepyakin · 2020-06-25T10:20:18Z

I believe this was implemented in some form.

pepyakin added the J0-enhancement An additional feature request. label Apr 20, 2019

pepyakin added this to the As-and-when milestone Apr 20, 2019

pepyakin mentioned this issue May 20, 2019

Wasm execution engine for runtime #2634

Closed

pepyakin mentioned this issue Jul 13, 2019

Introduce srml/im-online #3079

Merged

pepyakin closed this as completed Jun 25, 2020
