
Resource limiting #203

Closed
maciejhirsz opened this issue Feb 9, 2021 · 7 comments · Fixed by #500

Comments

@maciejhirsz
Contributor

maciejhirsz commented Feb 9, 2021

A common problem JSON-RPC servers run into is making sure the public-facing API is resilient to DoS attacks (or just plain, unintentional overload). The following are solutions we'd like to see in jsonrpsee, based on a chat with @gww-parity and @niklasad1.

MVP (2.0 stable release)

The simplest solution is to provide a single cap on in-flight requests, either globally for the HTTP server or per connection for the WS server, with an additional cap on the number of connections.

This is pretty crude, since we can't distinguish between requests that are blocking on CPU, IO, or something else, but it should be a sufficient approximation given a limit set to some factor of the available CPU cores. The implementation is pretty low-effort here since we are already using bounded channels internally, so providing an informative error object when the limit is hit would be sufficient. For the WS server, adding a single atomic int to track resources globally should also work fine and not deteriorate performance too much, given we are already in microsecond territory due to IO.

There is no defined error code for the server being busy, so we'll need to use either -32603 for internal error, or define a custom error code in the -32000 to -32099 range.
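
A minimal sketch of what that counter could look like, assuming a custom busy code in the -32000 to -32099 range (SERVER_BUSY, InFlight, and the guard type are illustrative names, not jsonrpsee API):

use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Illustrative busy code in the implementation-defined -32000..=-32099 range.
const SERVER_BUSY: i32 = -32000;

/// Global (or per-connection) in-flight request counter with a fixed cap.
struct InFlight {
    current: AtomicUsize,
    limit: usize,
}

/// Guard that frees the slot when the request finishes (on drop).
struct InFlightGuard(Arc<InFlight>);

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        self.0.current.fetch_sub(1, Ordering::AcqRel);
    }
}

/// Try to claim a slot; return the busy error code if the cap is hit.
fn try_acquire(counter: &Arc<InFlight>) -> Result<InFlightGuard, i32> {
    let prev = counter.current.fetch_add(1, Ordering::AcqRel);
    if prev >= counter.limit {
        counter.current.fetch_sub(1, Ordering::AcqRel);
        return Err(SERVER_BUSY);
    }
    Ok(InFlightGuard(counter.clone()))
}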

The dream solution (undefined future release)

Down the line we should be able to provide more granular control over resources for advanced users. To make this performant at runtime we need to make sure that we are using a builder pattern for method definitions before starting a server (already the case for WS).

The user should be able to register a resource on the server, similarly to the way they register a method. For each resource the user defines its name as a string, an arbitrary integer limit that reflects the maximum capacity of that resource, and a default cost, e.g.:

 //                       label, limit, default cost
builder.register_resource("cpu", 100, 1);
builder.register_resource("io", 500, 10);

Each method registered on the builder would default to having all of its resource costs set to the provided defaults. Alternatively, users should be able to set individual resource costs, either using a map or with some method chaining in the API, e.g.:

builder.register_method("cpu_heavy", ...).set_resource("cpu", 20);

The limits are then checked (naively, globally) for each defined resource before triggering the request; if any is exhausted, an error object is returned, same as in the MVP implementation.

To keep things fast at runtime we map each defined resource name to an index in an array (there should be a limit to how many resources a user can register), so in the examples above "cpu" would be 0 and "io" would be 1. At runtime each request would then simply have a fixed-size array of integers (constructed when the builder is finalized) to check against a global array of limits, which should be fairly quick and easy to optimize with vector instructions if need be (assuming the compiler doesn't auto-vectorize it already).
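
A rough sketch of that runtime check, assuming an illustrative cap of eight registered resources; ResourceTable, try_claim, and release are hypothetical names, not part of any existing API:

use std::sync::atomic::{AtomicU64, Ordering};

const MAX_RESOURCES: usize = 8; // illustrative cap on registered resources

/// Built when the builder is finalized: one slot per registered resource,
/// so "cpu" is index 0 and "io" is index 1 in the examples above.
struct ResourceTable {
    limits: [u64; MAX_RESOURCES],
    used: [AtomicU64; MAX_RESOURCES],
}

impl ResourceTable {
    /// Try to claim a method's per-resource costs; roll back on failure.
    fn try_claim(&self, costs: &[u64; MAX_RESOURCES]) -> bool {
        for i in 0..MAX_RESOURCES {
            let prev = self.used[i].fetch_add(costs[i], Ordering::AcqRel);
            if prev + costs[i] > self.limits[i] {
                // Undo this claim and every earlier one before bailing out.
                for j in 0..=i {
                    self.used[j].fetch_sub(costs[j], Ordering::AcqRel);
                }
                return false;
            }
        }
        true
    }

    /// Release the costs once the request completes.
    fn release(&self, costs: &[u64; MAX_RESOURCES]) {
        for i in 0..MAX_RESOURCES {
            self.used[i].fetch_sub(costs[i], Ordering::AcqRel);
        }
    }
}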

@tomusdrw
Contributor

tomusdrw commented Feb 9, 2021

There is no defined error code for the server being busy,

Given that the server would return an error response to a seemingly valid request, this has to be handled on the client side.

I'd suggest implementing a simple retry (with exponential backoff) strategy in the client libraries of jsonrpsee as part of this issue as well.
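
As a sketch of what such a retry strategy could look like on the client side (call_with_backoff and the bare error-code plumbing are hypothetical; a real implementation would match jsonrpsee's actual client error type):

use std::time::Duration;

/// Illustrative busy code; must match whatever the server returns.
const SERVER_BUSY: i32 = -32000;

/// Retry `call` with exponential backoff while the server reports being busy.
async fn call_with_backoff<T, F, Fut>(mut call: F, max_retries: u32) -> Result<T, i32>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, i32>>,
{
    let mut delay = Duration::from_millis(50);
    for _ in 0..max_retries {
        match call().await {
            Err(code) if code == SERVER_BUSY => {
                tokio::time::sleep(delay).await;
                delay = delay.saturating_mul(2); // double the wait each attempt
            }
            other => return other,
        }
    }
    call().await // final attempt, propagating the error as-is
}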

@gww-parity

gww-parity commented Feb 9, 2021

Following @tomusdrw, IMHO it would be very useful to have different responses for the following:

  • each kind of error
  • lameduck -> i.e. "I am not receiving requests anymore" -> useful when you want to shut down a server: first you turn it into lameduck mode so it does not accept new requests, then you wait until it finishes the in-flight ones, and shut down/restart once it reaches 0 in-flight requests
  • overloaded -> like lameduck mode, but signalling temporary overload rather than an impending shutdown

If there is ever a need to write a load balancer and put it in front of multiple instances, such granularity of responses allows smooth restarts when rolling out new releases during release engineering (I mean lameduck mode), while the overloaded response enables smarter load balancing and backoff strategies.

Still, not a priority, not required ASAP, just nice to keep it an open topic for the future (i.e. make sure we don't accidentally create roadblocks for those things ;) ).

@gww-parity

I like the high-level design a lot!

Low-level wise, the lower granularity is basically the pattern of a vector of semaphores.
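
For illustration, a minimal sketch of that pattern using tokio's Semaphore (the function names and cost layout are made up for the example):

use tokio::sync::{Semaphore, SemaphorePermit};

/// One semaphore per registered resource; permits encode remaining capacity.
/// Limits mirror the `register_resource` example above ("cpu" = 100, "io" = 500).
fn build_semaphores(limits: &[usize]) -> Vec<Semaphore> {
    limits.iter().map(|&limit| Semaphore::new(limit)).collect()
}

/// Claim each resource's cost before running a method; permits are released
/// automatically when the returned guards are dropped.
fn try_claim<'a>(
    semaphores: &'a [Semaphore],
    costs: &[u32],
) -> Option<Vec<SemaphorePermit<'a>>> {
    let mut permits = Vec::with_capacity(costs.len());
    for (sem, &cost) in semaphores.iter().zip(costs) {
        match sem.try_acquire_many(cost) {
            Ok(permit) => permits.push(permit),
            Err(_) => return None, // exhausted: reply with the busy error instead
        }
    }
    Some(permits)
}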

@gww-parity

Kind ping, how is progress on this ticket going? :)

@niklasad1
Member

No progress at all; currently we are working on #251, after that we will start working on this.

@gww-parity

I see #251, can't wait for #203 to be back on the table ;)

@gww-parity

Happy to see it assigned to release v0.4. How is it going on this one? We had discussions related to DDoS attacks recently and I find the design proposed in this issue relevant :) (especially "The dream solution", as we were discussing this today, i.e. throttling based on a vector of resources instead of a scalar).
