Resource limiting #203
Comments
Given that the server would return an error response to a seemingly valid request, this has to be handled on the client side. I'd suggest implementing a simple retry strategy (with exponential backoff) in the client libraries.
Following @tomusdrw, IMHO it would be very useful if there were distinct responses for the following:
If there were ever a need to write a load balancer and put it in front of multiple instances, such granularity of responses would allow smooth restarts when rolling out new releases (i.e. a lame-duck mode), and an "overloaded" response would enable smarter load balancing and back-off strategies. Still, this is not a priority and not required ASAP; it's just nice to keep it an open topic for the future (i.e. make sure we don't accidentally create roadblocks for those things ;) ).
I like the high-level design a lot! At the lower level, the granular version is basically a vector-of-semaphores pattern.
Kind ping: how is progress on this ticket going? :)
No progress at all; we are currently working on #251, and after that we will start working on this.
Happy to see it assigned to release v0.4. How is this one going? We had discussions related to DDoS attacks recently and I find the design described in this issue relevant :) (especially "The dream solution", as we were discussing exactly that today, i.e. throttling based on a vector of resources instead of a scalar).
A common problem JSON-RPC servers run into is making sure the public-facing API is resilient to DoS attacks (or just plain, unintentional overload). The following are solutions we'd like to see in jsonrpsee, following a chat with @gww-parity and @niklasad1.
MVP (2.0 stable release)
The simplest solution is to provide a single cap on in-flight requests, either globally for the HTTP server or per connection for the WS server, with an additional cap on the number of connections.
This is pretty crude since we can't distinguish between requests that are blocking on CPU, IO, or something else, but it should be a sufficient approximation given a limit set to some factor of the available CPU cores. The implementation is pretty low effort here since we are already using bounded channels internally, so providing an informative error object when the limit is hit would be sufficient. For the WS server, adding a single atomic int to track resources globally should also work fine and not deteriorate performance too much, given we are in microsecond space already due to IO.
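A minimal sketch of that counter, assuming a plain atomic integer guarding request handling (the type and method names here are illustrative, not jsonrpsee internals):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Illustrative guard: tracks in-flight requests against a fixed cap.
struct InFlightLimit {
    current: AtomicUsize,
    max: usize,
}

impl InFlightLimit {
    fn new(max: usize) -> Self {
        Self { current: AtomicUsize::new(0), max }
    }

    /// Try to reserve a slot; returns `false` when the server is at capacity,
    /// in which case the caller replies with a "server busy" error object.
    fn try_acquire(&self) -> bool {
        let mut n = self.current.load(Ordering::Relaxed);
        loop {
            if n >= self.max {
                return false;
            }
            match self.current.compare_exchange_weak(n, n + 1, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => return true,
                Err(actual) => n = actual,
            }
        }
    }

    /// Release the slot once the request has been answered.
    fn release(&self) {
        self.current.fetch_sub(1, Ordering::AcqRel);
    }
}
```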
There is no defined error code for the server being busy, so we'll need to use either -32603 for internal error, or define a custom error code in the -32000 to -32099 range.
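For illustration, a "server busy" error response using a custom code from that range might look like this (the specific code and message are placeholders, not something decided in this issue):

```json
{
  "jsonrpc": "2.0",
  "error": {
    "code": -32000,
    "message": "Server is busy, try again later"
  },
  "id": 1
}
```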
The dream solution (undefined future release)
Down the line we should be able to provide more granular control over resources for advanced users. To make this performant at runtime we need to make sure that we are using a builder pattern for method definitions before starting a server (already the case for WS).
The user should be able to register a resource on the server, similarly to the way they register a method. For each resource the user defines its name as a string, an arbitrary integer value that reflects the maximum capacity of that resource, and the default cost, e.g.:
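A rough sketch of what that registration could look like; the builder and method names below are assumptions for illustration, not an actual jsonrpsee API:

```rust
// Hypothetical builder API, for illustration only. Each resource is registered
// with a string name, a maximum capacity, and a default per-call cost.
let mut builder = ServerBuilder::new();
builder.register_resource("cpu", 8, 1);  // capacity 8, each call costs 1 unit by default
builder.register_resource("io", 100, 1); // capacity 100, each call costs 1 unit by default
```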
Each method registered on the builder would default to having all of its resource costs set to the provided defaults. Alternatively, users should be able to set individual resource costs, either using a map or with some method chaining in the API, e.g.:
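Continuing the hypothetical sketch above, a per-method override via method chaining could look like this (a map-based variant would work equally well):

```rust
// Hypothetical API: this method charges more than the default costs.
builder
    .register_method("storage_query", |_params| "result")
    .resource("cpu", 2)  // 2 "cpu" units per call instead of the default 1
    .resource("io", 5);  // 5 "io" units per call instead of the default 1
```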
The limits are then checked (naively, globally) for each defined resource before triggering the request, and if any is exhausted an error object is returned, the same as in the MVP implementation.
To keep things fast at runtime we map each defined resource name to an index in an array (there should be a limit on how many resources a user can register), so in the examples above "cpu" would be 0 and "io" would be 1. At runtime each request would then simply have a fixed-size array of integers (constructed when the builder is finalized) to check against a global array of limits, which should be fairly quick and easy to optimize with vector instructions if need be (assuming the compiler doesn't already auto-vectorize it).
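A sketch of that runtime check under the assumptions above (the array size, atomics, and names are illustrative, not a committed design):

```rust
use std::sync::atomic::{AtomicU16, Ordering};

// Resource names are mapped to indices when the builder is finalized, so the
// per-request check is just a comparison of two fixed-size integer arrays.
const MAX_RESOURCES: usize = 8; // cap on how many resources a user can register

struct ResourceTable {
    limits: [u16; MAX_RESOURCES],       // capacity per resource, e.g. [8, 100, 0, ...]
    in_use: [AtomicU16; MAX_RESOURCES], // units currently claimed per resource
}

impl ResourceTable {
    /// Try to claim `costs[i]` units of every resource; roll back and fail if
    /// any resource is exhausted, in which case the caller returns the same
    /// "server busy" error object as in the MVP.
    fn try_claim(&self, costs: &[u16; MAX_RESOURCES]) -> bool {
        for i in 0..MAX_RESOURCES {
            let claimed = self.in_use[i].fetch_add(costs[i], Ordering::AcqRel) + costs[i];
            if claimed > self.limits[i] {
                // Roll back everything claimed so far, including this slot.
                for j in 0..=i {
                    self.in_use[j].fetch_sub(costs[j], Ordering::AcqRel);
                }
                return false;
            }
        }
        true
    }

    /// Release the claimed units once the request has completed.
    fn release(&self, costs: &[u16; MAX_RESOURCES]) {
        for i in 0..MAX_RESOURCES {
            self.in_use[i].fetch_sub(costs[i], Ordering::AcqRel);
        }
    }
}
```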