Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconcile differences between threshold calculation here and in paper. #23

Open
rphmeier opened this issue Nov 1, 2018 · 3 comments
Open

Comments

@rphmeier
Copy link
Contributor

rphmeier commented Nov 1, 2018

The paper uses (n+f+1)/2 and we use n-f -- for small authority sets this is
a tradeoff in safety and liveness.

cc @AlistairStewart

@romanb
Copy link
Contributor

romanb commented Oct 20, 2019

Could you elaborate on the tradeoff between safety and liveness? I mean, for n < 4 where f = 0, we have

n = 1 ==> threshold >= (n + f + 1) / 2 = (1 + 0 + 1) / 2 = 1
n = 2 ==> threshold >= (n + f + 1) / 2 = (2 + 0 + 1) / 2 = 3/2 (so threshold >= 2)
n = 3 ==> threshold >= (n + f + 1) / 2 = (3 + 0 + 1) / 2 = 2

and for n > 3 where f >= 1, we have

n >= 3f + 1 ==> (n + f + 1) / 2 >= (4f + 2) / 2 = 2f + 1

so >= (n + f + 1) / 2 is always a safe threshold for a supermajority, isn't it? Using n - f seems to unnecessarily require one more vote for a supermajority than necessary whenever n is a multiple of 3.

@rphmeier
Copy link
Contributor Author

rphmeier commented Oct 21, 2019

n = 3 ==> threshold >= (n + f + 1) / 2 = (3 + 0 + 1) / 2 = 2

Well, you're probably right, but I don't think it makes much difference in practice. If you get that 1 extra misbehaving validator, n - f keeps you safe but not live (as they can withhold and prevent quorum).

Liveness failures in GRANDPA are easier to recover from than safety failures. We could make this configurable, for those who'd like to make a different trade-off.

@romanb
Copy link
Contributor

romanb commented Oct 22, 2019

I see, thanks. So to summarise: Both n - f and (n + f + 1) / 2 can tolerate (w.r.t. both safety and liveness) the same number of byzantine faults for any n, but whenever n is a multiple of 3, n - f makes a preference for safety in the face of one more byzantine fault at the cost of liveness if the fault is indeed non-byzantine (e.g. crash fault) whereas (n + f + 1) / 2 makes a preference for liveness as it can tolerate one more non-byzantine fault (e.g. with n = 3 it can make progress with supermajority 2, at n = 6 with supermajority 4 etc).

I'm not sure that distinction is really worth deviating from the paper, mainly because any deviation that is not very prominently documented is a potential cause of confusion for anyone reviewing the code, but if n - f is supposed to stay in the implementation, then obviously the only option for reconciliation (and addressing this issue) is to change the paper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants