Skip to content
This repository has been archived by the owner on Jul 27, 2022. It is now read-only.

Problem: no safe stake tx retry logic in client-* #1293

Closed
tomtau opened this issue Mar 24, 2020 · 12 comments
Closed

Problem: no safe stake tx retry logic in client-* #1293

tomtau opened this issue Mar 24, 2020 · 12 comments
Assignees
Milestone

Comments

@tomtau
Copy link
Contributor

tomtau commented Mar 24, 2020

as per crypto-org-chain/chain-docs#112 (comment)
if the amounts or core items (e.g. jailed) did not change, a new tx against an updated nonce may be re-created

@tomtau
Copy link
Contributor Author

tomtau commented Mar 26, 2020

some ideas for solution parts:

  • use "tendermint transaction hash" (different from chain txid) and keep track of it (and use e.g. tx RPC endpoint) OR make sure TM is configured to index chain txid
  • assigning non-zero numbers to validation errors and returning those in deliver_tx/check_tx and checking that from block_results or other RPC call
  • on nonce fail, retrieving latest state, checking balance (or what needed) and recreating transaction if safe to do

@yihuang
Copy link
Collaborator

yihuang commented Mar 26, 2020

I think it's more flexible to provide build_* API to build raw transactions, client has the freedom to use different broadcast API to send. For this issue, the client can use broadcast_commit to get check/deliver results immediately.

loop:
  raw_tx = client_rpc.build_send_tx(...)
  result = tendermint_rpc.broadcast_commit(raw_tx)
  if !is_non_mismatch(result):
     break

EDIT: there'll be issues with utxo state, and pending logic.

@tomtau
Copy link
Contributor Author

tomtau commented Mar 26, 2020

For development / integration tests, broadcast_commit may be fine.
In RPC docs, it says though:

IMPORTANT: use only for testing and development. In production, use BroadcastTxSync or BroadcastTxAsync. You can subscribe for the transaction result using JSONRPC via a websocket. See https://docs.tendermint.com/master/app-dev/subscribing-to-events-via-websocket.html

@yihuang
Copy link
Collaborator

yihuang commented Mar 26, 2020

Currently, when the transaction failed in abci, what happens to the related utxo state and pending state in client? They won't recover I think?

@tomtau
Copy link
Contributor Author

tomtau commented Mar 26, 2020

I think it gets removed after some number of blocks since the recorded pending info? @linfeng-crypto

@yihuang
Copy link
Collaborator

yihuang commented Mar 26, 2020

After some further thought, fetch nonce and retry the transaction, if the node connected to is not trusted, it'll again subject to the replay attack, because the node can retain the transactions signed against different nonces?

@yihuang
Copy link
Collaborator

yihuang commented Mar 26, 2020

Internal staking mutations and nonce

Internal staking mutations cases

  • Validator rewarding
  • Validator punishments(slash and jail)
  • Validator info cleanup

Increase nonce or not

When internal staking mutation happens, should we increase the nonce or not?

Both can prevent replay attacks.

Increase

The semantic of the nonce is the update counter of staking state.

Pros

  • The staking state is guaranteed to be unmodified between you query the state and execute the transaction.

Cons

  • From the point of view of the client, the transaction could fail randomly.

    And blindly re-fetch the nonce and re-sign the transaction might subject to re-play attack.

Not increase

At the same time, add amount field to withdraw transaction to do a sanity check (optional), if it's not equal to the actual amount, abort the transaction.

The semantic of the nonce is the number of transactions sent from a given address,
e.g. deposit don't increase the nonce of receiving address, only withdraw/unbond/unjail/joinnode increase the nonce.

From another angle, the nonce only increased if the tx has a witness for this staking address, e.g. only the owner is able to change it.

Pros

  • From my understanding, this is in line with ethereum logic.

  • From the point of view of the client, the nonce is deterministic.

    The transaction might still fail unintentionally because of internal balance changes, but it's less frequent, and the operator should not retry blindly in these cases.

  • The client can be more efficient to cache and increase nonce locally, rather than query nonce every time when preparing transactions of related tx types. It might be quite important for sending multiple sequential transactions effiently.

Cons

  • ?

@tomtau
Copy link
Contributor Author

tomtau commented Mar 27, 2020

After some further thought, fetch nonce and retry the transaction, if the node connected to is not trusted, it'll again subject to the replay attack, because the node can retain the transactions signed against different nonces?

only "nonce" -- only one of them is valid -- there's no replay.

the client runs the light client verification -- altogether with retrieving the staked state, it should also ask for the inclusion proof:

  • full node generates it with generate_inclusion_proof
  • light client verify_inclusion_proof

(in the light client verification, the client needs to keep track of the trie root when verifying app hash)

I guess it's not doing it right now?

@tomtau
Copy link
Contributor Author

tomtau commented Mar 27, 2020

Internal staking mutations and nonce

probably docs can be updated and code that nonce here will be the "(from) transaction counter" (it was like that before it got more complex) -- cons are some other things may still go wrong from the point of the client, so it's a question whether to also include e.g. expected input amount in tx payload that would get cross-checked.

increasing nonce only for withdraw/unbond/unjail/joinnode makes sense

@yihuang
Copy link
Collaborator

yihuang commented Mar 27, 2020

After some further thought, fetch nonce and retry the transaction, if the node connected to is not trusted, it'll again subject to the replay attack, because the node can retain the transactions signed against different nonces?

only "nonce" -- only one of them is valid -- there's no replay.

the client runs the light client verification -- altogether with retrieving the staked state, it should also ask for the inclusion proof:

  • full node generates it with generate_inclusion_proof
  • light client verify_inclusion_proof

(in the light client verification, the client needs to keep track of the trie root when verifying app hash)

I guess it's not doing it right now?

Yeah, client hasn't verified the inclusion proof of staking state yet.

@tomtau
Copy link
Contributor Author

tomtau commented Mar 27, 2020

ok, opened: #1313

@yihuang
Copy link
Collaborator

yihuang commented Mar 27, 2020

nonce logic update:
The spec is changed in crypto-org-chain/chain-docs#116.
The implementation is changed in #1142.

@yihuang yihuang closed this as completed Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants