Skip to content
This repository has been archived by the owner on Aug 2, 2021. It is now read-only.

Snapshot test #1284

Closed
wants to merge 17 commits into from
Closed

Snapshot test #1284

wants to merge 17 commits into from

Conversation

gluk256
Copy link

@gluk256 gluk256 commented Mar 6, 2019

Tests the message propagation in simulated networks.

supercedes #1016

closes #987

@gluk256 gluk256 requested a review from nolash March 6, 2019 11:54
@gluk256 gluk256 self-assigned this Mar 6, 2019
@gluk256 gluk256 requested a review from holisticode March 6, 2019 11:54
Copy link
Contributor

@holisticode holisticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Some of my comments are redundant, and some are just about asking more comments / docs.

I can't really assess the semantics of the tests though, I lack understanding of what the desired behavior really is, also taking into consideration @nolash 's comment in gitter about the issues of these tests not being overcome, I don't know what else needs to be done

var prev int

// loop through all nodes and add the message to recipient indices
for _, nod := range sim.Net.GetNodes() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few implicit assumptions here which are not clear for someone not really informed about what needs to be done (including me). Please add more documentation of what's going on

}

if po >= depth {
maxMessages++
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, why increase the number of maxMessages only if po >= depth?

Copy link
Author

@gluk256 gluk256 Mar 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because in this case we expect the msg to be delivered to this node

allowedMsgs[nod.ID()] = append(allowedMsgs[nod.ID()], uint64(i))
}

// a node with the smallest PO (wrt msg) will be the sender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? Why is this?

Copy link
Author

@gluk256 gluk256 Mar 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to see if the msg will propagate properly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@holisticode or in other words; we want to maximize the distance the message has to travel. Therefore we select the node that's farthest away from the message address.

return snap
}

func assingTestVariables(sim *simulation.Simulation, msgCount int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the logic of when a node is supposed to receive a message and when not. This should be described somewhere (apologies if it's there and I couldn't find or understand it)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The test needs a very clear description on the feature it tests and how it tests it. This is pretty complicated stuff.

// however, it might just mean that not all possible messages are received
// now we must check if all required messages are received
log.Debug("--------------------------------------------------------------------------------", "rcv", msgCnt)
if msgCnt < msgsToReceive {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the difference between msgsToReceive and maxMessages

return snap
}

func assingTestVariables(sim *simulation.Simulation, msgCount int) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assingTestVariables
?

}

func TestProxNetwork(t *testing.T) {
if (*runNodes > 0 && *runMessages == 0) || (*runMessages > 0 && *runNodes == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set proper defaults, no need to check

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is needed in case of invalid explicit input

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add the flag in the actual test itself, since these flags are only relevant for this specific test - unlike the pss_test.go flags, which apply to the full package test. Keep in mind you will need flag.Parse() too in that case (you get it for free now with pss_test.go:init()

t.Logf("completed %d", result.Duration)
}

func sendAllMsgs(sim *simulation.Simulation, msgs [][]byte, senders map[int]enode.ID) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use rpc here? just 'bucket' the pss object

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, i am not sure. i inherited this code, and it works :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zelig all the send and network tests in pss_test.go use RPC. If we are to change this, please let's change it in one go when we clean up the test files, which is a pending task anyway.

However, let's also keep in mind that most usage of Pss is indeed through RPC. Thus I don't think it's unreasonable to use RPC calls in these higher level tests.

}
}
}
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unreacable code?

Copy link
Author

@gluk256 gluk256 Mar 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some compilers might not detect this, and wihtout this statement they will complain that function does not return value.

select {
case <-doneC: // graceful exit
setDone()
errC <- nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will block if doneC is closed on line 263, no?

@@ -0,0 +1,433 @@
package pss
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this filename is incorrect. only testing prox

if err := p.enqueue(pssmsg); err != nil {
return err
}
if len(pssmsg.To) < addressLength || prox {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can submit this fix separately if you want quick result :) the rest will take a bit...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I warmly encourage this, too.

// within their nearest neighborhood depth, and stores them as recipients.
// Upon sending the messages, it verifies that the respective message is passed to the message handlers of these recipients.
// It will fail if a recipient handles a message it should not, or if after propagation not all expected messages are handled (timeout)
func testProxNetwork(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your test is functionally LGTM but i feel it could be improved and simplified.

  • global vars
  • RPC vs API call via bucket object
  • snapshot handling?
  • flags
  • conn labels

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur on global vars, snapshot handling and flags. My opinions given in individual comments.

mu sync.Mutex // keeps handlerDonc in sync
sim *simulation.Simulation

handlerDone bool // set to true on termination of the simulation run
Copy link
Contributor

@nolash nolash Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you still are using globals in the test. I've already given my opinion on this; I think it is risky design, and I have even gotten scorn from the golang irc channel for doing the same myself at some point (code smell, I believe they called it).

At a minimum I would recommend creating a struct that can encapsulate the state and all methods accessing the state (isDone(), setDone()), which every relevant method gets passed. I will not recommend merging this PR without this amendment.

I would also recommend that we look at how to refactor this test to fit the action/trigger/expectation paradigm of p2p/simulations.go:Simulation but I recommend that can be a separate PR after this.

t.Logf("completed %d", result.Duration)
}

func sendAllMsgs(sim *simulation.Simulation, msgs [][]byte, senders map[int]enode.ID) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zelig all the send and network tests in pss_test.go use RPC. If we are to change this, please let's change it in one go when we clean up the test files, which is a pending task anyway.

However, let's also keep in mind that most usage of Pss is indeed through RPC. Thus I don't think it's unreasonable to use RPC calls in these higher level tests.

@@ -96,3 +98,100 @@ func (s *Simulation) kademlias() (ks map[enode.ID]*network.Kademlia) {
}
return ks
}

func (s *Simulation) WaitTillSnapshotRecreated(ctx context.Context, snap simulations.Snapshot) error {
expected := listSnapshotConnections(snap.Conns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use prefix get not list ... we don't seem to use "list" for this behavior elsewhere:

find $GOPATH/src/github.com/ethereum/go-ethereum/swarm/ -iname "*.go" -exec grep -P "list\S*\(" {} \;

func isAllDeployed(expected []uint64, actual []uint64) bool {
exp := make([]uint64, len(expected))
copy(exp, expected)
if len(exp) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cosmetic nitpick; rather check len(expected) before anything else

allowedMsgs[nod.ID()] = append(allowedMsgs[nod.ID()], uint64(i))
}

// a node with the smallest PO (wrt msg) will be the sender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@holisticode or in other words; we want to maximize the distance the message has to travel. Therefore we select the node that's farthest away from the message address.


msgsToReceive += len(targets)
for _, id := range targets {
recipients[i] = append(recipients[i], id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't you assign directly to recipients[] above? (if you implement a struct to hold the state this will be a single method call to update both arrays).

Copy link
Author

@gluk256 gluk256 Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because in some cases targets might be reset (if new closest is found).
please see line 157:

targets = nil

po, _ := pof(msgs[i], nodeAddrs[nod.ID()], 0)
depth := kademlias[nod.ID()].NeighbourhoodDepth()

// only nodes with closest IDs (wrt msg) will receive the msg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only nodes with closest IDs

This doesn't really explain the distinction between "target" and "allowed." Can you please be more specific?

if err != nil {
t.Fatalf("failed to recreate snapshot: %s", err)
}
assingTestVariables(sim, msgCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct this typo; assing -> assign

case hn := <-msgC:
received++
log.Debug("msg received", "msgs_received", received, "total_expected", msgsToReceive, "id", hn.id, "serial", hn.serial)
if received >= maxMessages {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we allow received > maxMessages?

@nolash
Copy link
Contributor

nolash commented Mar 8, 2019

For the record, during meeting yesterday the neighborhood reciprocity issue was discussed with @zelig, and we agreed that indeed the functionality tested here - where only the closest peer to the message can be guaranteed recipient - is the best we can do, at least according to the thinking of the current time.

@@ -96,3 +98,107 @@ func (s *Simulation) kademlias() (ks map[enode.ID]*network.Kademlia) {
}
return ks
}

// WaitTillSnapshotRecreated is blocking until all the connections specified
// in the snapshot are actually up and running.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually not quite precise. It's until all the connections are registered in the kademlia*

for _, c := range conns {
res = append(res, getConnectionHash(c.One, c.Other))
c.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

if err != nil {
t.Fatal(err)
}
ctx, cancel := context.WithTimeout(context.Background(), time.Second*3)
ctx, cancel := context.WithTimeout(context.Background(), time.Second*60)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need different timeouts for different tests params?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see how i can do better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any idea how long those longrunning tests take to complete?

func runFunc(ctx context.Context, sim *simulation.Simulation) error {
go handlerChannelListener(ctx)
go sendAllMsgs(sim, msgs, senders)
func runFunc(tstdata *testData, ctx context.Context, sim *simulation.Simulation) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment the individual functions.

if err != nil {
t.Fatalf("failed to recreate snapshot: %s", err)
}
assingTestVariables(sim, msgCount)
result := sim.Run(ctx, runFunc)
initializeTestData(&tstdata, msgCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please consider inline comments in the tests. Here, for example,

// initialize and run the test

it may seem mundane, but I find it really helps readability.

log.Debug("msg received", "msgs_received", received, "total_expected", tstdata.requiredMessages, "id", hn.id, "serial", hn.serial)
if received == tstdata.allowedMessages {
tstdata.doneC <- struct{}{}
close(tstdata.doneC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it enough to merely close here?

Copy link
Member

@zelig zelig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -96,3 +98,106 @@ func (s *Simulation) kademlias() (ks map[enode.ID]*network.Kademlia) {
}
return ks
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we should submit kademlia changes as a separate PR tracking as #1298

return int(msgCount), int(nodeCount)
}

func readSnapshot(t *testing.T, nodeCount int) simulations.Snapshot {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this not move under swarm/network/simulation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, this is just a helper function, only relevant to this particular test, and not needed otherwise.

func initializeTestData(d *testData, msgCount int) {
log.Debug("TestProxNetwork start")
d.nodeAddrs = make(map[enode.ID][]byte)
d.recipients = make(map[int][]enode.ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes make it in the constructor?


// Here we test specific functionality of the pss, setting the prox property of
// the handler. The tests generate a number of messages with random addresses.
// Then, for each message it calculates which nodes in the network the msg address
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which nodes in the network have the msg

// recipients. The difference between allowed and required recipients results
// from the fact that the nearest neighbours are not necessarily reciprocal.
// Upon sending the messages, the test verifies that the respective message is
// passed to the message handlers of these required recipients. Test will fail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test fails if

}

// runFunc is the main test function, called by Simulation.Run()
func runFunc(tstdata *testData, ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we rename this?

t.Logf("completed %d", result.Duration)
}

func sendAllMsgs(tstdata *testData) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be a func with testData receiver no?


if !isAllDeployed(b, c) {
t.Fatal("isAllDeployed failed")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and the positive? :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition remains valid (if i understand your question correctly)

func initializeTestData(d *testData, msgCount int) {
log.Debug("TestProxNetwork start")
d.nodeAddrs = make(map[enode.ID][]byte)
d.recipients = make(map[int][]enode.ID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func newTestData() *tstData {
    return &tstData{
        d.nodeAddrs: make(map[enode.ID][]byte),
        d.recipients: make(map[int][]enode.ID),
        d.allowed: make(map[int][]enode.ID),
        d.expectedMsgs: make(map[enode.ID][]uint64),
        d.allowedMsgs: make(map[enode.ID][]uint64),
        d.senders: make(map[int]enode.ID),
        handlerC: make(chan handlerNotification),
        doneC: make(chan struct{}),
        errC: make(chan error),
        msgC:  make(chan handlerNotification),
        kademlias: map[enode.ID]*network.Kademlia,
    }
}

func (d *tstData) init() {
    log.Debug("TestProxNetwork start")
    for _, nodeId := range d.sim.NodeIDs() {
            d.nodeAddrs[nodeId] = nodeIDToAddr(nodeId)
    }
    [....]
}

func testProxNetwork(t *testing.T) {
    testData := newTestData()
    [...]
}

Copy link
Contributor

@nolash nolash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience :)

@zelig
Copy link
Member

zelig commented Mar 15, 2019

@gluk256 approved as is, but consider following up with #1298

@zelig
Copy link
Member

zelig commented Mar 16, 2019

merged as ethereum/go-ethereum#19278

@zelig zelig closed this Mar 16, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PSS neighbourhood addressing simulation
4 participants