-
Notifications
You must be signed in to change notification settings - Fork 2
Add Test Suite for Long Running Tests #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
yacovm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made a quick pass, I think it's pretty useful overall. we can build something on top that will generate random test cases and just run this.
Will make another pass later.
| } | ||
| } | ||
|
|
||
| func (n *LongRunningInMemoryNetwork) waitUntilAllRoundsEqual() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this function return even if no blocks were committed? e.g at round 0?
Shouldn't we perhaps pass in some kind of predicate on the round / sequence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, i'm using it as more of a helper function for StopAndAssert but I can see a predicate being helpful if we decide to expose this function.
| } | ||
|
|
||
| // AssertHealthy checks that the WAL has at most one of each record type per round. | ||
| func (tw *TestWAL) AssertHealthy(bd simplex.BlockDeserializer, qcd simplex.QCDeserializer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we have a notarization and an empty notarization in the WAL? via replicating one of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea we can have one of each for the same round, but we should never have two of the same one for the same round.
ca4efa4 to
b5ec286
Compare
| } | ||
|
|
||
| // compare finalizations | ||
| if item.Finalization.Finalization.Digest != otherItem.Finalization.Finalization.Digest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just use: item.Finalization.Bytes() to compare the two?
| func TestLongRunningReplication(t *testing.T) { | ||
| net := testutil.NewDefaultLongRunningNetwork(t, 10) | ||
| for _, instance := range net.Instances { | ||
| instance.SilenceExceptKeywords("Received replication response", "Resending replication requests for missing rounds/sequences") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we're doing this. There is nothing in the test that I see that requires intercepting the log, so why do we care that it's printed?
Can you explain?
| net := testutil.NewDefaultLongRunningNetwork(t, 10) | ||
| for i, instance := range net.Instances { | ||
| if i == 3 { | ||
| instance.SilenceExceptKeywords("WAL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we're silencing the logs. Is that to make the test less flaky or something?
And why aren't we silencing the WAL pattern here?
| l *TestLogger | ||
| t *testing.T | ||
| BB ControlledBlockBuilder | ||
| messageTypesSent map[string]uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're only incrementing the message types but never reading them... why do we need this?
| msg *simplex.Message | ||
| from simplex.NodeID | ||
| }, 1000)} | ||
| }, 100000)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the tests not work with the previous buffer size?
| } | ||
|
|
||
| amount := simplex.DefaultEmptyVoteRebroadcastTimeout / 5 | ||
| go n.UpdateTime(100*time.Millisecond, amount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UpdateTime iterates over instances without taking a lock, but we may re-create an instance via RestartNodes. Isn't it unsafe from a concurrency aspect?
| } | ||
|
|
||
| func (n *LongRunningInMemoryNetwork) RestartNodes(nodeIndexes ...uint64) { | ||
| for _, idx := range nodeIndexes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need to stop the previous instance? I can't find where we're doing it.
| instance.Stop() | ||
| } | ||
|
|
||
| // // print summary of messages sent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need these commented out code lines?
This PR adds a functionality for a different style of testing. It allows the tester to spin up tests where they control the behavior of the network rather than the behavior of individual nodes. This framework can test Simplex in ways that would have otherwise been to tedious and cumbersome to test. For example this new simple test allows us to spin up a network of 10 nodes, wait for them to enter a specific round, disconnect a few of them, and then reconnect at a later time. At the end of the test, we assert that all nodes are properly functioning.
Before this would have required a tremendous amount of boilerplate code, and specific wherewithal to properly orchestrate replication, block building, etc..
This new struct wraps InMemNetwork, so all the previous functionality of InMemNetwork can still be used in these tests. However, we add an additional set of helper functions