Skip to content

Commit 674d16d

Browse files
Address Frisby's review (#56)
This PR addresses #55, including some of the proposed changes and other minor tweaks.
1 parent b4de020 commit 674d16d

File tree

1 file changed

+62
-50
lines changed

1 file changed

+62
-50
lines changed

docs/design.md

Lines changed: 62 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ prefix of the current chain. Up to now, this has been achieved in the
99
following ways:
1010

1111
- Every node in the network is running (close to) the same code.
12-
- Consensus testing (living inside [`io-sim`](https://hackage.haskell.org/package/io-sim),
12+
- Consensus testing (living inside [`io-sim`](https://hackage.haskell.org/package/io-sim),
1313
a Haskell IO simulator) validates that all (honest) nodes will eventually
1414
reach consensus.
1515

@@ -30,8 +30,8 @@ stack correctly.
3030

3131
Correctness here is an *extremely important* property---much more so than in
3232
most software projects. Nodes failing to agree on the correct chain risks an
33-
accidental hard fork. Should one persist long enough it might potentially be
34-
unrecoverable.
33+
accidental hard fork. Should one persist long enough the protocol might be
34+
unable to recover without external/etc intervention.
3535

3636
This document gives a design for a suite of tools, and the necessary
3737
infrastructure changes, to expose these existing tests in a form that
@@ -70,11 +70,11 @@ colluding) peers. After evaluation of the point schedule, the Node Under Test
7070
that the peer should send to the NUT. See the relevant documentation
7171
[here](https://github.com/IntersectMBO/ouroboros-consensus/blob/374ef153e20d83ad3d42d850ce560b67034ac578/ouroboros-consensus-diffusion/test/consensus-test/Test/Consensus/PointSchedule/SinglePeer.hs).
7272

73-
Whilst the point schedule currently is implemented inside the Haskell node, its
73+
Whilst the point schedule currently is implemented inside the Haskell node's test suite, its
7474
declarative nature makes it possible to export this testing method and make it
7575
usable across diverse node implementations. To ensure this, we will look only
7676
at the messages sent over the network, to ensure we are performing black-box
77-
testing. It will also be possible for alternate nodes to use peer simulation
77+
testing. It will also be possible for alternative nodes to use peer simulation
7878
for white-box testing in cases that depend on internal tracing (eg. file
7979
handles, memory usage, etc.). This suite of tools aims only at properties
8080
related to test conformance against the Ouroboros Praos consensus protocol.
@@ -114,18 +114,24 @@ We will ship three separate CLI tools:
114114
The purpose of `testgen` is to generate test cases; it accepts arguments to
115115
select a specific class of tests, and potentially some test-specific tuning knobs
116116
(to eg, change the "difficulty" of the test.) Each class of tests will have an
117-
associated `Gen`erator, which `testgen` will invoke to instantiate the test
118-
class. Its output will be a test file containing a point schedule and a
117+
associated Generator, which `testgen` will invoke to instantiate the test
118+
class. Its output will be a *test file* containing a point schedule and a
119119
(mechanical) description of the property which needs to pass.
120120

121-
The `runner` tool accepts a test file (as output by `testgen`) and
122-
a *shrink index* (see [shrinking](#shrinking)), and spins up simulated peers
123-
corresponding to the embedded point schedule. The `runner` tool will then
124-
output a [topology
121+
The `runner` tool takes a test file (as output by `testgen`) as mandatory
122+
argument, and an optional *shrink index* (see [shrinking](#shrinking)), to spin
123+
up simulated peers corresponding to the embedded point schedule. The `runner`
124+
tool will then output a
125+
[topology
125126
file](https://developers.cardano.org/docs/operate-a-stake-pool/node-operations/topology/)
126-
whose `localRoots` will point to the simulated peers. We will create an
127-
additional `localRoot` peer whose job is to record all messages diffused from
128-
the NUT.
127+
whose `localRoots` point to the simulated peers and an additional
128+
downstream peer whose sole job is to query the NUT state.[^query]
129+
130+
[^query]: Most properties in the referenced test suit depend exclusively on the
131+
final state of the NUT, so such a query could be made once after the point
132+
schedule completes. However, its plausible that querying the NUT state
133+
throughout the test run could allow implementing other properties if such
134+
repeated queries prove to be robust enough.
129135

130136
Alternative nodes which wish to test against `runner` need to parse the generated
131137
topology file and connect to the simulated peers. Once they have all been
@@ -146,29 +152,30 @@ state of the shrink index, we will perform different actions
146152
(see [exit-codes](#exit-codes).)
147153

148154
It is important to note that a single invocation of the composition `runner
149-
. testgen` correponds to a single unit test. Users are encouraged to run this
155+
. testgen` corresponds to *a single unit test*. Users are encouraged to run this
150156
composition in a loop, to gain the usual assurances given by property tests.
151157

152-
If a given test fails, and shrinking is desired, users can rerun `runner` with
153-
the outputted shrink index. The pair `(testfile, shrinkindex)` is the entirety
154-
of the shrink search state. By explicitly threading this state through flags
155-
and stdout, the system is stateless.
158+
If a given test fails, and (further) shrinking is desired, users can rerun
159+
`runner` with the outputted shrink index and the initial test file. The pair
160+
`(testfile, shrinkindex)` is the entirety of the shrink search state.
161+
By explicitly threading this state through flags and stdout, the system is
162+
stateless.
156163

157164
A basic testing workflow would be like follows:
158165

159-
1. The user obtains a point schedule from `testgen`.
160-
2. The user invokes our `runner` binary, passing in the point schedule as an
166+
1. The user obtains a test file with an embedded point schedule from `testgen`.
167+
2. The user invokes the `runner` binary, passing in the test file as an
161168
argument.
162169
3. `runner` starts up the simulated peers, and returns a topology file.
163170
4. The user starts its node with the given topology file.
164171
5. The node connects to our simulated peers.
165172
6. Once all of the peers have been connected to, the point schedule begins
166173
running.
167174
7. After the point schedule has finished, we observe the final state of the node.
168-
8. The server will exit with a return code (see [exit-codes](#exit-codes)) corresponding to
169-
whether or not the node ended in the correct state, producing either a
170-
shrink index for subsequent test run or a test file with a minimal counter
171-
example.
175+
8. The `runner` will exit with a return code (see [exit-codes](#exit-codes))
176+
corresponding to whether or not the node ended in the correct state,
177+
producing either a shrink index for subsequent test run or a test file with
178+
a minimal counterexample.
172179

173180
```mermaid
174181
---
@@ -218,7 +225,8 @@ sequenceDiagram
218225
The `shrinkView` tool accepts a test file and a shrink index, and outputs the
219226
test file corresponding to the given shrink index. This tool is primarily
220227
useful for looking at non-minimal test inputs, eg, when the user doesn't want
221-
to iterate the shrinking all the way down to a minimal example.
228+
to iterate the shrinking all the way down to a minimal counterexample or desires
229+
to keep file records of intermediate shrunk counterexamples.
222230

223231

224232
#### Supported Operations and Flags
@@ -235,17 +243,23 @@ The **test generator** CLI tool supports, at least, the following operations:
235243
- `meta` to access test class metadata, eg the number of `desired-passes`
236244
we expect to run a test for.
237245

238-
The **test runner** CLI tool supports the following optional flags:
246+
The **test runner** CLI tool takes a single test file as mandatory argument and
247+
supports the following optional flags:
239248

240249
- `--shrink-index` to specify the shrunk point schedule to run, according to the
241250
given index. The index values are output by the `runner` itself.
242251
- `--topology-file` specifies the output path for the topology file.
243-
- `--minimal-test-output` specifies the file path to write the current point
244-
schedule if no further shrinking is possible.
252+
- `--minimal-test-output` specifies a file path to write the last counterexample
253+
test file to if no further shrinking is possible. This is the only case
254+
`runner` outputs a test file. Non-minimal counterexample test files can be
255+
produced by using `shrinkview`.
256+
257+
The **shrink viewer** tool has two mandatory arguments (a test file and a shrink
258+
index), a single mode of operation, and no flags.
245259

246260
These operations provide the primitives needed to orchestrate a QuickCheck-like
247261
workflow. For example, users are free to run the entire test suite by looping
248-
over `testgen list-classes`.
262+
over the output of `testgen list-classes`.
249263

250264

251265
#### Exit Codes
@@ -272,7 +286,7 @@ standards.
272286

273287
If the property succeeded, but the shrink index was non-`empty`, we will exit
274288
with code `CONTINUE_SHRINKING`. In addition, we will output the result of `succ
275-
shrinkIndex` on stdout. While this is technically a test pass, it is a pass for
289+
shrinkIndex` (the successive shrink branch) on stdout. While this is technically a test pass, it is a pass for
276290
a shrunk input. Thus this is merely a "local" success, rather than a "global"
277291
success.
278292

@@ -283,9 +297,9 @@ on stdout. This corresponds to a non-minimal test failure.
283297
If the property failed and we cannot `extend` the shrink index, we will exit
284298
with code `TEST_FAILED`, and produce the minimal test case on stdout.
285299

286-
When the `CONTINUE_SHRINKING` bit is part of the exit code, the user is
287-
encouraged to restart the `runner` with the new shrink index, in order to
288-
manually "pump" the shrinker.
300+
When the `CONTINUE_SHRINKING` bit is set in the exit code, the user can
301+
rerun the `runner` with the new shrink index, in order to continuing searching
302+
for a smaller counter examples.
289303

290304

291305
## Alternatives
@@ -302,8 +316,9 @@ preclude the possibility of implementing this.
302316
## Unresolved Questions
303317

304318
* Do we need a separate peer to act as our state observer? Maybe not, but it's
305-
conceptually clearer to have a peer whose sole job is to collect data. As
306-
a counterexample, what happens when the peer schedule is empty?
319+
conceptually clearer to have a peer whose sole job is to collect data.
320+
For example, in the case of an empty peer schedule it is clear that the
321+
downstream peer would still get the correct state.
307322

308323
## Implementation Plan
309324

@@ -337,9 +352,9 @@ allTheTests :: TestSuite ConsensusTest
337352
runConsensusTest :: ConsensusTest -> Property
338353
```
339354

340-
The change to `cardano-node` would be minimal, and it essentially boils
355+
The change to `cardano-node`'s test suite would be minimal, and it essentially boils
341356
down to implementing `runConsensusTest` using `forAllGenesisTest`, which should
342-
have no local effect on the implementation. Along this lines,
357+
have no local effect on the implementation. Along these lines,
343358
`toTasty :: TestSuite ConsensusTest -> TestTree` would essentially traverse the
344359
`TestSuite` using `runConsensusTest`.
345360

@@ -377,11 +392,11 @@ consistently tested during development.
377392
In this milestone, we will deliver a minimal implementation of `runner`,
378393
supporting:
379394

380-
- (mindless, derived) parsing of point schedule test files
395+
- (bare-bones) parsing of point schedule test files
381396
- generating topology files
382397
- running point schedules
383398
- observing the state of the NUT
384-
- returning an appropriate error code of among `SUCCESS`, `INTERNAL_ERROR`,
399+
- returning an appropriate error code among `SUCCESS`, `INTERNAL_ERROR`,
385400
`BAD_USAGE` or `TEST_FAILED`
386401

387402
We explicitly *will not* implement anything around generators or shrinking in
@@ -397,7 +412,7 @@ does what it ought.
397412
Provide a useful failure feedback, so that users can leverage the test
398413
to find specific bugs on their consensus protocol implementation.
399414

400-
Deliver a shrinking strategy to run our executable from [1] on subsequently
415+
Deliver a shrinking strategy to run our executable from Milestone 1 on subsequently
401416
smaller tests.
402417

403418

@@ -431,11 +446,6 @@ At this point, the MVP is working as expected on the `cardano-node`; our goal
431446
now is to ensure this works for other implementations, and that our design is
432447
decoupled from any internal details of `cardano-node`.
433448

434-
As a stretch goal: it would be very rewarding to actually find a test failure in
435-
another implementation. Care must be taken to ensure that this is, however, a
436-
failure in the node, and not some artifact of our testing procedure.
437-
438-
439449
#### Deliverables
440450

441451
In this milestone, our primary deliverable will be to successfully test a point
@@ -456,12 +466,14 @@ on alternative nodes to actually use any of this testing machinery.
456466
#### Questions to Answer
457467

458468
Do we need to simulate time? This might be related to configuration access to
459-
node timeouts (as Network delays would be irrelevant in this setting).
469+
node timeouts (as network latency would be irrelevant in this setting).
460470

471+
In this milestone, we will consult with Amaru architects on what concerns they
472+
have on satisfying the proposed interface.
461473

462474
#### Risks
463475

464-
With the current lineup, our team has minimal knowledge of Rust. Depending of
476+
With current staffing, our team has minimal knowledge of Rust. Depending of
465477
the difficulty of making the necessary changes, we might need to pull in
466478
external help.
467479

@@ -475,7 +487,7 @@ the approach official. We will refactor the existing test suite into a reified
475487
[`TestSuite`](#testsuite-anchor), from which we can extract both the existing
476488
`tasty` test suite, as well as the data for `testgen`.
477489

478-
This step will require patching `ouroboros-consensus`, which is why we want to
490+
This step will require patching `ouroboros-consensus`'s test suite, which is why we want to
479491
have proven the technology before making upstream changes.
480492

481493

@@ -486,7 +498,7 @@ In this milestone, we will deliver:
486498
1. a design and specification of the serialization format for our test files
487499
2. corresponding changes to `runner` and `shrinkview` for parsing and
488500
serializing these files
489-
3. we will port all of the existing ouroboros-consensus tests into a reified
501+
3. we will port all of the existing `ouroboros-consensus` tests into a reified
490502
[`TestSuite`](#testsuite-anchor) representation.
491503

492504
In addition, we will deliver the `testgen` utility, including:

0 commit comments

Comments
 (0)