@@ -9,7 +9,7 @@ prefix of the current chain. Up to now, this has been achieved in the
99following ways:
1010
1111- Every node in the network is running (close to) the same code.
12- - Consensus testing (living inside [ ` io-sim ` ] ( https://hackage.haskell.org/package/io-sim ) ,
12+ - Consensus testing (living inside [ ` io-sim ` ] ( https://hackage.haskell.org/package/io-sim ) ,
1313 a Haskell IO simulator) validates that all (honest) nodes will eventually
1414 reach consensus.
1515
@@ -30,8 +30,8 @@ stack correctly.
3030
3131Correctness here is an * extremely important* property---much more so than in
3232most software projects. Nodes failing to agree on the correct chain risks an
33- accidental hard fork. Should one persist long enough it might potentially be
34- unrecoverable .
33+ accidental hard fork. Should one persist long enough the protocol might be
34+ unable to recover without external/etc intervention .
3535
3636This document gives a design for a suite of tools, and the necessary
3737infrastructure changes, to expose these existing tests in a form that
@@ -70,11 +70,11 @@ colluding) peers. After evaluation of the point schedule, the Node Under Test
7070 that the peer should send to the NUT. See the relevant documentation
7171 [ here] ( https://github.com/IntersectMBO/ouroboros-consensus/blob/374ef153e20d83ad3d42d850ce560b67034ac578/ouroboros-consensus-diffusion/test/consensus-test/Test/Consensus/PointSchedule/SinglePeer.hs ) .
7272
73- Whilst the point schedule currently is implemented inside the Haskell node, its
73+ Whilst the point schedule currently is implemented inside the Haskell node's test suite , its
7474declarative nature makes it possible to export this testing method and make it
7575usable across diverse node implementations. To ensure this, we will look only
7676at the messages sent over the network, to ensure we are performing black-box
77- testing. It will also be possible for alternate nodes to use peer simulation
77+ testing. It will also be possible for alternative nodes to use peer simulation
7878for white-box testing in cases that depend on internal tracing (eg. file
7979handles, memory usage, etc.). This suite of tools aims only at properties
8080related to test conformance against the Ouroboros Praos consensus protocol.
@@ -114,18 +114,24 @@ We will ship three separate CLI tools:
114114The purpose of ` testgen ` is to generate test cases; it accepts arguments to
115115select a specific class of tests, and potentially some test-specific tuning knobs
116116(to eg, change the "difficulty" of the test.) Each class of tests will have an
117- associated ` Gen ` erator , which ` testgen ` will invoke to instantiate the test
118- class. Its output will be a test file containing a point schedule and a
117+ associated Generator , which ` testgen ` will invoke to instantiate the test
118+ class. Its output will be a * test file* containing a point schedule and a
119119(mechanical) description of the property which needs to pass.
120120
121- The ` runner ` tool accepts a test file (as output by ` testgen ` ) and
122- a * shrink index* (see [ shrinking] ( #shrinking ) ), and spins up simulated peers
123- corresponding to the embedded point schedule. The ` runner ` tool will then
124- output a [ topology
121+ The ` runner ` tool takes a test file (as output by ` testgen ` ) as mandatory
122+ argument, and an optional * shrink index* (see [ shrinking] ( #shrinking ) ), to spin
123+ up simulated peers corresponding to the embedded point schedule. The ` runner `
124+ tool will then output a
125+ [ topology
125126file] ( https://developers.cardano.org/docs/operate-a-stake-pool/node-operations/topology/ )
126- whose ` localRoots ` will point to the simulated peers. We will create an
127- additional ` localRoot ` peer whose job is to record all messages diffused from
128- the NUT.
127+ whose ` localRoots ` point to the simulated peers and an additional
128+ downstream peer whose sole job is to query the NUT state.[ ^ query ]
129+
130+ [ ^ query ] : Most properties in the referenced test suit depend exclusively on the
131+ final state of the NUT, so such a query could be made once after the point
132+ schedule completes. However, its plausible that querying the NUT state
133+ throughout the test run could allow implementing other properties if such
134+ repeated queries prove to be robust enough.
129135
130136Alternative nodes which wish to test against ` runner ` need to parse the generated
131137topology file and connect to the simulated peers. Once they have all been
@@ -146,29 +152,30 @@ state of the shrink index, we will perform different actions
146152(see [ exit-codes] ( #exit-codes ) .)
147153
148154It is important to note that a single invocation of the composition `runner
149- . testgen` correponds to a single unit test. Users are encouraged to run this
155+ . testgen` corresponds to * a single unit test* . Users are encouraged to run this
150156composition in a loop, to gain the usual assurances given by property tests.
151157
152- If a given test fails, and shrinking is desired, users can rerun ` runner ` with
153- the outputted shrink index. The pair ` (testfile, shrinkindex) ` is the entirety
154- of the shrink search state. By explicitly threading this state through flags
155- and stdout, the system is stateless.
158+ If a given test fails, and (further) shrinking is desired, users can rerun
159+ ` runner ` with the outputted shrink index and the initial test file. The pair
160+ ` (testfile, shrinkindex) ` is the entirety of the shrink search state.
161+ By explicitly threading this state through flags and stdout, the system is
162+ stateless.
156163
157164A basic testing workflow would be like follows:
158165
159- 1 . The user obtains a point schedule from ` testgen ` .
160- 2 . The user invokes our ` runner ` binary, passing in the point schedule as an
166+ 1 . The user obtains a test file with an embedded point schedule from ` testgen ` .
167+ 2 . The user invokes the ` runner ` binary, passing in the test file as an
161168 argument.
1621693 . ` runner ` starts up the simulated peers, and returns a topology file.
1631704 . The user starts its node with the given topology file.
1641715 . The node connects to our simulated peers.
1651726 . Once all of the peers have been connected to, the point schedule begins
166173 running.
1671747 . After the point schedule has finished, we observe the final state of the node.
168- 8 . The server will exit with a return code (see [ exit-codes] ( #exit-codes ) ) corresponding to
169- whether or not the node ended in the correct state, producing either a
170- shrink index for subsequent test run or a test file with a minimal counter
171- example .
175+ 8 . The ` runner ` will exit with a return code (see [ exit-codes] ( #exit-codes ) )
176+ corresponding to whether or not the node ended in the correct state,
177+ producing either a shrink index for subsequent test run or a test file with
178+ a minimal counterexample .
172179
173180``` mermaid
174181---
@@ -218,7 +225,8 @@ sequenceDiagram
218225The ` shrinkView ` tool accepts a test file and a shrink index, and outputs the
219226test file corresponding to the given shrink index. This tool is primarily
220227useful for looking at non-minimal test inputs, eg, when the user doesn't want
221- to iterate the shrinking all the way down to a minimal example.
228+ to iterate the shrinking all the way down to a minimal counterexample or desires
229+ to keep file records of intermediate shrunk counterexamples.
222230
223231
224232#### Supported Operations and Flags
@@ -235,17 +243,23 @@ The **test generator** CLI tool supports, at least, the following operations:
235243- ` meta ` to access test class metadata, eg the number of ` desired-passes `
236244 we expect to run a test for.
237245
238- The ** test runner** CLI tool supports the following optional flags:
246+ The ** test runner** CLI tool takes a single test file as mandatory argument and
247+ supports the following optional flags:
239248
240249- ` --shrink-index ` to specify the shrunk point schedule to run, according to the
241250 given index. The index values are output by the ` runner ` itself.
242251- ` --topology-file ` specifies the output path for the topology file.
243- - ` --minimal-test-output ` specifies the file path to write the current point
244- schedule if no further shrinking is possible.
252+ - ` --minimal-test-output ` specifies a file path to write the last counterexample
253+ test file to if no further shrinking is possible. This is the only case
254+ ` runner ` outputs a test file. Non-minimal counterexample test files can be
255+ produced by using ` shrinkview ` .
256+
257+ The ** shrink viewer** tool has two mandatory arguments (a test file and a shrink
258+ index), a single mode of operation, and no flags.
245259
246260These operations provide the primitives needed to orchestrate a QuickCheck-like
247261workflow. For example, users are free to run the entire test suite by looping
248- over ` testgen list-classes ` .
262+ over the output of ` testgen list-classes ` .
249263
250264
251265#### Exit Codes
@@ -272,7 +286,7 @@ standards.
272286
273287If the property succeeded, but the shrink index was non-` empty ` , we will exit
274288with code ` CONTINUE_SHRINKING ` . In addition, we will output the result of `succ
275- shrinkIndex` on stdout. While this is technically a test pass, it is a pass for
289+ shrinkIndex` (the successive shrink branch) on stdout. While this is technically a test pass, it is a pass for
276290a shrunk input. Thus this is merely a "local" success, rather than a "global"
277291success.
278292
@@ -283,9 +297,9 @@ on stdout. This corresponds to a non-minimal test failure.
283297If the property failed and we cannot ` extend ` the shrink index, we will exit
284298with code ` TEST_FAILED ` , and produce the minimal test case on stdout.
285299
286- When the ` CONTINUE_SHRINKING ` bit is part of the exit code, the user is
287- encouraged to restart the ` runner ` with the new shrink index, in order to
288- manually "pump" the shrinker .
300+ When the ` CONTINUE_SHRINKING ` bit is set in the exit code, the user can
301+ rerun the ` runner ` with the new shrink index, in order to continuing searching
302+ for a smaller counter examples .
289303
290304
291305## Alternatives
@@ -302,8 +316,9 @@ preclude the possibility of implementing this.
302316## Unresolved Questions
303317
304318* Do we need a separate peer to act as our state observer? Maybe not, but it's
305- conceptually clearer to have a peer whose sole job is to collect data. As
306- a counterexample, what happens when the peer schedule is empty?
319+ conceptually clearer to have a peer whose sole job is to collect data.
320+ For example, in the case of an empty peer schedule it is clear that the
321+ downstream peer would still get the correct state.
307322
308323## Implementation Plan
309324
@@ -337,9 +352,9 @@ allTheTests :: TestSuite ConsensusTest
337352runConsensusTest :: ConsensusTest -> Property
338353```
339354
340- The change to `cardano- node` would be minimal, and it essentially boils
355+ The change to `cardano- node`'s test suite would be minimal, and it essentially boils
341356down to implementing `runConsensusTest` using `forAllGenesisTest` , which should
342- have no local effect on the implementation. Along this lines ,
357+ have no local effect on the implementation. Along these lines ,
343358`toTasty :: TestSuite ConsensusTest -> TestTree ` would essentially traverse the
344359`TestSuite ` using `runConsensusTest` .
345360
@@ -377,11 +392,11 @@ consistently tested during development.
377392In this milestone, we will deliver a minimal implementation of `runner` ,
378393supporting:
379394
380- - (mindless, derived ) parsing of point schedule test files
395+ - (bare - bones ) parsing of point schedule test files
381396- generating topology files
382397- running point schedules
383398- observing the state of the NUT
384- - returning an appropriate error code of among `SUCCESS `, `INTERNAL_ERROR `,
399+ - returning an appropriate error code among `SUCCESS `, `INTERNAL_ERROR `,
385400 `BAD_USAGE ` or `TEST_FAILED `
386401
387402We explicitly * will not * implement anything around generators or shrinking in
@@ -397,7 +412,7 @@ does what it ought.
397412Provide a useful failure feedback, so that users can leverage the test
398413to find specific bugs on their consensus protocol implementation.
399414
400- Deliver a shrinking strategy to run our executable from [ 1 ] on subsequently
415+ Deliver a shrinking strategy to run our executable from Milestone 1 on subsequently
401416smaller tests.
402417
403418
@@ -431,11 +446,6 @@ At this point, the MVP is working as expected on the `cardano-node`; our goal
431446now is to ensure this works for other implementations, and that our design is
432447decoupled from any internal details of `cardano- node`.
433448
434- As a stretch goal: it would be very rewarding to actually find a test failure in
435- another implementation. Care must be taken to ensure that this is, however, a
436- failure in the node, and not some artifact of our testing procedure.
437-
438-
439449#### Deliverables
440450
441451In this milestone, our primary deliverable will be to successfully test a point
@@ -456,12 +466,14 @@ on alternative nodes to actually use any of this testing machinery.
456466#### Questions to Answer
457467
458468Do we need to simulate time? This might be related to configuration access to
459- node timeouts (as Network delays would be irrelevant in this setting).
469+ node timeouts (as network latency would be irrelevant in this setting).
460470
471+ In this milestone, we will consult with Amaru architects on what concerns they
472+ have on satisfying the proposed interface.
461473
462474#### Risks
463475
464- With the current lineup , our team has minimal knowledge of Rust . Depending of
476+ With current staffing , our team has minimal knowledge of Rust . Depending of
465477the difficulty of making the necessary changes, we might need to pull in
466478external help.
467479
@@ -475,7 +487,7 @@ the approach official. We will refactor the existing test suite into a reified
475487[`TestSuite `](# testsuite- anchor), from which we can extract both the existing
476488`tasty` test suite, as well as the data for `testgen` .
477489
478- This step will require patching `ouroboros- consensus`, which is why we want to
490+ This step will require patching `ouroboros- consensus`'s test suite , which is why we want to
479491have proven the technology before making upstream changes.
480492
481493
@@ -486,7 +498,7 @@ In this milestone, we will deliver:
4864981 . a design and specification of the serialization format for our test files
4874992 . corresponding changes to `runner` and `shrinkview` for parsing and
488500 serializing these files
489- 3 . we will port all of the existing ouroboros- consensus tests into a reified
501+ 3 . we will port all of the existing ` ouroboros- consensus` tests into a reified
490502 [`TestSuite `](# testsuite- anchor) representation.
491503
492504In addition, we will deliver the `testgen` utility, including:
0 commit comments