From d26359dc49e5b5b12d5f3dfd8140395faef4fc98 Mon Sep 17 00:00:00 2001 From: Michael Tautschnig Date: Tue, 18 Feb 2025 13:54:47 +0000 Subject: [PATCH] Document demonic non-determinism This PR adds developer-facing documentation to clarify what we mean by demonic non-determinism, and why we use it. --- docs/src/SUMMARY.md | 3 +- docs/src/demonic-nondet.md | 80 +++++++++++++++++++++++++++++++++++ docs/src/dev-documentation.md | 7 ++- 3 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 docs/src/demonic-nondet.md diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index 8cd6a1ebb7b0..19fff651d067 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -30,11 +30,12 @@ - [Developer documentation](dev-documentation.md) - [Coding conventions](./conventions.md) + - [Command cheat sheets](./cheat-sheets.md) - [Working with CBMC](./cbmc-hacks.md) - [Working with `rustc`](./rustc-hacks.md) - [Migrating to StableMIR](./stable-mir.md) - - [Command cheat sheets](./cheat-sheets.md) - [cargo kani assess](./dev-assess.md) + - [Demonic non-determinism](./demonic-nondet.md) - [Testing](./testing.md) - [Regression testing](./regression-testing.md) - [(Experimental) Testing with a Large Number of Repositories](./repo-crawl.md) diff --git a/docs/src/demonic-nondet.md b/docs/src/demonic-nondet.md new file mode 100644 index 000000000000..c856b5ecffe8 --- /dev/null +++ b/docs/src/demonic-nondet.md @@ -0,0 +1,80 @@ +# Demonic non-determinism in CBMC and Kani + +CBMC and Kani use demonic non-determinism to more efficiently prove certain +classes of program properties. +This document describes what we mean by "demonic non-determinism," how and why +CBMC and Kani use it, and what challenges arise as a consequence thereof. + +## What is demonic non-determinism? + +When we seek to prove that a property holds for all possible values of some +object then a counterexample demonstrates the existence of at least one such +value that violates the property. +Other properties range across all objects or expressions of a particular kind, +e.g., we want to prove that for all dereference expressions the underlying +pointer is valid. +For a counterexample to this property it suffices to find one pointer that +cannot be safely dereferenced. +We can use this fact to non-deterministically choose to only track a single +pointer across an execution via expressions of the form +`global_pointer = if nondet { new_pointer_to_track } else { global_pointer };`. +We can now use `global_pointer` in describing properties such that the property +fails if `new_pointer_to_track` is invalid and `global_pointer` equals +`new_pointer_to_track`. +Note that this is an "if" statement, not "if, and only if:" there may also be +some other invalid pointer `other_pointer` that is invalid at the time of +dereference, and the back-end solver may choose to resolve the non-determinism +such that the property fails because `global_pointer` equals `other_pointer`. +We call such use of non-determinism _demonic non-determinism_, because the +SAT or SMT-solver will resolve the non-deterministic choice such that a +counterexample is reported _if_ there exists a model such that the overall +property fails. + +## Why use demonic non-determinism? + +For the above example, one way to avoid the non-determinism is to use an array +or map to track all pointers rather than non-deterministically choosing to track +a single one. +The encoding thereof, however, may be much more expensive. See +[cbmc#6506](https://github.com/diffblue/cbmc/pull/6506) for the performance +penalty for such a change from non-determinism to arrays. + +## What are disadvantages of demonic non-determinism? + +The consequence of this choice of encoding is that any given model provided by +the back-end SAT/SMT solver can only encode a violation of one property of the +kind that the demonic non-determinism is used for, even if multiple properties +can be violated along a single execution path. +When we now look at assumptions the problem is that these cannot accomplish what +we’d perhaps expect to get out of them: the `assume` statement with a condition of +"single global variable does not have the same value as the pointer to be +dereferenced" does not guarantee that the subsequent dereference is safe with +regard to objects having gone out of scope: the solver can choose to pick +non-deterministic values such that the address of the object having gone out of +scope has never been assigned. + +As such, demonic non-determinism poses a problem in context of modular +verification, where we need to assume that the given preconditions hold. +Such assumed preconditions would not be preceded by an assertion, and we would +effectively assume that none of the pointers involved in such preconditions were +being tracked by the solver. + +## Where do CBMC and Kani use demonic non-determinism? + +One way we use this today is tracking objects that go out of scope. +We use a single, global variable (initialized to the null pointer) to store the +address of objects going out of scope. +Upon a pointer dereference we then check whether the pointer being dereferenced +has the same (address) value as the value held in that single global variable. +If that is true then we know a dead object (one that has gone out of scope) is +being accessed, amounting to undefined behavior, and we fail that property check. + +This setup can obviously only track a single object (the one that has gone out +of scope most recently). +By adding non-determinism when assigning the address to that single global +variable (`global-ptr = nondet ? global-ptr : &object-leaving-scope;`) we leave +the choice of which object to track to the solver. +This is sound even in presence of multiple objects (and, possibly, multiple +invalid pointers), because the solver will +return at least one counterexample if there is any undefined behavior, and can +indeed be forced to return all such counterexamples. diff --git a/docs/src/dev-documentation.md b/docs/src/dev-documentation.md index c3fe857cf0b6..ef806e4d59b1 100644 --- a/docs/src/dev-documentation.md +++ b/docs/src/dev-documentation.md @@ -13,8 +13,11 @@ developers (including external contributors): 2. [Useful command-line instructions for Kani/CBMC/Git](./cheat-sheets.md). 3. [Development setup recommendations for working with `cbmc`](./cbmc-hacks.md). 4. [Development setup recommendations for working with `rustc`](./rustc-hacks.md). - 5. [Guide for testing in Kani](./testing.md). - 6. [Transition to StableMIR](./stable-mir.md). + 5. [Transition to StableMIR](./stable-mir.md). + 6. [Using cargo kani assess](./dev-assess.md). + 7. [Demonic non-determinism](./demonic-nondet.md) + 8. [Guide for testing in Kani](./testing.md). + 9. [Performance comparison for Kani](./performance-comparisons.md). > **NOTE**: The developer documentation is intended for Kani developers and not users. At present, the project is under heavy development and some items