Software Correctness and Verification

Reasoning about programs we write helps to determine whether they are correct with respect to their expected behaviour. We are used to reason about programs when we write them and associated tests. This reasoning is operational often following single executions of a program in a way comparable to how we trace values. The method outlined in the preceding two sections describes also a symbolic approach where facts are traced through programs. This approach does not consider single executions but all executions of a program at once. We discuss software correctness from the two perspectives of the operational and the symbolic approach. To show that a program is correct

from the operational perspective we use testing;
from the symbolic perspective we use proof.

The two perspectives –and with them testing and proof– are tightly related and we make ample use of this relationship.

Testing a Simple Fragment (Version 1)

Knowing about the relationship between values and facts we can formulate a simple testing method for program fragments. Of course, we need to refine this later on to deal with proper Slang programs, but the principle remains the same. The fragments we have looked at above have the following general shape, consisting of three parts

initialise variables
carry out computation
check condition

The initialise variables part sets up the input values for the fragment. Usually, the input values are chosen taking into account conditions concerning input values, e.g., to avoid division by zero. The carry out computation part contains the “program”. The check condition part specifies a condition to determine whether the program is correct. Correctness is discussed into the following section.

Abstracting from Values and Computations

As we have developed a concept of reasoning about program fragments, we have gradually moved away from considering concrete values and computations, that is, from what a computer does. And this is even useful for testing which is all about what a computer does!

In order to complete the abstraction from values and computations we are still lacking a means to describe assumptions about values at the beginning of a fragment. The corresponding construct is called assume. It specifies a fact that can be assumed to be true at the locations where it is written. We can imagine assume to have the following effect during program execution: if the condition assumed is true, assume does nothing; if the condition is false, assume terminates the program “gently”, that is, it is not considered an error. Contrast this to the assert statement, that aborts with an error when its condition is false. Now, assume is not very useful for programming because we are interested in error conditions that may be encountered during program execution, but assume is very useful for reasoning about program fragments.

Consider the following fragment involving a devision.

val x: Z = randomInt()
assume(x >= 5)
val y: Z = 5 / x
assert(y <= 1)

As long as we know that the fact x >= 5 holds at the beginning we can be certain that at the end y <= 1 holds. The use of assume will turn out to be a powerful tool for drafting program fragments and also for determining test cases. In fact, in conjunction with assert it is so useful, that a pair consisting of an assume statement followed by an assert statement is given its own terminology. We call such a pair a contract. We will see contracts in different shapes. Special syntax is used for different purposes, for instance, when specifying contracts for functions. Note that if a program fragment with an assume statement was “called” and we would like that it does not terminate gently, we would have to show that the condition of the assume statement is met. We will come back to this point later.

Testing a Simple Fragment (Version 2)

Instead of giving an initialistion as in Version 1, we can also use an assert statement to impose an initial condition for a test. Using the concept of a contract, we can specify,

assume initial condition on variables
specify computation
assert final condition on variables

The fragment terminates gently if the initial condition is not met and aborts with an error if the initial condition was met but the final condition is is not. This way of specifying tests turns out to be a foundation for deriving test cases according to Version 1. Reasoning about contracts we can systematically develop test cases.

Program Correctness

Following the preceding discussion we base our notion of correctness on contracts. We consider two variants in which contracts can be specified:

Pairs of assume-assert-statements in program fragments and tests. These contracts are executable and can be evaluated at run-time.
Pairs of requires-ensures-clauses used for specifying functions. These contracts are not executable. They are exclusively used for proving that functions satisfy their contracts. By contrast, assume and assert can be used for proving properties and for their runtime checking. Properties specified using requires and ensures are more expressive, permitting statements over infinite value ranges, for instance, which cannot be avaluated at run-time. We introduce function contracts in the next chapter.

We call the first component of a contract, its assume-statement or requires-clause, a pre-condition. We call its second component, its assert-statement or ensures-clause, a post-condition.

We say a program (or function) is correct if it satisfies its contract: Starting in a state satisfying the pre-condition, it terminates in a state satisfying its post-condition.

Program Verification

To demonstrate that a program is correct we verify it. We consider two princples methods for verifying programs.

Proof

Using logical deduction we show that any execution of a program starting in a state satisfying the pre-condition, it terminates in a state satisfying its post-condition. In other words, we show that the program is correct.

Testing

Executing a program for specific states satisfying the pre-condition, we check whether on termination a state is reached that satisfies the post-condition. It is up to us to determine suitable pairs of states, called test cases. This approach does not show that a program is correct. In practice, we conjecture that programs that have been subjected to a sufficient number of tests is correct. This is kind of reasoning is called induction: from a collection of tests that confirm correctness for precisely those tests, we infer that this is the case for possible tests. Testing is a fallibe verification method: it is entirely possible that all tests that we have provided appear to confirm correctness, but later we find a test case that refutes the conjecture. Either the program contains an error or the test case is wrong.

Methodology

We treat proof and testing as complementary verification approaches for software development. There are numerous points where the two approaches interact:

The proof approach appears powerful at first. But it can be very costly to apply. It requires a high effort, and it does this already before we know a program to be correct. However, if a program contains errors, testing might reveal them early.
Testing can be used to provide counterexamples: If a proof of a program does not succeed and we have a hypothesis about the cause, we can formulate a test case to attempt to confirm or refute the hypothesis.
Comparable to scientific study, we can use deductive reasoning to formulate a theory about a program (to be developed) and combine it with experimentation –in our terminology testing– to confirm the theory. In this reading, testing plays an essential role in understanding the programs we write.
Some properties, when proved, can have an impact on the performance of programs and allow testing to focus on properties related to requirements. A typical example of this kind is type checking. Type checking a program consitutes a proof that a values occuring in program are only used with operators for which they are defined. So, this does not need to be tested or checked at run-time. Other examples are null-pointers or array indices: If it can be proved that no null-pointer is dereferenced or arrays are only indexed within valid ranges, this does not be tested or checked at run-time either. The difference between these two is that null-pointers can be dealt with by typing, whereas for for array indexing this is not possible. For the latter we have to use such deductive techniques as those that we explore.
The development of larger software components may require a large number of test cases to be developed alongside. Many of those test cases are are not challenging but require effort for writing and maintaining. If contracts are avaible for corresponding functions, test automation is possible. Using the contracts test cases can be extracted and integrated in unit tests automatically. As a consequence, testing effort can focus on challenging properties of software, typically those more closely related to requirements, where contracts might be difficult to obtain because the requirements are more anekdotal, scenario-like and open-ended.