Symbolic Execution Debugger (SED) - Tutorial
The Symbolic Execution Debugger (SED) is a language independent extension of the Eclipse debug platform for symbolic execution. By default SED comes with a symbolic execution engine based on KeY supporting sequential Java annotated with JML specifications. The use of the SED platform using KeY as symbolic execution engine is explained in detail in this tutorial.
The tutorial is structured as follows:
- Example Project
- Symbolic Execution and Basic Tool Usage
- Locate the Origin of a Bug
- Review Code and Specifications Usage
- KeY as Symbolic Execution Engine
Steps to perform are illustrated via screenshots (may differ from the latest release) and are surrounded by paragraphs providing details and background knowledge. For new users we recommend to follow all sections of this tutorial in the given order.
This section describes how to add the Symbolic Execution Debugger (SED) to an existing Eclipse installation. Eclipse Indigo (3.7) and newer are supported. The screenshots are created using Eclipse Mars (4.5).
This sections explains how to create the "SED Examples", an example Java project to try out SED (with KeY as symbolic execution engine). The following sections of this tutorial work with this project. Its content is organized as follows:
- src - The source code used to explain the functionality of the SED
- lib_specs - Stubs for Java API classes and methods referenced by the source code (required for symbolic execution with KeY). This folder is added to KeY's class path in the project properties page "KeY". For more details see section KeY as symbolic execution engine.
- Readme.txt - A file explaining the SED Examples
We use class
Number as a first example to explain
symbolic execution and the basic functionality of SED (using KeY as
symbolic execution engine). It has an instance field
content and method
equals, which compares the current
Number instance with the given one.
Symbolic execution means to execute a program with symbolic values in lieu of concrete values performed by a symbolic execution engine. Consequently, it is usually not possible to decide which execution path to take. Hence, execution splits into multiple branches to cover all feasible execution paths. The result is a so called symbolic execution tree.
The SED allows to start symbolic execution directly at any method or even to execute only selected statements.
View "Symbolic Debug" provides all views relevant for symbolic execution:
- Debug - Allows to switch between debug sessions and to control symbolic execution
- Symbolic Execution Tree - Visualizes the symbolic execution trees of selected debug sessions
- Symbolic Execution Tree (Thumbnail) - Miniature diagram for navigation purpose
- Variables - Shows the symbolic state of the selected node in the symbolic execution tree
- Breakpoints - Allows to manage breakpoints
- Symbolic Execution Settings - Allows to customize the symbolic execution engine
- Properties - Provides additional details about the selected node
Symbolic execution can be done stepwise or resumed until a breakpoint is hit. If no breakpoint is available or hit, symbolic execution continues until a maximal number of steps has been performed on each path. The limit is 1000 by default and can be changed on the preference page "Run/Debug, Symbolic Execution Debugger (SED), KeY".
Different node kinds are used to highlight the meaning of symbolic execution tree nodes. The root of a symbolic execution tree is always a start node representing the program fragment to execute.
The new selected node is a method call node indicating that method
equals will be inline executed next.
The selected branch statement node represents the program state before the first statement of method
equals is executed.
The SED separates between branch, loop and normal statements not supposed to branch.
Inspecting the state of the branch statement node in view "Variables" shows that at this point nothing is known about parameter
Thus it can be
null or a reference to an existing object.
If multiple values are possible the SED shows the condition under which each value would be taken.
this reference is here named
The instance field
n (if not
null) have symbolic values representing any number.
exc variable is used by KeY to separate between normal and exceptional termination.
It depends on the actual state how the
if guard is evaluated. The symbolic state does not restrict any of the possible outcomes.
Consequently, symbolic execution splits into several branches to capture all feasible execution paths.
Here, a first case distinction has to be made whether parameter
null or not.
Branch condition nodes show the condition under which each path is taken.
The right most branch where parameter
null terminates with an uncaught
NullPointerException as shown by the exceptional termination node.
The left branch splits again to distinguish the cases where the
content field of both instance has the same value or different values.
Symbolic execution is always continued on all not terminated child branches of the selected node.
Here, two method return nodes that show the return value of method
equals have been added.
If multiple return values are possible, all of them are shown together with the condition when each is returned.
The left most branch terminates normally without a thrown exception.
Resume continues execution until a breakpoint is hit on any branch or until the maximal number of performed steps per path is reached. Here, the last branch terminates also normally without a thrown exception.
SED supports the set of breakpoints provided by JDT (except class breakpoints) and extends them by the "KeY Watchpoint", which is a line independent state definition. In classical debugging using concrete execution, the user is responsible to start the program under inspection in an initial state, which leads to the state hitting a given breakpoint. Using symbolic execution all execution paths are explored simultaneously, in particular, execution paths which lead to a symbolic state hitting the breakpoint. No input values need to be provided!
Inspecting the symbolic state in view "Variables" shows that instance field
n has the same symbolic value.
This knowledge was discovered during symbolic execution by the branch conditions.
The path condition limiting the initial state in order to take the path is shown in view "Properties".
It is a conjunction over all branch conditions on that path.
Here, the path condition is satisfied, if
n are different objects with the same content.
But it is also true, if
n point to the same object.
Unintended aliasing is a source of bugs. SED helps to find these source of bugs by determining and visualizing all possible memory layouts w.r.t. the path condition.
The SED illustrates a memory layout as symbolic object diagram. Similar to a UML object diagram, a symbolic object diagram shows dependencies between objects, the values of object fields and the local variables of the current state.
Here, local variables
n point to two different objects.
The value of instance field
content is the same for both objects.
In the shown alternative memory layout the variables
n are aliases, i.e., they point to the same object.
Selecting initial state shows how the memory layout must have looked before the execution in order to end up in the current state. Here, the memory layouts for the initial and the current state are identical, because the executed code is side effect free, i.e. no location has been changed.
The second example shows how SED can be used to find the origin of a bug.
Mergesort (modified version of the algorithm presented at javabeginners.de) provides a buggy implementation of the Mergesort algorithm.
For our scenario, assume that we experienced a
StackOverflowError, while running a large application.
It was possible to reproduce the error with a few steps implemented by class
The stack trace shows that method
sortRange calls itself until the stack is full. Leveraging that using SED a debug session can start close to the suspected location of the bug, namely, at method
Symbolic execution starts in an arbitrary state.
NullPointerException is spurious as method parameter as
intArr should not be
Such knowledge can be provided as precondition in the debug configuration to restrict the initial state.
The explicit rendering of different control flow branches in the SE tree constitutes a major advantage over traditional debuggers. Unexpected or missing expected branches are good candidates for possible sources of bugs.
Here, we would actually expect that the branch statement node splits symbolic execution and are surprised that this is not the case.
This is suspicious and deserves closer attention.
if guard shows that the comparison should have been
l < r and the source of the bug is found.
Without the precondition the bug can be observed as well, but a little later.
The third example shows how specifications can be considered during symbolic execution. By default, method calls are treated by inlining possible method implementations and loops are unrolled. This causes in general symbolic execution trees of infinite depth. The SED continues symbolic execution until the maximal number of applied steps per path is reached.
Symbolic execution trees of finite depth can be guaranteed by applying method contracts and loop invariants instead of inlining and unrolling. The use of method contracts allows also to deal with methods for which no implementation is available.
indexOf of class
ArrayUtil returns the first index in the given
array accepted by the
Filter is no implementation available, but its method
accept is specified with JML. Its contract allows to call the method in any state that satisfies the precondition (here: true), and implicitly, in which the invariant (a property limiting states of an object) of the
Filter implementation is satisfied. The contract guarantees then that the method terminates without throwing an exception or changing any location.
The method contract of
indexOf requires that the method is called in a state where the invariant of
filter is fulfilled and ensures that no exception will be thrown. Method
indexOf invokes the
accept method on the instance given as
filter, using contracts KeY needs to ensure that at invocation time the method's precondition and the invariant of the
filter object are satisfied. To ensure this, the invariant for
filter must already be required to hold when invoking method
\invariant_for(filter) must be added as part of the precondition of method
indexOf. Note, in general this has to be done for all object parameters and fields on which a method is called.
while loop a loop invariant is provided, which ensures termination and limits the range of local variable
A loop invariant is a property which has to be preserved by each loop iteration. Thus, if the loop invariant is satisfied before the loop is entered, it is also satisfied after an arbitrary number of loop iterations when execution is continued after the loop.
Notice that the provided loop invariant is not strong enough for verification, but sufficient for debugging.
Debugging when providing a method contract differs from debugging a method or some statements without a contract for two reasons:
First, the precondition of the method contract is used to limit the initial state.
This includes also that JML defaults will be used.
In particular, fields are considered as not
null as long as not explicitly specified as
Second, the postcondition is proven for the termination nodes. If not satisfied, the termination node indicates this by a red crossed icon.
To apply method contracts and loop invariants during symbolic execution, the symbolic execution engine has to be configured before performing a step or resuming execution.
That a loop is treated by applying a loop invariant instead of unrolling it is shown by a loop invariant node. If the loop invariant is initially not valid the node will indicate it. Applying a loop invariant splits symbolic execution into two branches.
The "Body Preserves Invariant" branch represents an arbitrary loop iteration and stops when the method body is completely executed (shown by a Loop Body Termination node).
Here, we can see that the left most branch does not preserve the loop invariant.
The reason is that
i is not increased which is required by the
decreasing clause of the loop invariant.
The bug can be solved by moving the statement
i++ after the
Note that in case of a thrown exception or a jump outside the loop (
return) the loop invariant does not have to hold and execution is continued outside the loop.
The "Use Case" branch continues execution after the loop is exited.
Even if the loop invariant is not preserved, we can see in the method return nodes that the value of
i is returned instead of the found
KeY is a semi-automatic verification tool that proves the correctness of sequential Java programs (without garbage collection, dynamic class loading and floats) annotated with JML specifications. A symbolic execution engine based on KeY is developed as part of the SED supporting the same set of Java and JML features than KeY.
KeY and its symbolic execution engine operates on source code and not on compiled byte code. This requires to provide the source code of all used types or at least stubs of called methods. Method stubs can be specified with JML. KeY itself provides such stubs and specifications only for a very small part of the Java API.
The SED Examples make use of Java API classes like
PrintStream, which are not contained in the stubs provided by KeY.
Thus it is required to provide the necessary method stubs and to place them in the folder "lib_specs".
A folder containing such stubs should not be part of the class path used by Eclipse's JDT.
Instead it has to be added to KeY's class path defined in the project properties on page "KeY".
For the SED examples this is already done automatically by the wizard (see Example Project).
KeY separates between two different kinds of class paths:
- Boot class path - A single path providing source code and specifications of classes. Full method implementations will be considered. If not specified, the minimal fraction of KeY's specifications for Java API classes will be used.
- Class path - Multiple paths providing method stubs and specifications. Only method specifications will be considered and not method implementations.
The following steps explain how to check and modify KeY's class path:
KeY provides some general settings, so called "Taclet Options", which influence symbolic execution and proofs performed by KeY in general. Of special interest for symbolic execution are:
- JavaCard - Enable or disable support of JavaCard
- assertions - Treatment of Java assert statements
- bigint - Support for JML datatype
- intRules - Used integer semantic (mathematical (default) or overflow sensitive)
- runtimeExceptions - How to deal with runtime exceptions
The following steps explain how to change the "Taclet Options":