Key Moments

Assertion-based Repair of Complex Data Structures

Google TalksGoogle Talks
Education5 min read34 min video
Aug 22, 2012|114 views|2
Save to Pod

Want to know something specific about what's covered?

We've already dissected every moment. Ask and we will deliver (with timestamps).

TL;DR

New technique repairs corrupted data structures by using assertions not just to detect but also to fix bugs, allowing programs to continue running instead of crashing.

Key Insights

1

Instead of terminating a program upon an assertion violation, the proposed method mutates the data structure to a valid state, allowing execution to continue.

2

Complex data structure integrity constraints can be grouped into structural characteristics (e.g., acyclicity) and data characteristics (e.g., ordered values).

3

Symbolic execution is used to repair data corruption by identifying conditions under which an assertion violation would not occur, while systematic search repairs structural issues.

4

The repair process involves backtracking to the last accessed field and trying non-deterministic mutations (e.g., setting to null, trying new nodes) until the assertion passes.

5

Experiments show the repair approach can handle large data structures with hundreds of thousands of nodes and multiple faults within seconds.

6

The core idea is to leverage existing assertion descriptions (specifications) for repair, reducing the need for separate, manually written repair routines.

Rethinking assertion violations from detection to repair

Traditionally, when an assertion in a program fails, signaling a data structure corruption, the standard procedure is to terminate the program, debug it, and then re-execute. This approach, however, is often insufficient, especially for persistent data like file systems. If the underlying data is corrupted, the same assertion violation will occur repeatedly, even after debugging. This presentation introduces a novel approach where, instead of halting execution, a violated assertion is used as a basis for repairing the data structure. The program's state is modified to satisfy the assertion, enabling continued execution and reporting the initial violation for later analysis.

Defining and identifying data structure integrity

Complex data structures, such as red-black trees, intentional naming networks, and software caches, are fundamental to many programs. These structures possess specific constraints that, when violated, can lead to program instability or crashes. These constraints can be broadly categorized into two types: structural integrity constraints, which define the form of the structure itself (e.g., a linked list must be acyclic), and data integrity constraints, which pertain to the values stored within the structure (e.g., elements in a red-black tree must be ordered). Assertions, often described using declarative languages like Alloy or Z, or imperative languages like Java, are used to check these properties.

The repair process: Mutating state to satisfy assertions

The core of the proposed repair mechanism involves taking a program state with a violated assertion and mutating it into a new state (S') where the assertion holds true (S'.repOK). This goal is not necessarily to restore the program's intended state, but rather to reach a state where the program can continue executing without crashing due to the underlying data structure corruption. The repair might not always yield the exact intended structure, as multiple valid repairs might exist for a single corruption. The repair focuses on correcting both faults within the structure's topology and faults in the data itself.

Repairing data corruption with symbolic execution

To address corruption in the data values within a structure, the technique of symbolic execution is employed. Unlike concrete execution which uses specific values, symbolic execution runs the program with symbolic values representing inputs. This allows the system to explore all possible program paths and generate path conditions. If a data corruption causes an assertion to fail, symbolic execution can identify the conditions under which the assertion would pass. The system can then generate corrected data values that satisfy these conditions, enabling the program to continue.

Repairing structural integrity with systematic search

For structural integrity constraints, a systematic search approach is used, inspired by techniques for test generation. The process begins by executing the 'repOK' method (which checks the structure's validity) on the current state. When 'repOK' returns false, indicating a fault, the system identifies the last field accessed before the failure. This field is then non-deterministically mutated. For data fields, this might involve changing the value to a symbolic variable. For reference fields, it could mean setting the reference to null or trying a previously known unvisited node. The 'repOK' method is then re-executed with the mutated structure until it returns true, signifying a repaired state.

Practical demonstrations and performance

Experiments were conducted by injecting faults into various data structures like doubly linked lists and software caches. When a fault triggered an exception in a software cache scenario, the unrepaired system would crash. With the repair routine, the system continued execution, logging the issue for later developer attention. Performance tests on data structures with hundreds of thousands of nodes, handling up to 20 faults, demonstrated repair times within 10 seconds for structures like intentional naming systems and file systems. While performance can degrade quadratically with a very high number of faults, the expectation is that significant structural issues are typically caught early.

Leveraging existing assertions for efficient repair

A key advantage of this approach is its reliance on existing 'repOK' methods or assertions, which often already exist as program specifications. This eliminates the need to write separate, manually crafted repair routines for each potential bug. The effort involved in writing assertions for these benchmarks was minimal, with each taking approximately 15-20 minutes. This contrasts with traditional fault tolerance methods that require explicit inspection and coding for every possible failure scenario.

Applications and future directions of assertion-based repair

Beyond direct bug repair, this assertion-based framework has applications in generating large, valid data structures for testing purposes. By providing a structure description and a 'repOK' method, the system can iteratively repair random structures until a valid one is generated. Another investigated application is automatically translating repair actions into actual code fixes, though this remains complex for intricate bugs. The framework also supports different assertion checking strategies, from infrequent checks to comprehensive checks with every garbage collection, depending on the required reliability.

Common Questions

Assertion-based repair is a technique that uses assertions to check the integrity of data structures. If a violation is detected, instead of crashing, the system attempts to repair the data structure to a valid state, allowing the program to continue execution.

Topics

Mentioned in this video

More from GoogleTalksArchive

View all 48 summaries

Ask anything from this episode.

Save it, chat with it, and connect it to Claude or ChatGPT. Get cited answers from the actual content — and build your own knowledge base of every podcast and video you care about.

Get Started Free