Non-intrusive code coverage for safety-critical software

November 1, 2011 OpenSystems Media

4Using traditional tools to verify that structural coverage meets safety-critical standards can be complicated since the code being tested is not the same code that will finally execute. A tool that derives precise source-level coverage metrics from execution trace data for a noninstrumented program offers a simpler alternative that supports all levels of safety certification.

Certification standards such as DO-178B for commercial avionics require evidence that the system source code is completely exercised by tests derived from requirements. Traditional tools obtain the coverage data through code instrumentation, but this complicates analysis since the code being tested is not the code that will finally execute.

A host-resident two-part technology offers an efficient and cost-effective alternative solution: a target emulator coupled with a non-intrusive coverage analyzer. The emulator is not an interpreter; instead, it dynamically translates the object code into native host instructions. As a result, test suites typically execute faster than on the actual target hardware. The coverage analyzer derives source coverage data from object branch information retrieved from program execution on the emulator, and performs any additional analysis needed for compliance with the most stringent coverage requirements.

Final verification on the target platform is simplified; it entails rerunning the tests and showing that the results are the same as on the emulator. This approach fully supports all levels of safety certification for DO-178B and for its upcoming revision, DO-178C.

Verification challenges

A major verification activity specified in safety certification standards such as DO-178B[1] is test coverage analysis, which involves demonstrating that each software requirement is met and showing that the requirements-based tests completely cover the source code. Coverage analysis raises several issues:

·    Instrumentation: A common approach is to use a tool that generates a modified (instrumented) version of the application source code, or to compile the application with a special switch to generate instrumented object code. The added code contains calls to appropriate logging functions. However, the instrumented code is not the code that will run on the final system. To use the coverage data, the developer must demonstrate that it also applies to the uninstrumented executable. This is not necessarily a simple task.

·    Target hardware: Although final software/hardware integration testing must be performed on the actual configuration to be deployed, it is both expensive and inconvenient to require the target board during component development. A host-based solution is simpler and more cost-effective.

·    Source versus object coverage: DO-178B requires coverage of the source code, but the coverage data is computed from the executing program. At the highest safety criticality (Level A), special analysis might be needed to demonstrate Modified Condition/Decision Coverage (MC/DC).

The technology described here addresses these issues. It is based on deriving source coverage metrics from execution trace data produced by a host-resident target emulator running an uninstrumented version of the application software.

DO-178B test coverage analysis

DO-178B specifies two types of test coverage analysis [1, §6.4.4]:

·    Requirements-based test coverage analysis: The developer must show traceability from each requirement to the source code that implements the requirement, and to a test suite whose execution provides confidence that the requirement is implemented correctly.

·    Structural coverage analysis: The developer must show that the code structure has been completely exercised by the requirements-based tests. If these tests do not completely cover the source code, then the developer must add further requirement(s), add further test(s), and/or remove code – referred to as “dead code” (DO-178B) or “extraneous code” (DO-178C) – that is not traceable to requirements.

The extent of the needed coverage depends on the software component’s safety criticality level. At Level C, only statement coverage is required; that is, each statement in the program must be executed at least once.

At Level B, decision coverage is required. In DO-178B parlance, a decision is a complete Boolean expression consisting of atomic Boolean terms (conditions) and Boolean operators. For example, the following Boolean expression is a decision with three conditions:

(B1 and then B2) or else B3

This example uses the Ada and then and or else short-circuit forms that evaluate their right operand only when necessary, corresponding respectively to the && and || operators in C. Decision coverage requires each decision in the program to be exercised by tests for both true and false.

At Level A, MC/DC is required:

·    Each condition in the program must be exercised by tests for both true and false.

·    Each decision in the program must be exercised by tests for both true and false.

·    Each condition must be shown to independently affect the decision’s outcome (with that condition varying while all other conditions remain fixed).

MC/DC does not require each decision to be tested with every possible combination of truth-values for its constituent conditions. This would be unrealistic for complex decisions and perhaps impossible when conditions are coupled (when the same input variable appears in multiple conditions).

Figure 1 shows a program fragment that illustrates the differences among the various kinds of structural coverage. MC/DC, which has some subtle characteristics, is discussed comprehensively in a tutorial by Hayhurst et al[2] and a detailed study by Chilenski[3].

Figure 1: A program fragment shows different kinds of DO-178B structural coverage.

Source versus object coverage

A commonly misunderstood requirement in DO-178B concerns the type of coverage (source versus object code) that must be demonstrated at Level A. Section states:

The structural coverage analysis may be performed on the Source Code, unless the software level is A and the compiler generates object code that is not directly traceable to Source Code statements. Then, additional verification should be performed on the object code to establish the correctness of such generated code sequences. A compiler-generated array-bound check in the object code is an example of object code that is not directly traceable to the Source Code.

This requirement, whose wording is being revised in DO-178C, does not say that object coverage must be demonstrated for Level A. Instead, it addresses the issue of source language constructs whose compiled object code contains conditional branches or side effects not apparent from the source code. In such cases the developer must verify the generated code, for example by explaining the effect of each non-traceable object code sequence. But coverage analysis still must relate to the source code structures. Showing object code coverage is not sufficient unless further analysis can demonstrate its equivalence to source code coverage.

Target emulation through virtualization

The concept of simulating a target processor on a host system is not new, but recent advances in virtualization technology have spawned an efficient and portable approach exemplified by the open-source Quick EMUlator (QEMU) tool[4]. QEMU supports full system emulation with guest operating systems and allows simulation of specific embedded devices through machine descriptions. It runs on a host platform and, in a two-stage process, dynamically translates object code into native host instructions, using a caching scheme for efficiency. The tool first translates target code into an intermediate language and then compiles the intermediate representation into host binary instructions.

The dynamic translator operates on sections of uninstrumented target code at a time, interleaving translation (or cache fetch) with execution of the translated instructions. When QEMU starts processing a section of target code, it translates the instructions to host code until it reaches the next branch. The translated target code, known as a translated block, is stored in a cache if not already there, and its corresponding host instructions are executed. QEMU then continues translating where it left off. Because of the caching, blocks of target instructions only need to be decoded once. And in practice, because host processors are typically faster than the embedded target hardware, QEMU’s virtualization approach provides better performance than direct execution on the target.

QEMU is open-source technology that can be extended to provide additional functionality. To handle structural coverage analysis as required by DO-178B, a useful enhancement is support for generating execution traces. Two kinds of trace information are relevant:

·    Summary traces: The output identifies the address ranges of the instructions that were executed, and, for conditional branches, which branch(es) was (were) taken. The output data has bounded size (actually linear with respect to object program size), as it only reveals which instructions/branches were executed and not the entire execution history.

·    Full historical traces for specified address ranges: In addition to indicating the instructions that were executed, the output shows which branch was taken at each evaluation of the relevant conditional expressions. The size of the output data depends on the execution history.

An adapted version of QEMU that produces these execution traces is a key component of coverage analysis technology.

Coverage analysis

Although the execution trace data supplies object instruction coverage and object branch coverage information, further analysis is needed to satisfy DO-178B’s coverage objectives:

·    The traces must be mapped to source code structure, particularly to constructs in the source code (statements, decisions, conditions) that have coverage requirements.

·    The level of coverage achieved – statement, decision, MC/DC – must be assessed.

To enable this analysis, the compiler can preserve the source program’s decision structure in the object control flow graph and generate two kinds of output:

·    Debugging information (DWARF), which relates each object code instruction to a source code location (file, line, column).

·    Source Coverage Obligations (SCOs), which provide a compact representation of the program’s constructs that require evidence of fulfilling some coverage objective. SCOs capture the structure of all decisions in the program.

Using trace data from the emulator as well as the DWARF and SCO information provided by the compiler, a coverage analysis tool can deduce whether the test’s execution accomplished the desired level of coverage (statement, decision, MC/DC).

Determining whether the execution trace data implies MC/DC presents some challenges. One issue is how to infer source condition evaluation from object branch coverage. This can be handled if the program uniformly uses short-circuit forms (“and then”, “or else”) instead of the non-short-circuited operators (“and”, “or”)[5]. The compiler, as directed by an option, preserves the source code’s conditional structure in the generated object code. A second issue is whether it is possible, for efficiency reasons, to use only the summary traces and not the full historical traces. In general, the answer is “no,” and a relatively simple decision shows why:

(B1 and then B2) or else B3

The object code for this decision can be branch-covered by just three test cases, as shown in Table 1.

Table 1: Tests for object branch coverage of (B1 and then B2) or else B3.

However,  MC/DC requires at least n+1 tests when there are n independent conditions[2], and thus four here (see again Figure 1). This means that the trace summary data (object branch coverage) is not sufficient; full historical trace data is needed. A mathematical characterization of when object branch coverage is sufficient to deduce MC/DC is given in Bordin et al[6] and Comar et al[7].

Further issues to be addressed when object code coverage is proposed as evidence for MC/DC are documented in several certification authority reports[8, Section 20][9].

Putting it all together

The target virtualization approach has been implemented as part of an effort – the Couverture (Coverage) Project[6] – to provide an open framework for coverage analysis for safety-critical software development. AdaCore’s GNATemulator tool is an adaptation of QEMU that collects the execution trace data. The GNAT compiler compiles an application source program with switches that preserve the conditional control flow in the object code and generate the DWARF and SCO data. The uninstrumented executable is then run on GNATemulator, producing the execution trace data. Using the information generated by the compiler and the emulator, the GNATcoverage tool assesses whether the required structural coverage has been achieved. If necessary, the tool analyzes the full historic trace data to verify MC/DC. Figure 2 depicts a typical development scenario.

Figure 2: Virtualization and coverage analysis accurately assess structural coverage.

These tools currently work on applications written in Ada, a language that is used frequently in the safety-critical domain. Future versions will support other languages, including C. Target architectures currently supported include PowerPC and LEON.

Efficient target virtualization, coupled with a tool that deduces precise source-level coverage metrics from execution trace data for a non-instrumented/unmodified user program, mark an advance in the state of the art. This technology is especially valuable in safety-critical contexts, supporting safety certification at all levels while simplifying the certification effort.

ECD in 2D: AdaCore sales and business development manager Michaël Friess explains how GNATcoverage and GNATemulator help programmers develop certifiable applications. Use your smartphone, scan this code, watch a video:


[1] RTCA SC-167/EUROCAE WG-12. DO-178B – Software Considerations in Airborne Systems and Equipment Certification; December 1992.

[2] K. Hayhurst, D. Veerhusen, J, Chilenski, L. Rierson. A Practical Tutorial on Modified Condition/Decision Coverage. NASA/TM-2001-210876; May 2001.

[3] J. Chilenski. An Investigation of Three Forms of the Modified Condition Decision Coverage (MCDC) Criterion. DOT/FAA/AR-01/18; April 2001.

[4] QEMU open-source processor emulator.

[5] G. Romanski. MCDC Coverage Analysis Using Short-Circuit Conditions; White Paper; Verocel, Inc.

[6] M. Bordin, C. Comar, T. Gingold, J. Guitton, O. Hainque, T. Quinot. “Object and Source Coverage for Critical Applications with the Couverture Open Analysis Framework.” Embedded Real-Time Software and Systems (ERTS²) 2010; Toulouse, France.

[7] C. Comar, J. Guitton, O. Hainque, T. Quinot. “Formalization and Comparison of MCDC and Object Branch Coverage Criteria.” Submitted to Embedded Real-Time Software and Systems (ERTS²) 2012.

[8] European Aviation Safety Agency. Certification Memorandum EASA CM-SWCEH-002, Issue 01; Software Aspects of Certification. August 2011.

[9] Certification Authorities Software Team (CAST). Position Paper CAST-17, Structural Coverage of Object Code (Rev 3); June 2003.

Benjamin M. Brosgol is a senior member of the technical staff at AdaCore. He has been involved with programming language design and implementation for more than 30 years, serving in the design team for Ada 95 and in expert groups for several Java Specification Requests. Ben holds a BA in Mathematics from Amherst College and MS and PhD degrees in Applied Mathematics from Harvard University.

Twitter: @AdaCoreCompany

Benjamin M. Brosgol, PhD (AdaCore)
Previous Article
Maximizing the benefits of Model-Based Design through early verification
Maximizing the benefits of Model-Based Design through early verification

Implementing these four best practices in Model-Based Design leads to early verification and decreased test...

Next Article
Getting rid of the denominator: Looking beyond performance per watt in embedded systems
Getting rid of the denominator: Looking beyond performance per watt in embedded systems

For embedded systems requiring scalable processing, adaptive power management approaches provide peak perfo...