Mutation Testing Wiki

Overview

In the provided case study, mutation testing consists of modifying an executable model to create a mutant model and then running generated test cases against the mutant. The stated purpose was to evaluate the quality of the generated test cases: after the authors introduced changes into the executable model, a majority of the tests detected the errors. [C1]

Case-study context

The evidence concerns a VAMP processor executable model expressed as generated SML code. The model includes an instruction datatype and functions such as int_add, int_sub, cell2data, exec_instr, sigma_0, and execInstrs. The instruction datatype and function definitions were generated from corresponding model definitions. [C2]

Using HOL-TestGen test script generation, the authors generated two test scripts: one for load/store operation sequences and one for arithmetic operation sequences. In both cases, 585 test cases were generated and transformed into executable testers. When those tests were run on the original executable model, no errors were revealed, because the same model was used for both test generation and execution. [C3]
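
The reason the original model reveals no errors can be illustrated with a minimal sketch. Everything in it is a toy assumption: the tuple-based instructions, the register dictionary, and the bodies of `exec_instr` and `exec_instrs` are invented for illustration (only the names echo the case study), and `generate_tests` is a hypothetical stand-in, not HOL-TestGen. Because the expected results are recorded from the model itself, replaying them against the same model cannot produce a failure.

```python
from itertools import product

# Toy model (assumption): a state is a dict of registers, an instruction is a tuple
# (operation, destination register, first source register, second source register).
def exec_instr(state, instr):
    op, dst, a, b = instr
    new_state = dict(state)
    if op == "add":
        new_state[dst] = state[a] + state[b]
    elif op == "sub":
        new_state[dst] = state[a] - state[b]
    else:
        raise ValueError(f"unknown op: {op}")
    return new_state

def exec_instrs(state, instrs):
    # Execute a sequence of instructions, threading the state through.
    for instr in instrs:
        state = exec_instr(state, instr)
    return state

def generate_tests(model, initial_state):
    # Hypothetical generator: enumerate two-instruction sequences and record
    # the model's own final state as the expected result.
    instrs = [
        ("add", "r0", "r1", "r2"),
        ("sub", "r0", "r1", "r2"),
        ("add", "r1", "r0", "r2"),
    ]
    return [(list(seq), model(initial_state, list(seq)))
            for seq in product(instrs, repeat=2)]

def run_tests(model, initial_state, tests):
    # Count test cases whose observed final state differs from the expected one.
    return sum(1 for seq, expected in tests
               if model(initial_state, seq) != expected)

sigma_0 = {"r0": 0, "r1": 5, "r2": 3}  # toy initial state
tests = generate_tests(exec_instrs, sigma_0)
failures = run_tests(exec_instrs, sigma_0, tests)
print(len(tests), failures)  # 9 0
```

A zero failure count here says nothing about test quality; it only confirms that the testers reproduce the model they were derived from, which is why a modified model is needed as a comparison point.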

Mutant model construction

To evaluate test quality, the authors introduced three changes into the generated SML code, specifically in the int_add, int_sub, and cell2data operations. These changes produced a mutant model. [C4]
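
The idea of a mutant model can be sketched in a few lines. The evidence does not say what the three changes were, so the mutation below (replacing addition with subtraction in `int_add`) is purely hypothetical, as are the function bodies, the input ranges, and the `count_failures` helper. Expected values are taken from the unmodified functions, mirroring the fact that the tests were generated from the original model.

```python
# Toy originals (assumption: the real int_add/int_sub are generated SML, not shown here).
def int_add(a, b):
    return a + b

def int_sub(a, b):
    return a - b

def int_add_mutant(a, b):
    # Hypothetical mutation: the operator is swapped from + to -.
    return a - b

original = {"add": int_add, "sub": int_sub}
mutant = {"add": int_add_mutant, "sub": int_sub}

# Test cases: inputs plus the expected result computed by the original model.
inputs = [(op, a, b) for op in ("add", "sub") for a in range(3) for b in range(3)]
tests = [(op, a, b, original[op](a, b)) for op, a, b in inputs]

def count_failures(model, tests):
    # A test fails when the model under test disagrees with the expected result.
    return sum(1 for op, a, b, expected in tests if model[op](a, b) != expected)

print(len(tests))                       # 18
print(count_failures(original, tests))  # 0
print(count_failures(mutant, tests))    # 6
```

In this toy, the `add` tests with `b == 0` still pass on the mutant because `a + 0` equals `a - 0`, and the `sub` tests never exercise the changed function. This shows in miniature why only a fraction of a test suite may fail on a given mutant.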

Observed results

For the arithmetic-operation tests on the mutant model, 303 of 585 test cases succeeded and 282 of 585 failed. The reported failure rate was approximately 49%, with no warnings, errors, or fatal errors reported. [C5]

For the load/store-operation tests on the mutant model, 54 of 585 test cases succeeded and 531 of 585 failed. The reported failure rate was approximately 91%, with no warnings, errors, or fatal errors reported. [C6]
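
The arithmetic behind these figures can be checked directly. The sketch below uses only the counts quoted above; the `failure_rate` helper and the dictionary layout are illustrative assumptions. Computed from the counts, the arithmetic-script rate comes out at about 48.2%, slightly below the reported figure of approximately 49%; the evidence does not explain the difference.

```python
def failure_rate(failed, total):
    # Percentage of test cases that failed.
    return 100.0 * failed / total

# Counts as quoted in the results above.
results = {
    "arithmetic": {"succeeded": 303, "failed": 282, "total": 585},
    "load/store": {"succeeded": 54, "failed": 531, "total": 585},
}

rates = {}
for name, r in results.items():
    # Sanity check: successes and failures account for every test case.
    assert r["succeeded"] + r["failed"] == r["total"]
    rates[name] = round(failure_rate(r["failed"], r["total"]), 1)
    print(name, rates[name])
# arithmetic 48.2
# load/store 90.8
```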

Interpretation within the evidence

The evidence states that a majority of tests detected the introduced errors once the executable model had been changed into the mutant model. Taken together, the two scripts account for 813 failures out of 1,170 test runs, although the arithmetic script on its own failed in slightly fewer than half of its cases. Failures in the generated executable testers were the observable signal that the test suite detected behavior changed by the mutations. [C1]