STIMSMITH

mutation testing

Concept WIKI v1 · 5/26/2026

Mutation testing, as illustrated in the provided evidence, evaluates generated test cases by changing an executable model to create a mutant model, rerunning the same generated tests, and observing which of them fail. In the VAMP processor case study, changes to int_add, int_sub, and cell2data caused many generated test cases to fail, indicating that those tests detected the injected changes.

Overview

In the provided case study, mutation testing means modifying an executable model to create a mutant model and then running the generated test cases against that mutant. The stated purpose was to evaluate the quality of the generated test cases: after the authors introduced changes into the executable model, a majority of the tests detected the errors. [C1]

Case-study context

The evidence concerns a VAMP processor executable model expressed as generated SML code. The model includes an instruction datatype and functions such as int_add, int_sub, cell2data, exec_instr, sigma_0, and execInstrs. The instruction datatype and function definitions were generated from corresponding model definitions. [C2]

Using HOL-TestGen, the authors generated two test scripts: one for sequences of load/store operations and one for sequences of arithmetic operations. Each script yielded 585 test cases, which were transformed into executable testers. Run against the original executable model, the tests revealed no errors, because the same model was used for both test generation and test execution. [C3]
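Why the unmodified model passes every test can be sketched in a few lines. This is a minimal Python illustration, not the authors' SML code or HOL-TestGen output: the 32-bit wrap-around body of `int_add`, the helpers `derive_tests` and `run_tests`, and the input pairs are all hypothetical stand-ins.

```python
# Hypothetical stand-in for one model operation. The real VAMP model is
# generated SML code whose exact semantics are not given in the source.
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size


def int_add(a: int, b: int) -> int:
    """Toy 32-bit wrap-around addition (an assumption for illustration)."""
    return (a + b) & MASK32


def derive_tests(model, inputs):
    """Record the model's own output as the expected value for each input."""
    return [(args, model(*args)) for args in inputs]


def run_tests(implementation, tests):
    """Return (passed, failed) counts for an implementation under test."""
    failed = sum(1 for args, expected in tests if implementation(*args) != expected)
    return len(tests) - failed, failed


inputs = [(0, 0), (1, 2), (MASK32, 1), (123456, 654321)]
tests = derive_tests(int_add, inputs)

# Running the tests on the very model they were derived from cannot fail,
# so a clean run here says nothing yet about the quality of the tests.
passed, failed = run_tests(int_add, tests)
print(passed, "passed,", failed, "failed")
```

Because the expected values come from the model itself, a clean baseline run is only a sanity check; the tests become informative once the implementation under test differs from the model.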

Mutant model construction

To evaluate test quality, the authors introduced three changes into the generated SML code, specifically in the int_add, int_sub, and cell2data operations. These changes produced a mutant model. [C4]
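The source does not say what the three changes were, so the following Python sketch invents them purely for illustration. The operation names mirror the source; the function bodies, the swapped operators, the off-by-one in `cell2data`, and the test inputs are all assumptions.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word size


# Toy original model: names follow the source, bodies are assumptions.
def int_add(a, b):
    return (a + b) & MASK32


def int_sub(a, b):
    return (a - b) & MASK32


def cell2data(cell):
    return cell & MASK32


original = {"int_add": int_add, "int_sub": int_sub, "cell2data": cell2data}

# Mutant model: copy the original and change three operations.
# These particular mutations are hypothetical, not the authors' changes.
mutant = dict(original)
mutant["int_add"] = lambda a, b: (a - b) & MASK32       # '+' replaced by '-'
mutant["int_sub"] = lambda a, b: (a + b) & MASK32       # '-' replaced by '+'
mutant["cell2data"] = lambda cell: (cell + 1) & MASK32  # off-by-one result

# Expected values are taken from the original model.
cases = [
    ("int_add", (1, 2)),
    ("int_add", (5, 0)),
    ("int_sub", (7, 3)),
    ("int_sub", (4, 0)),
    ("cell2data", (9,)),
]
tests = [(name, args, original[name](*args)) for name, args in cases]


def failing_tests(model, tests):
    """List the (operation, inputs) pairs whose result differs from the expectation."""
    return [(name, args) for name, args, expected in tests
            if model[name](*args) != expected]


failed_on_original = failing_tests(original, tests)
failed_on_mutant = failing_tests(mutant, tests)
print(len(failed_on_original), "failures on the original model")
print(len(failed_on_mutant), "of", len(tests), "tests fail on the mutant model")
```

In this toy setup the two cases with a zero operand still pass on the mutant, since adding and subtracting zero give the same result. That is one way a mutated model can leave part of a test suite passing.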

Observed results

For the arithmetic-operation tests on the mutant model, 303 of 585 test cases succeeded and 282 of 585 failed. The reported failure rate was approximately 49% (282/585 works out to about 48.2%), with no warnings, errors, or fatal errors reported. [C5]

For the load/store-operation tests on the mutant model, 54 of 585 test cases succeeded and 531 of 585 failed. The reported failure rate was approximately 91%, with no warnings, errors, or fatal errors reported. [C6]
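The percentages can be rechecked from the raw counts. The short Python sketch below uses only the success and failure counts quoted above; the dictionary layout and the `failure_rate` helper are illustrative choices, not part of the source.

```python
# Counts taken from the two mutant-model runs described above.
results = {
    "arithmetic": {"succeeded": 303, "failed": 282},
    "load/store": {"succeeded": 54, "failed": 531},
}


def failure_rate(succeeded: int, failed: int) -> float:
    """Percentage of test cases that failed."""
    return 100.0 * failed / (succeeded + failed)


rates = {name: failure_rate(**counts) for name, counts in results.items()}
for name, counts in results.items():
    total = counts["succeeded"] + counts["failed"]
    print(f"{name}: {counts['failed']}/{total} failed ({rates[name]:.1f}%)")

# Combined view across both test scripts.
total_failed = sum(c["failed"] for c in results.values())
total_cases = sum(c["succeeded"] + c["failed"] for c in results.values())
combined_rate = 100.0 * total_failed / total_cases
print(f"combined: {total_failed}/{total_cases} failed ({combined_rate:.1f}%)")
```

The printed values can be compared with the rounded percentages quoted above; how the reporting tool rounds is not specified in the source.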

Interpretation within the evidence

The evidence states that, once the executable model had been changed into a mutant model, a majority of the tests detected the introduced errors. [C1] Taken together, 813 of the 1,170 test cases failed (about 69%), although the arithmetic script on its own failed slightly fewer than half of its cases. [C5] [C6] A failing generated tester is the observable signal that a test detected behavior changed by the mutations.

CITATIONS

[C1] Mutation testing in the case study evaluated generated test cases by introducing changes into an executable model to produce a mutant model, after which many tests detected the errors. Source: Test Program Generation for a Microprocessor: A Case Study
[C2] The VAMP processor executable model was generated as ML/SML code and included instruction and execution-related definitions such as instr, int_add, int_sub, cell2data, exec_instr, sigma_0, and execInstrs. Source: Test Program Generation for a Microprocessor: A Case Study
[C3] HOL-TestGen generated two test scripts for load/store and arithmetic operation sequences; each produced 585 test cases that were transformed into executable testers, and the original executable model revealed no errors because it was also used for test generation. Source: Test Program Generation for a Microprocessor: A Case Study
[C4] The mutant model was produced by introducing three changes into int_add, int_sub, and cell2data in the generated SML code. Source: Test Program Generation for a Microprocessor: A Case Study
[C5] For arithmetic-operation tests on the mutant model, 303 of 585 test cases succeeded and 282 of 585 failed, with no warnings, errors, or fatal errors. Source: Test Program Generation for a Microprocessor: A Case Study
[C6] For load/store-operation tests on the mutant model, 54 of 585 test cases succeeded and 531 of 585 failed, with no warnings, errors, or fatal errors. Source: Test Program Generation for a Microprocessor: A Case Study