test purpose Wiki — STIMSMITH

Definition

In Brucker et al.'s microprocessor test-generation case study, a test purpose is used as a constraint in a test specification to reduce the domain of generated tests to operations of interest. For example, the predicate is_load_store is introduced in the precondition of a test specification to restrict generated tests to load/store operations, while is_arith restricts tests to arithmetic operations and is_branch restricts tests to branching operations. [Test purposes constrain generated domains]

Role in test specifications

A test purpose appears as a predicate in the formal test specification. For a unit-test-style load/store scenario, the evidence gives the pattern:

test_spec is_load_store ι ⇒ SUT σ0 ι =k exec_instr σ0 ι
apply (gen_test_cases 0 1 SUT)
store_test_thm load_store_instr

Introducing is_load_store in the precondition reduces the generated-test domain to load/store operations. HOL-TestGen then performs case splitting over the instruction datatype, generates symbolic operands, and produces symbolic test cases together with uniformity hypotheses. [Load/store test-purpose use]

Arithmetic operations are handled analogously by constraining the test specification with is_arith:

test_spec σ= exec_instr σ0i ⇒ is_arith i ⇒ SUT σ0i σ
apply (gen_test_cases 0 1 SUT)
store_test_thm arith_instr

In that scenario, each arithmetic operation is covered by one generated test case at the stated stage. [Arithmetic test-purpose use]

Branching operations use the same pattern with is_branch:

test_spec is_branch i ⇒ SUT σ0i =k exec_instr σ0i
apply (gen_test_cases 0 1 SUT)
store_test_thm branch_instr

This generates unit test cases for branching operations from the initial state σ0. [Branch test-purpose use]

Unit tests and sequence tests

Test purposes are used both for individual instructions and for instruction sequences. For sequences, the predicate is lifted to lists with list_all. For load/store instruction sequences, the evidence gives:

test_spec list_all is_load_store (ιs::instr list) ⇒
        (σ0 |=(s ←mbind ιs execVAMP; assertSE (λσ. σ=k SUT σ0ιs)))
apply (gen_test_cases SUT)
store_test_thm load_stre_instr_seq

The same approach is used for arithmetic sequences with list_all is_arith, and for branch sequences with list_all is_branch. [Sequence test-purpose use]

Effect on generated tests

When a test purpose constrains the domain, HOL-TestGen can produce schematic test cases and associated uniformity hypotheses. In the load/store unit scenario, the generator produced 8 symbolic test cases corresponding to different load and store operations, with 8 associated uniformity hypotheses. Concrete test data are then instantiated using gen_test_data; one example is SUT σ0(Ilb 1 0 1) σ1, where σ1 is the expected final state. [Schematic tests and concrete witnesses]

For load/store sequences, generated tests can target faults in read/write sequencing. The evidence notes that such programs may reveal sequencing errors even when individual operations are correctly implemented, including errors related to byte alignment or information loss due to pipelining. [Sequence tests reveal interaction faults]

Granularity and coverage

Test purposes also affect test granularity. In the arithmetic scenario, the evidence states that HOL-TestGen's default granularity is coarse because one value satisfying all constraints over an integer variable is selected. A stated workaround is to add more case distinctions to the test-purpose definitions, such as ranges or distinguished values like MinInt and 0, thereby creating finer constraints for which test selection must find solutions. [Granularity via test-purpose refinements]

The evidence also describes adding constraints to reduce the uniformity domain, such as bounding address ranges or constraining loads to addresses written by stores. Such constraints can improve selected-data coverage by dividing the uniformity domain into interesting subdomains. [Coverage via constrained domains]

State-space control

Test purposes are important for controlling state-space explosion in sequence generation. The VAMP assembly language has 56 variants, so sequence case splitting grows quickly; sequences of length 3 can generate 56 + 56² + 56³ = 178808 cases. The evidence states that constraints at the level of test purposes were imposed to address this issue; for example, a purpose such as list_all is_logic ι reduces the sequence space to 7 + 7² + 7³, described as manageable. [State-space reduction by test purposes]

Limitations noted in the evidence

The evidence distinguishes unit and sequence scenarios. Unit tests assume stronger testability, including access to registers and memory through the test driver. Sequence tests make weaker testability assumptions because they rely on observed behavior, but they are more vulnerable to state-space explosion. [Unit versus sequence tradeoff]

For branching operations, the evidence notes a limitation of fixed initial-state scenarios: branch behavior depends on flag values, so varying initial states with different flags would be more interesting. [Branching depends on flags]

test purpose