SUNRISE Safety Assurance Framework

Test Evaluate


The Test Evaluate block assesses each test execution to determine whether the system has passed or failed. For instance, did the system stay within an upper speed limit, or did it avoid collisions? Both the Coverage and Test Evaluate blocks contribute to the overall Analysis block, and their results are used to select further concrete parameters within the original scenario’s parameter ranges. After several iterations, once the coverage threshold is reached, the combined coverage and test evaluation results feed into the Decide block, producing the overall safety assurance outcome for the system.

Please note that pass/fail criteria are heavily use case dependent, so SUNRISE cannot prescribe specific criteria. However, the following categories should be considered (a minimal sketch combining them is given below):
  1. Whether the intended test has been executed (e.g., whether the intended cut-in has occurred)
  2. Use case specific pass/fail criteria received from the input layer
  3. Optionally, baseline-based pass/fail criteria, such as a human driver model baseline
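As an illustration of how these categories can be combined into a single verdict for one test execution, the sketch below shows a minimal pass/fail evaluation. The record fields, thresholds and function names are assumptions made for illustration only; they are not defined by the SUNRISE SAF and would need to be replaced with the use case specific criteria received from the input layer.

```python
# Minimal illustration only: the record fields and thresholds are assumptions,
# not part of the SUNRISE SAF specification.
from dataclasses import dataclass


@dataclass
class TestRunResult:
    intended_cut_in_occurred: bool  # category 1: was the intended test executed?
    collision: bool                 # category 2: use case specific criterion
    max_speed_kph: float            # category 2: use case specific criterion
    min_gap_m: float                # category 3: compared against a baseline


def evaluate(run: TestRunResult,
             speed_limit_kph: float = 80.0,    # assumed use case limit
             baseline_min_gap_m: float = 2.0,  # assumed human driver model baseline
             ) -> str:
    # Category 1: if the intended event (here, a cut-in) did not occur,
    # the run cannot serve as pass/fail evidence.
    if not run.intended_cut_in_occurred:
        return "not achieved"
    # Category 2: use case specific pass/fail criteria from the input layer.
    if run.collision or run.max_speed_kph > speed_limit_kph:
        return "fail"
    # Category 3 (optional): baseline-based criterion, e.g. a human driver model.
    if run.min_gap_m < baseline_min_gap_m:
        return "fail"
    return "pass"


print(evaluate(TestRunResult(True, False, 72.0, 3.1)))  # -> pass
```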

 

SAF Application Guidelines for 'Test Evaluate'

By following the steps outlined below, users of the SUNRISE SAF can apply the Test Evaluate block to produce trustworthy evidence that can reliably support the overall safety argumentation of the CCAM system under test.

In the list below, “D” stands for Deliverable. All deliverables of the SUNRISE project can be found here.

 

  1. Verify test run validation, including the proper application of test run validation metrics, to ensure that (D3.5 Section 5.2; a minimal sketch follows this list):
    1. The test execution was valid and meaningful
    2. The correct test instance was used for the scenario
    3. Test scenario importance was properly evaluated
      • Critical scenarios were appropriately prioritised
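The sketch below illustrates such a test run validation check. The record fields, the criticality score and its threshold are assumptions chosen for illustration; the actual validation metrics are defined in D3.5 Section 5.2.

```python
# Hypothetical test run validation check; field names and the criticality
# threshold are illustrative assumptions, not taken from D3.5.
def validate_test_run(run: dict, allocated_scenario_id: str) -> dict:
    issues = []
    if not run.get("completed", False):                  # execution valid and meaningful
        issues.append("test execution did not complete")
    if run.get("scenario_id") != allocated_scenario_id:  # correct test instance used
        issues.append("executed scenario does not match the allocated instance")
    # Evaluate scenario importance so that critical scenarios can be prioritised.
    critical = run.get("criticality_score", 0.0) >= 0.8
    return {"valid": not issues, "critical": critical, "issues": issues}


print(validate_test_run({"completed": True, "scenario_id": "cut-in-042",
                         "criticality_score": 0.9}, "cut-in-042"))
```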

 

  2. Assess scenario realisation to verify that the scenario was meaningfully executed, by checking that (D3.5 Section 5.1; see the sketch after this list):
    1. The CCAM system actually encountered the intended triggering conditions
    2. The CCAM system responded appropriately to the scenario’s defined logic
    3. The scenario behaviour phases that test the intended function were reached
      • Any scenarios not properly realised were flagged as “not achieved”
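The check below is a minimal sketch of such a scenario realisation assessment. The phase names and the trace representation are assumptions for illustration; the actual realisation criteria are described in D3.5 Section 5.1.

```python
# Hypothetical realisation check: phase names and trace format are assumptions.
REQUIRED_PHASES = {"approach", "cut_in", "ego_reaction"}  # phases exercising the function


def scenario_realised(trigger_encountered: bool, reached_phases: set) -> str:
    # The scenario only counts as realised if the triggering conditions were met
    # and every behaviour phase that tests the intended function was reached.
    if trigger_encountered and REQUIRED_PHASES <= reached_phases:
        return "realised"
    return "not achieved"  # flagged and excluded from the pass/fail evidence


print(scenario_realised(True, {"approach", "cut_in", "ego_reaction"}))  # -> realised
```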

 

  3. Evaluate safety confidence metrics by verifying that (D3.5 Section 5.2.1; an illustrative check follows below):
    1. An acceptable false acceptance risk was properly defined
    2. Uncertainty estimation at the test point was established
    3. Uncertainty was compared against false acceptance risk thresholds
    4. Sufficient confidence margins were maintained
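A minimal sketch of this comparison is given below. The acceptable false acceptance risk, the margin factor and the way they are combined are illustrative assumptions only, not thresholds or formulas prescribed by D3.5 Section 5.2.1.

```python
# Illustrative comparison of estimated uncertainty against an acceptable
# false acceptance risk; all numeric values are assumptions.
def sufficient_confidence(estimated_false_acceptance: float,
                          acceptable_risk: float = 1e-3,
                          margin_factor: float = 0.5) -> bool:
    # The estimated probability of wrongly accepting a failing behaviour must
    # stay below the predefined acceptable risk, with a confidence margin.
    return estimated_false_acceptance <= acceptable_risk * margin_factor


print(sufficient_confidence(2e-4))  # -> True: within the assumed margin
```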

 

  4. In case higher fidelity testing was performed, review the correlation analysis by checking that (D3.5 Section 5.2.3; a sketch follows this list):
    1. Correlation between low and high fidelity test instances was examined
    2. Appropriate correlation methods were used (e.g., the Pearson correlation coefficient)
    3. Statistical significance was verified (low p-value)
    4. Acceptable correlation levels were predefined before testing
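As a sketch of such a check, the snippet below compares a set of low fidelity results against their high fidelity counterparts using the Pearson correlation coefficient. The minimum correlation level and significance level are assumed values that would need to be predefined before testing.

```python
# Sketch of the low/high fidelity correlation check; min_r and alpha are
# assumed values, to be agreed before testing.
from scipy.stats import pearsonr


def correlation_acceptable(low_fidelity, high_fidelity,
                           min_r: float = 0.9, alpha: float = 0.05) -> bool:
    r, p_value = pearsonr(low_fidelity, high_fidelity)
    # Require a strong correlation that is also statistically significant.
    return r >= min_r and p_value < alpha


print(correlation_acceptable([1.0, 2.1, 3.0, 4.2, 5.1],
                             [1.1, 2.0, 3.2, 4.1, 5.0]))  # -> True
```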

 

  5. Validate expert analysis documentation by ensuring it includes (D3.5 Section 5.3; an example record is sketched below):
    1. Root cause analysis for correlation discrepancies
    2. Assessment of re-allocation needs to different test instances
    3. Documentation of further investigations when required
    4. Expert explanations for anomalies or unexpected results
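One way to keep this documentation consistent across test campaigns is a simple structured record, sketched below. The field names mirror the items above and are assumptions for illustration, not a SUNRISE-defined schema.

```python
# Hypothetical record for expert analysis documentation; field names are
# assumptions mirroring the checklist above, not a SUNRISE-defined schema.
from dataclasses import dataclass, field


@dataclass
class ExpertAnalysisRecord:
    root_cause: str                  # root cause of correlation discrepancies
    reallocation_needed: bool        # re-allocate to a different test instance?
    further_investigation: str = ""  # documented follow-up, when required
    anomaly_explanations: list = field(default_factory=list)  # expert explanations


record = ExpertAnalysisRecord(
    root_cause="tyre model differences between simulation and proving ground",
    reallocation_needed=True,
    anomaly_explanations=["braking distance deviation above tolerance"],
)
```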