[EPIC] Add Definitive Evaluation / assertions #109

## Problem Statement

Strands Evals currently requires LLM invocations for ALL evaluations, even for simple assertions that should be deterministic. This feature request proposes adding definitive (deterministic, non-LLM) evaluation capabilities to complement existing LLM-based evaluators, enabling fast, reliable, and cost-free assertions for structural and factual checks.

## Possible definitive evaluators / methods
### Basic
- Equals
- Contains
- MaxDuration / MinDuration
- HasMatchingSpan
- StartsWith
- IsInstance
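
A minimal sketch of how a few of the basic checks might behave, assuming each check is a small object with an `evaluate` method that returns a pass/fail result. `CheckResult` and the `evaluate` signature are hypothetical names for illustration, not part of the Strands Evals API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical result type: pass/fail plus a human-readable reason."""
    passed: bool
    reason: str


@dataclass(frozen=True)
class Equals:
    value: Any

    def evaluate(self, actual: Any) -> CheckResult:
        ok = actual == self.value
        return CheckResult(ok, f"expected {self.value!r}, got {actual!r}")


@dataclass(frozen=True)
class Contains:
    value: str

    def evaluate(self, actual: str) -> CheckResult:
        ok = self.value in actual
        return CheckResult(ok, f"{self.value!r} {'found' if ok else 'not found'} in output")


@dataclass(frozen=True)
class StartsWith:
    prefix: str

    def evaluate(self, actual: str) -> CheckResult:
        ok = actual.startswith(self.prefix)
        return CheckResult(ok, f"output starts with {self.prefix!r}: {ok}")


@dataclass(frozen=True)
class MaxDuration:
    seconds: float

    def evaluate(self, elapsed_seconds: float) -> CheckResult:
        # Assumes the harness measures elapsed time and passes it in.
        ok = elapsed_seconds <= self.seconds
        return CheckResult(ok, f"took {elapsed_seconds}s, limit {self.seconds}s")
```

None of these checks call a model, so they return the same result on every run for the same input.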

### Span-level
- HasSpan(name) - Check if span with name exists
- HasAttributes
- HasSpanWithAttributes(name, attributes) - Span with specific attributes
- HasMatchingSpan(query) - Complex query-based check
- SpanSequence
- HasErrorFlag
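
The span-level checks could be sketched as plain functions over a list of spans. The span shape used here (a mapping with `name`, `attributes`, and an optional `error` key) is an assumption made for illustration; the real trace format may differ. `span_sequence` is interpreted as "these names appear in this order, not necessarily adjacent", which is also an assumption.

```python
from typing import Any, Mapping, Sequence

# Hypothetical span shape: {"name": str, "attributes": dict, "error": bool}
Span = Mapping[str, Any]


def has_span(spans: Sequence[Span], name: str) -> bool:
    """True if any span has the given name."""
    return any(s["name"] == name for s in spans)


def has_attributes(span: Span, attributes: Mapping[str, Any]) -> bool:
    """True if the span carries every expected key with an equal value."""
    actual = span.get("attributes", {})
    return all(k in actual and actual[k] == v for k, v in attributes.items())


def has_span_with_attributes(
    spans: Sequence[Span], name: str, attributes: Mapping[str, Any]
) -> bool:
    """True if some span matches both the name and all expected attributes."""
    return any(s["name"] == name and has_attributes(s, attributes) for s in spans)


def span_sequence(spans: Sequence[Span], names: Sequence[str]) -> bool:
    """True if `names` occurs as an ordered subsequence of the span names."""
    remaining = iter(s["name"] for s in spans)
    # `name in remaining` advances the iterator, which enforces ordering.
    return all(name in remaining for name in names)


def has_error_flag(spans: Sequence[Span]) -> bool:
    """True if any span is marked as errored."""
    return any(bool(s.get("error", False)) for s in spans)
```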

### Tool-level
- ToolCalled
- ToolSucceeded
- ToolResultContains
- ToolResultEquals
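
A rough sketch of the tool-level checks, assuming tool invocations are available as simple records. `ToolCall` and its fields are hypothetical; in particular, `tool_succeeded` is assumed to mean "the tool was called at least once and every call succeeded".

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool invocation."""
    name: str
    succeeded: bool
    result: Any


def _calls_to(calls: Sequence[ToolCall], name: str) -> List[ToolCall]:
    return [c for c in calls if c.name == name]


def tool_called(calls: Sequence[ToolCall], name: str) -> bool:
    return bool(_calls_to(calls, name))


def tool_succeeded(calls: Sequence[ToolCall], name: str) -> bool:
    matching = _calls_to(calls, name)
    return bool(matching) and all(c.succeeded for c in matching)


def tool_result_contains(calls: Sequence[ToolCall], name: str, text: str) -> bool:
    # Results are stringified so non-string results can still be searched.
    return any(text in str(c.result) for c in _calls_to(calls, name))


def tool_result_equals(calls: Sequence[ToolCall], name: str, expected: Any) -> bool:
    return any(c.result == expected for c in _calls_to(calls, name))
```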

### Generic
- Custom: definitive callback
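
As an illustration of the callback idea, `Custom` could wrap any user-supplied predicate. The class shape, the `check` and `name` fields, and the `parses_as_json` helper are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Custom:
    """Hypothetical wrapper that turns a plain predicate into a check."""
    check: Callable[[Any], bool]
    name: str = "custom"

    def evaluate(self, actual: Any) -> bool:
        return bool(self.check(actual))


def parses_as_json(text: str) -> bool:
    """Example callback: passes when the output is valid JSON."""
    try:
        json.loads(text)
    except (TypeError, ValueError):
        return False
    return True


is_json = Custom(check=parses_as_json, name="is_json")
```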


...

## Proposed Solution

### Solution 1: Add definitive matching methods as part of evaluators
The implementation could look like:

```python
evaluators = [
    # Definitive checks (fast, free, reliable)
    Equals(value=expected_output),
    HasSpanWithName(name="calculator"),
    MaxDuration(seconds=5.0),
    Contains(value="Paris"),
    
    # LLM-based checks (quality assessment)
    HelpfulnessEvaluator(),
    OutputEvaluator(rubric="Assess response quality"),
]
```
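
For a single `evaluators` list to work, definitive checks and LLM-based evaluators would presumably need to return the same result shape. The sketch below assumes a hypothetical `EvalResult` with a numeric score, where a definitive check maps pass/fail to 1.0/0.0. `EvalResult`, `Evaluator`, `run_evaluators`, and `StubRubricEvaluator` (a stand-in that makes no LLM call) are illustrative names, not the Strands Evals API.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class EvalResult:
    evaluator: str
    score: float
    passed: bool


class Evaluator(Protocol):
    """Anything with a matching evaluate() can sit in the same list."""
    def evaluate(self, output: str) -> EvalResult: ...


@dataclass(frozen=True)
class Contains:
    value: str

    def evaluate(self, output: str) -> EvalResult:
        ok = self.value in output
        # A definitive check is binary, so the score is either 1.0 or 0.0.
        return EvalResult("Contains", 1.0 if ok else 0.0, ok)


@dataclass(frozen=True)
class StubRubricEvaluator:
    """Stand-in for an LLM-based evaluator; returns a fixed score."""
    fixed_score: float
    threshold: float = 0.5

    def evaluate(self, output: str) -> EvalResult:
        passed = self.fixed_score >= self.threshold
        return EvalResult("StubRubricEvaluator", self.fixed_score, passed)


def run_evaluators(output: str, evaluators: List[Evaluator]) -> List[EvalResult]:
    return [e.evaluate(output) for e in evaluators]
```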

### Solution 2: Separate Assertion Phase
Add a separate `assertions` field alongside `evaluators`. The scoring and failure model still needs to be worked out; one option is that evaluators run only when all definitive assertions pass.

```python
assertions = [
    Equals(value=expected_output),
    HasSpanWithName(name="calculator"),
    MaxDuration(seconds=5.0),
]
evaluators = [
    HelpfulnessEvaluator(),
    OutputEvaluator(rubric="Assess response quality"),
]
experiment = Experiment(
    cases=test_cases,
    assertions=assertions,
    evaluators=evaluators
)
```
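
One possible reading of the failure model, where evaluators run only when every assertion passes, is sketched below. `CaseReport`, `run_case`, and the use of plain callables for assertions and evaluators are assumptions made to keep the example self-contained; they are not the `Experiment` API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Assertion = Callable[[str], bool]    # cheap, deterministic check
Evaluator = Callable[[str], float]   # stands in for an LLM-based scorer


@dataclass
class CaseReport:
    assertions_passed: bool
    failed_assertions: List[str] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)


def run_case(
    output: str,
    assertions: Dict[str, Assertion],
    evaluators: List[Evaluator],
) -> CaseReport:
    # Phase 1: run every assertion so the report lists all failures at once.
    failed = [name for name, check in assertions.items() if not check(output)]
    if failed:
        # Phase 2 is skipped entirely, so no evaluator cost is incurred.
        return CaseReport(assertions_passed=False, failed_assertions=failed)
    # Phase 2: only reached when all assertions passed.
    return CaseReport(assertions_passed=True, scores=[e(output) for e in evaluators])
```

An open design question under this model is whether a case with failed assertions should receive a score of zero or be reported without a score.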

## Use Cases

- As a developer testing my agent locally, I want instant feedback when my agent fails basic checks without LLM costs

- As a developer ensuring trace quality, I want to verify execution structure and metadata to catch instrumentation and configuration issues

- I want fast, cheap health checks so that I can continuously verify basic functionality

## Alternative Solutions

Users can currently write their own custom evaluator that performs definitive assertions.

