# Difficulty Ladder System #22

## Overview

Implement a difficulty ladder system inspired by CORE-Bench, where the same task can be presented at different difficulty levels based on how much scaffolding/context is provided.

## Difficulty Levels

* **Easy**: More context provided (e.g., relevant files identified, hints given)
* **Medium**: Standard context (e.g., just the task description)
* **Hard**: Minimal context (e.g., agent must discover relevant files, figure out approach)

This differs from simple difficulty categorization: the underlying task stays the same, and only the level of assistance varies.
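One possible shape for this in code is a single case object that carries the task once and maps each level to its scaffolding. The sketch below is only an illustration: `Difficulty`, `Scaffolding`, `LadderCase`, the field names, and the example file path are all hypothetical and not part of any existing case format.

```python
from dataclasses import dataclass, field
from enum import Enum


class Difficulty(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"


@dataclass(frozen=True)
class Scaffolding:
    """Extra assistance attached to a task at one level (hypothetical fields)."""
    relevant_files: tuple[str, ...] = ()
    hints: tuple[str, ...] = ()


@dataclass
class LadderCase:
    """One underlying task with optional per-level scaffolding."""
    task: str
    scaffolding: dict[Difficulty, Scaffolding] = field(default_factory=dict)

    def context_for(self, level: Difficulty) -> Scaffolding:
        # Levels without an entry get no extra assistance.
        return self.scaffolding.get(level, Scaffolding())


# Example case; the file path and hint are made up for illustration.
case = LadderCase(
    task="Add error handling to the API routes",
    scaffolding={
        Difficulty.EASY: Scaffolding(
            relevant_files=("src/api/routes.py",),
            hints=("Wrap each handler body in try/except",),
        ),
    },
)
```

Keeping the task text in one place makes it hard for variants to drift apart, which is the property that separates a ladder from ordinary difficulty labels.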

## Examples

**Task: "Add error handling to the API routes"**

* Easy: Points to specific files, shows which functions need handling
* Medium: Just the task description
* Hard: Agent must find API routes, identify unhandled cases, implement solution

**Task: "Fix the failing tests"**

* Easy: Test output provided, failing test identified
* Medium: Must run tests to see failures
* Hard: Must figure out how to run tests, interpret failures, fix issues
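To make the "Fix the failing tests" example concrete, here is a rough sketch of how a prompt could be assembled per level. `build_prompt`, the `pytest -q` command, and the sample failure line are assumptions for illustration. The sketch also assumes that Medium tells the agent how to run the tests while Hard does not, which is one reading of the bullets above.

```python
LEVELS = ("easy", "medium", "hard")


def build_prompt(task, level, test_command=None, test_output=None):
    """Assemble the agent prompt for one rung of the ladder."""
    if level not in LEVELS:
        raise ValueError(f"unknown difficulty: {level!r}")
    parts = [task]
    # Assumption: easy and medium both say how to run the tests.
    if level in ("easy", "medium") and test_command:
        parts.append(f"Run the tests with: {test_command}")
    # Only easy includes the captured failure output.
    if level == "easy" and test_output:
        parts.append(f"Current test output:\n{test_output}")
    return "\n\n".join(parts)


# Hypothetical command and failure line, used only to show the three prompts.
prompts = {
    level: build_prompt(
        "Fix the failing tests",
        level,
        test_command="pytest -q",
        test_output="FAILED tests/test_api.py::test_missing_user",
    )
    for level in LEVELS
}

for level, prompt in prompts.items():
    print(f"--- {level} ---\n{prompt}\n")
```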

## Tasks

* Design difficulty ladder schema in case format
* Implement context/scaffolding levels
* Create generator for difficulty variants from base case
* Add difficulty selection to `sniff run` (e.g., `--difficulty hard`)
* Track performance across difficulty levels
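A minimal sketch of the generator and the `--difficulty` flag might look like the following. Only the `sniff run --difficulty hard` invocation comes from this issue. The dict-based case layout, the `KEPT_KEYS` policy, the `medium` default, and the argparse wiring are assumptions, not how `sniff` is actually implemented.

```python
import argparse

LEVELS = ("easy", "medium", "hard")

# Assumed policy: which scaffolding keys survive at each level.
KEPT_KEYS = {
    "easy": ("relevant_files", "hints", "setup_notes"),
    "medium": ("setup_notes",),
    "hard": (),
}


def generate_variants(base_case):
    """Return one variant per level, all sharing the base task text."""
    variants = {}
    for level in LEVELS:
        kept = {k: base_case[k] for k in KEPT_KEYS[level] if k in base_case}
        variants[level] = {
            "id": f"{base_case['id']}-{level}",
            "task": base_case["task"],
            "difficulty": level,
            **kept,
        }
    return variants


def build_parser():
    """Hypothetical CLI wiring for `sniff run --difficulty <level>`."""
    parser = argparse.ArgumentParser(prog="sniff")
    sub = parser.add_subparsers(dest="command", required=True)
    run = sub.add_parser("run")
    # Defaulting to medium is an assumption, not a decided behavior.
    run.add_argument("--difficulty", choices=LEVELS, default="medium")
    return parser


# Example base case with made-up scaffolding values.
base = {
    "id": "fix-tests",
    "task": "Fix the failing tests",
    "relevant_files": ["tests/test_api.py"],
    "setup_notes": "Run the suite with pytest.",
}
variants = generate_variants(base)
args = build_parser().parse_args(["run", "--difficulty", "hard"])
selected = variants[args.difficulty]
```

Deriving every variant by removing keys from one base case means the Easy variant is the only one that has to be authored by hand.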

## Acceptance Criteria

* Cases can define Easy/Medium/Hard variants
* Same underlying task, different scaffolding
* Metrics track performance per difficulty level
* Case generation automatically produces difficulty variants
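For the per-level metrics criterion, a pass rate grouped by difficulty is probably the simplest starting point. The sketch below assumes each run is recorded as a `(difficulty, passed)` pair. That record shape and the function name `pass_rate_by_difficulty` are hypothetical.

```python
from collections import defaultdict


def pass_rate_by_difficulty(results):
    """Compute the fraction of passing runs for each difficulty level.

    `results` is an iterable of (difficulty, passed) pairs.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for difficulty, passed in results:
        totals[difficulty] += 1
        passes[difficulty] += int(passed)
    # Only levels that actually appear in the results are reported.
    return {level: passes[level] / totals[level] for level in totals}


# Made-up results, used only to show the call.
sample = [
    ("easy", True),
    ("easy", True),
    ("hard", False),
    ("hard", True),
]
rates = pass_rate_by_difficulty(sample)
```

Comparing the rate at Easy against the rate at Hard for the same base task gives a direct read on how much the agent depends on scaffolding.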
