RFC: How should deletions be represented in the trace schema?

The terminology in section 4 defines a contribution as "a unit of code change (addition, modification, or deletion)." But the Range schema can only represent lines that exist in the file. There's no way to record that lines were deleted.

## The problem

A range in the current schema means "these lines exist in the file and this contributor is attributed to them":

```json
{
  "start_line": 20,
  "end_line": 35,
  "contributor": { "type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514" }
}
```

A consumer reads this and understands that lines 20-35 were written by Claude Sonnet. The lines exist, so you can go look at them.

Now consider a deletion. An AI removes lines 20-35 from a 50-line file. The file is now 34 lines. If you record a range with `start_line: 20, end_line: 35`, per section 6.5 those line numbers reference the file at the recorded revision. At that revision the lines still exist, so a consumer can find them. But the consumer would interpret this range the same way as any other range: "this contributor wrote these lines." There's nothing that says "this contributor deleted these lines." A deletion range looks identical to an addition range.

The schema has no concept of what kind of change a range represents. It doesn't distinguish additions from modifications from deletions. For additions and modifications that doesn't really matter, because both just mean "these lines are attributed to this contributor." For deletions, the meaning is inverted: it's not "who wrote this code" but "who removed this code." That's a fundamentally different statement, and the schema can't express it.
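To make the ambiguity concrete, here is a minimal Python sketch of a consumer reading two ranges. The `describe` function is hypothetical and not part of any Agent Trace tooling; the field names are taken from the Range example above. The point is only that a range meant as a deletion is byte-for-byte the same record as one meant as an addition.

```python
# Hypothetical consumer: renders a range the only way the current schema allows.
def describe(range_record: dict) -> str:
    contributor = range_record["contributor"]
    who = contributor.get("model_id", contributor["type"])
    span = f'{range_record["start_line"]}-{range_record["end_line"]}'
    return f"lines {span} attributed to {who}"


ai = {"type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514"}

# Intended meaning: "the AI wrote lines 20-35".
addition = {"start_line": 20, "end_line": 35, "contributor": ai}

# Intended meaning: "the AI deleted lines 20-35".
# The schema has no field to say so, so the record is identical.
deletion = {"start_line": 20, "end_line": 35, "contributor": ai}

# Line ranges are inclusive, so 20-35 covers 16 lines; a 50-line file shrinks to 34.
lines_removed = deletion["end_line"] - deletion["start_line"] + 1
lines_remaining = 50 - lines_removed

print(describe(addition))
print(describe(deletion))
print(addition == deletion)
```

Both `describe` calls print the same sentence, and the equality check is true, so nothing downstream can recover which record was the deletion.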

AI agents delete code all the time: removing dead code, replacing implementations, simplifying logic, cleaning up after refactors. None of this can be captured in the current schema.

## Options

**Option A: Add a `type` field to ranges**

```json
{
  "start_line": 20,
  "end_line": 35,
  "type": "deletion",
  "contributor": { "type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514" }
}
```

A `type` field tells consumers how to interpret the range. For deletions, the line numbers reference the file at the recorded revision (where the lines still existed). Existing ranges without a `type` keep their current meaning: these lines are attributed to this contributor.

This is backward compatible and simple. The trade-off is that consumers need to check `type` before interpreting line numbers.
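As a rough illustration of what that check could look like on the consumer side, here is a Python sketch. It assumes the only new value is `"deletion"` and that a missing `type` means plain attribution; the `interpret` function and the internal `"attribution"` default are hypothetical names chosen for this example, not something the spec defines.

```python
# Sketch of an Option A consumer. Only "deletion" comes from the proposal;
# the "attribution" default label is an assumption made for this example.
def interpret(range_record: dict) -> str:
    # Ranges without a "type" keep their current meaning.
    kind = range_record.get("type", "attribution")
    contributor = range_record["contributor"]
    who = contributor.get("model_id", contributor["type"])
    span = f'{range_record["start_line"]}-{range_record["end_line"]}'

    if kind == "deletion":
        # Line numbers refer to the recorded revision, where the lines still existed.
        return f"{who} deleted lines {span} (as numbered at the recorded revision)"
    if kind == "attribution":
        return f"lines {span} are attributed to {who}"
    raise ValueError(f"unknown range type: {kind!r}")


ai = {"type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514"}
legacy_range = {"start_line": 20, "end_line": 35, "contributor": ai}
deletion_range = {"start_line": 20, "end_line": 35, "type": "deletion", "contributor": ai}

print(interpret(legacy_range))
print(interpret(deletion_range))
```

Raising on an unknown `type` is one possible design choice; a more lenient consumer might instead skip ranges it doesn't understand.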

**Option B: Keep deletions out of the schema**

Maybe Agent Trace is about attributing existing code, and tracking what was removed is a VCS concern. `git log` and `git diff` already capture deletions, so Agent Trace doesn't need to duplicate that.

If that's the case, the terminology should drop "or deletion" from the Contribution definition so the spec doesn't promise something it can't deliver.
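Under Option B, a consumer that still wants deleted ranges would have to derive them by comparing revisions. The Python sketch below uses the standard library's `difflib` as a stand-in for `git diff`; the `deleted_ranges` helper and the synthetic 50-line file are assumptions made for illustration, mirroring the earlier scenario where lines 20-35 are removed.

```python
import difflib


def deleted_ranges(old_lines: list[str], new_lines: list[str]) -> list[tuple[int, int]]:
    """Return 1-indexed, inclusive (start, end) ranges of old_lines that were purely deleted."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    ranges = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" opcodes also drop old lines, but this sketch counts only pure deletions.
        if tag == "delete":
            ranges.append((i1 + 1, i2))  # convert 0-indexed half-open to 1-indexed inclusive
    return ranges


# Synthetic 50-line file; remove lines 20-35 to get the next revision.
old_revision = [f"line {n}" for n in range(1, 51)]
new_revision = old_revision[:19] + old_revision[35:]

print(len(new_revision))
print(deleted_ranges(old_revision, new_revision))
```

This recovers which lines disappeared between two revisions, but on its own it says nothing about which contributor removed them.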

## What's the intended scope?

If Agent Trace answers "who wrote the code that's in this file right now," deletions are probably out of scope. If it answers "what did the AI do to this file," deletions are a significant part of the picture.

