Description
The terminology in section 4 defines a contribution as "a unit of code change (addition, modification, or deletion)." But the Range schema can only represent lines that exist in the file. There's no way to record that lines were deleted.
The problem
A range in the current schema means "these lines exist in the file and this contributor is attributed to them":
{
"start_line": 20,
"end_line": 35,
"contributor": { "type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514" }
}
A consumer reads this and understands: lines 20-35 were written by Claude Sonnet. The lines exist, you can go look at them.
Now consider a deletion. An AI removes lines 20-35 from a 50-line file. The file is now 34 lines. If you record a range with start_line: 20, end_line: 35, per section 6.5 those line numbers reference the file at the recorded revision. At that revision the lines still exist, so a consumer can find them. But the consumer would interpret this range the same way as any other range: "this contributor wrote these lines." There's nothing that says "this contributor deleted these lines." A deletion range looks identical to an addition range.
The schema has no concept of what kind of change a range represents. It doesn't distinguish additions from modifications from deletions. For additions and modifications that doesn't really matter: both just mean "these lines are attributed to this contributor." For deletions the meaning is inverted. It's not "who wrote this code" but "who removed this code." That's a fundamentally different statement and the schema can't express it.
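To make the ambiguity concrete, here is a minimal sketch of a consumer for the current schema. The `describe` function is hypothetical and not part of the spec; it only uses the fields shown above. A range recorded for a deletion has the same shape as one recorded for an addition, so the consumer reports both the same way.

```python
# Hypothetical consumer of the current Range schema (illustrative only,
# not defined by the spec). It reads the only fields a range has today.
def describe(range_):
    contributor = range_["contributor"]
    who = contributor.get("model_id", contributor["type"])
    return f"lines {range_['start_line']}-{range_['end_line']} attributed to {who}"


addition = {
    "start_line": 20,
    "end_line": 35,
    "contributor": {"type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514"},
}

# A deletion of the same lines has no field to set it apart,
# so it would be recorded with exactly the same shape.
deletion = dict(addition)

# Line arithmetic from the example: removing lines 20-35 (inclusive)
# from a 50-line file removes 16 lines and leaves 34.
lines_removed = deletion["end_line"] - deletion["start_line"] + 1
remaining = 50 - lines_removed

print(describe(addition))
print(describe(deletion))  # same text: the consumer cannot tell them apart
```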
AI agents delete code all the time: removing dead code, replacing implementations, simplifying logic, cleaning up after refactors. None of this can be captured in the current schema.
Options
Option A: Add a type field to ranges
{
"start_line": 20,
"end_line": 35,
"type": "deletion",
"contributor": { "type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514" }
}
A type field tells consumers how to interpret the range. For deletions, the line numbers reference the file at the recorded revision (where the lines still existed). Existing ranges without a type keep their current meaning: these lines are attributed to this contributor.
Backward compatible and simple. The trade-off is that consumers need to check the type before interpreting line numbers.
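A rough sketch of what that check could look like on the consumer side. The `interpret` function and the `"attribution"` default label are assumptions made for illustration; only the `"deletion"` value comes from the proposal above. Ranges without a `type` fall through to the existing meaning, which is what keeps the change backward compatible.

```python
# Hypothetical consumer for Option A. The "attribution" default label is an
# assumption for this sketch; the proposal only names the "deletion" value.
def interpret(range_):
    kind = range_.get("type", "attribution")  # missing type keeps the old meaning
    contributor = range_["contributor"]
    who = contributor.get("model_id", contributor["type"])
    span = f"{range_['start_line']}-{range_['end_line']}"
    if kind == "deletion":
        # Line numbers refer to the file at the recorded revision,
        # where the deleted lines still existed.
        return f"{who} deleted lines {span} (numbered at the recorded revision)"
    return f"lines {span} are attributed to {who}"


contributor = {"type": "ai", "model_id": "anthropic/claude-sonnet-4-20250514"}
legacy_range = {"start_line": 20, "end_line": 35, "contributor": contributor}
deletion_range = {**legacy_range, "type": "deletion"}

print(interpret(legacy_range))
print(interpret(deletion_range))
```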
Option B: Keep deletions out of the schema
Maybe Agent Trace is about attributing existing code, and tracking what was removed is a VCS concern. git log and git diff already capture deletions. Agent Trace doesn't need to duplicate that.
If that's the case, the terminology should drop "or deletion" from the Contribution definition so the spec doesn't promise something it can't deliver.
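As a rough illustration of why this could be left to the VCS: given the file contents at two revisions, the deleted line ranges can be recovered by diffing them. The sketch below uses Python's standard `difflib` as a stand-in for `git diff`; the `deleted_ranges` helper is hypothetical, and it only reports pure deletions, not lines that were replaced. It recovers which lines were removed, while who removed them would still come from the commit metadata.

```python
import difflib


def deleted_ranges(before, after):
    """Return 1-indexed, inclusive (start, end) ranges of lines that are
    present in `before` and purely deleted in `after`."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return [
        (i1 + 1, i2)  # opcodes are 0-indexed and end-exclusive
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag == "delete"
    ]


# The example from this issue: a 50-line file with lines 20-35 removed.
before = [f"line {n}" for n in range(1, 51)]
after = before[:19] + before[35:]

print(len(after))                     # 34 lines remain
print(deleted_ranges(before, after))  # [(20, 35)]
```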
What's the intended scope?
If Agent Trace answers "who wrote the code that's in this file right now," deletions are probably out of scope. If it answers "what did the AI do to this file," deletions are a significant part of the picture.