ParseResponse serialization duplicates keys

`ParseResponse` serialization produces both `class` and `class_` fields when the result of the `parse` method is dumped to a file:

`response.model_dump_json(indent=2)`

produces JSON like this:
```json
{
  "chunks": [...], 
  "class_": "full",  # <<<<<<<
  "identifier": "full",
  "markdown": "<string>",
  "pages": [0],
  "class": "full"  # <<<<<<<
}
```

Furthermore, downloading the blob from S3 using the job's `output_url` and then trying to load the model like this:
```python
import json
from io import BytesIO
import httpx

async def download_blob(presigned_url: str) -> BytesIO:
    """Stream the object behind a presigned URL into an in-memory buffer."""
    async with httpx.AsyncClient() as client:
        async with client.stream("GET", presigned_url) as response:
            response.raise_for_status()
            buffer = BytesIO()
            async for chunk in response.aiter_bytes():
                buffer.write(chunk)
            return buffer


# `response` is the job result; top-level `await` assumes an async context (e.g. a notebook).
buf = await download_blob(response.output_url)
parsed = ParseResponse.model_validate_json(buf.getvalue())
```

Fails with:
```
ValidationError: 1 validation error for ParseResponse
splits.0.class
  Field required [type=missing, input_value={'class_': 'full', 'ident...a94-9681-e3c8860228dd']}, input_type=dict]
```
Using `ParseResponse.model_validate_json(buf.getvalue(), by_name=True)` succeeds, but this is not mentioned in the documentation.
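For reference, here is a minimal sketch of how an aliased field behaves in pydantic v2, which may explain why loading by field name helps. The `Split` model below is a hypothetical stand-in, not the SDK's actual class; it only assumes that the SDK declares a `class_` attribute with the alias `class`.

```python
from pydantic import BaseModel, ConfigDict, Field


class Split(BaseModel):
    # Hypothetical stand-in for the SDK model: Python attribute `class_`, JSON key `class`.
    # populate_by_name=True lets validation accept either the alias or the field name.
    model_config = ConfigDict(populate_by_name=True)

    class_: str = Field(alias="class")
    identifier: str


split = Split.model_validate({"class": "full", "identifier": "full"})

by_field_name = split.model_dump_json()          # keys use the Python field names
by_alias = split.model_dump_json(by_alias=True)  # keys use the aliases

# Both forms load back because the model accepts either key.
assert Split.model_validate_json(by_field_name) == split
assert Split.model_validate_json(by_alias) == split
```

In this sketch, dumping with `by_alias=True` writes only `class`, while the default dump writes only `class_`. A payload containing only `class_` is rejected by a model that validates strictly by alias, which matches the error above.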

**Expected behavior:**

- `model_dump_json()` should produce only `class` or only `class_`, not both.
- JSON downloaded from S3 should deserialize correctly.
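Until this is fixed, one possible client-side workaround is to normalize the payload before validating it. The helper below is a hypothetical sketch (the name `normalize_class_keys` is not part of the SDK), and the sample payload is made up for illustration. It walks the decoded JSON and keeps a single `class` key per object.

```python
import json
from typing import Any


def normalize_class_keys(value: Any) -> Any:
    """Recursively rename `class_` to `class`, keeping a single key per object."""
    if isinstance(value, dict):
        result = {}
        for key, item in value.items():
            if key == "class_":
                continue  # handled below so an existing `class` key wins
            result[key] = normalize_class_keys(item)
        if "class_" in value:
            # Only fill in `class` from `class_` when `class` is absent.
            result.setdefault("class", normalize_class_keys(value["class_"]))
        return result
    if isinstance(value, list):
        return [normalize_class_keys(item) for item in value]
    return value


# Illustrative payload shaped like the duplicated output shown above.
raw = '{"splits": [{"class_": "full", "identifier": "full", "pages": [0], "class": "full"}]}'
cleaned = normalize_class_keys(json.loads(raw))
print(json.dumps(cleaned, indent=2))
```

The cleaned dictionary could then be passed to `ParseResponse.model_validate(cleaned)`, assuming the model validates by the `class` alias.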

ParseResponse serialization duplicates keys #64
