
File attachments added with file_data appear as persisted OpenAI Storage files objects for large documents #109

@shahin43

Description

Hi team,

To start with, thanks a lot for providing the ChatKit SDK and UI components. Really appreciate the amazing work you're doing with ChatKit; it's been great to build with.

I'm not sure whether this issue is specific to ChatKit; it may actually be behaviour of the Responses API itself.

We’re following the ChatKit Python attachments guide and converting PDFs stored in our blob store (S3) into model inputs using:

ResponseInputFileParam(type="input_file", file_data="data:application/pdf;base64,...")

When the PDF exceeds a certain size threshold (we've observed this at roughly >2 MB), the document appears as a new File object in the OpenAI Platform Files storage, even though our application code does not explicitly call the Files API (/v1/files) or the Uploads API.

Expected

Passing a PDF as inline base64 via file_data should not result in a persisted OpenAI File object unless we explicitly upload it (e.g., via /v1/files or by completing an Upload).

If the SDK or the Responses API needs to upload large inline payloads under the hood, is this behaviour documented and configurable, and can a default expiry policy be applied to the resulting files?

Actual

For larger PDFs, a new entry consistently appears in OpenAI Files storage after a run.

Implementation reference (per ChatKit guide)

We’re using the approach described here: https://openai.github.io/chatkit-python/guides/accept-rich-user-input/#attachments-let-users-upload-files

if attachment.mime_type == "application/pdf":
    return ResponseInputFileParam(
        type="input_file",
        file_data=as_data_url("application/pdf", content),
        filename=attachment.name or "unknown",
    )
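For reference, the `as_data_url` helper used in the guide's snippet presumably just base64-encodes the raw bytes into a data URL. A minimal stdlib sketch of what such a helper might look like (our assumption, not the guide's exact implementation):

```python
import base64


def as_data_url(mime_type: str, content: bytes) -> str:
    """Encode raw bytes as a data: URL suitable for the `file_data` field."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Example with a tiny placeholder payload (not a real PDF):
url = as_data_url("application/pdf", b"%PDF-1.4 ...")
```

Note that base64 inflates the payload by roughly a third, so a ~2 MB PDF becomes a ~2.7 MB request body, which may be relevant to wherever the size threshold lies.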

From our backend service, I can see that only the Responses API (/v1/responses) is invoked when sending the PDF content. This suggests that for larger payloads the Responses API may be switching to an upload-based flow internally (which would create a File object).

Can you confirm whether this is expected behaviour? If it is, is there a way to avoid persisting these files in OpenAI storage, or to enforce a short default expiry window for any files created as part of this flow?
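In the meantime, a possible workaround on our side would be to sidestep the implicit behaviour entirely: upload the PDF explicitly via the Files API, reference it by `file_id` in the Responses call, and delete it ourselves afterwards so its lifecycle stays under our control. A hedged sketch (the `purpose` value, model name, and cleanup strategy here are our assumptions, not documented guidance; `client` is an `openai.OpenAI` instance):

```python
def run_with_pdf(client, pdf_bytes: bytes, filename: str, prompt: str):
    """Send a PDF to the Responses API via an explicit, self-managed upload.

    `client` is an openai.OpenAI instance (not constructed here so the
    sketch stays self-contained).
    """
    # Upload explicitly so we decide when the File object is deleted.
    uploaded = client.files.create(
        file=(filename, pdf_bytes),
        purpose="user_data",  # assumption: purpose used for model input files
    )
    try:
        return client.responses.create(
            model="gpt-4.1",  # placeholder model name
            input=[{
                "role": "user",
                "content": [
                    {"type": "input_file", "file_id": uploaded.id},
                    {"type": "input_text", "text": prompt},
                ],
            }],
        )
    finally:
        # Clean up immediately rather than relying on any default retention.
        client.files.delete(uploaded.id)
```

This avoids surprise entries in Files storage, but it would be good to know whether the inline `file_data` path is supposed to behave this way in the first place.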
