feat: Implement Query Result Retention Policy for RuntimeDB #123

@shefeek-jinnah

Summary

Add a configurable retention policy that automatically deletes old query results and reclaims storage space.

Background

Query results are persisted as Parquet files in S3 under cache/_runtimedb_internal/runtimedb_results/. There is no cleanup mechanism, so these files accumulate indefinitely and storage grows without bound.

Storage Structure

S3 Bucket
└── cache/_runtimedb_internal/runtimedb_results/
    └── rslt{id}/
        └── data.parquet/
            └── part.1 (actual query output as parquet)
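
To make the layout concrete, here is a rough Python sketch of mapping between a result directory and its S3 keys. The helper names (`result_prefix`, `result_dir_from_key`) are hypothetical and not part of RuntimeDB, and the sketch assumes the directory name (for example `rslt42`) is all that is needed to locate a result.

```python
from typing import Optional

# Prefix taken from the issue text; everything below it is one directory per result.
RESULTS_PREFIX = "cache/_runtimedb_internal/runtimedb_results/"


def result_prefix(result_dir: str) -> str:
    """Return the key prefix that covers every object of one result (hypothetical helper)."""
    return f"{RESULTS_PREFIX}{result_dir}/"


def result_dir_from_key(key: str) -> Optional[str]:
    """Extract the result directory name from an object key, or None if the key is elsewhere."""
    if not key.startswith(RESULTS_PREFIX):
        return None
    head, sep, _rest = key[len(RESULTS_PREFIX):].partition("/")
    # Require a trailing "/" so a stray object directly under the prefix is ignored.
    return head if sep and head else None
```

A helper like the second one could be useful when scanning the bucket for result directories that no longer have a catalog row.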

Requirements

  1. Configuration
    RUNTIMEDB_RESULT_RETENTION_DAYS=7 # Default: 7 days, 0 = disabled

  2. Cleanup Logic

  • Delete results older than the configured retention period
  • Remove each expired result from both:
    • Catalog DB: DELETE FROM results WHERE created_at < NOW() - INTERVAL 'X days' (X = RUNTIMEDB_RESULT_RETENTION_DAYS)
    • S3: Delete the corresponding runtimedb_results/{result_id}/ folders
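
The two requirements above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not a proposed implementation: `retention_days`, `cutoff`, and `cleanup` are hypothetical names, the catalog query and S3 deletion are passed in as callables instead of using a real client, and the sketch assumes the catalog's result id matches the directory name under runtimedb_results/.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Mapping, Optional

DEFAULT_RETENTION_DAYS = 7  # default from the issue; 0 disables cleanup
RESULTS_PREFIX = "cache/_runtimedb_internal/runtimedb_results/"


def retention_days(env: Mapping[str, str] = os.environ) -> int:
    """Read RUNTIMEDB_RESULT_RETENTION_DAYS, falling back to the default when unset."""
    raw = env.get("RUNTIMEDB_RESULT_RETENTION_DAYS", "").strip()
    if not raw:
        return DEFAULT_RETENTION_DAYS
    days = int(raw)  # raises ValueError on non-numeric input
    if days < 0:
        raise ValueError("RUNTIMEDB_RESULT_RETENTION_DAYS must be >= 0")
    return days


def cutoff(now: datetime, days: int) -> Optional[datetime]:
    """Return the oldest created_at to keep, or None when retention is disabled."""
    return None if days == 0 else now - timedelta(days=days)


def cleanup(
    now: datetime,
    days: int,
    find_expired: Callable[[datetime], Iterable[str]],  # assumed: catalog lookup by cutoff
    delete_prefix: Callable[[str], None],               # assumed: removes all S3 keys under a prefix
    delete_rows: Callable[[List[str]], None],           # assumed: deletes catalog rows by id
) -> int:
    """Delete expired results from S3 and the catalog; return how many were removed."""
    limit = cutoff(now, days)
    if limit is None:
        return 0
    expired = list(find_expired(limit))
    for result_id in expired:
        delete_prefix(f"{RESULTS_PREFIX}{result_id}/")
    if expired:
        delete_rows(expired)
    return len(expired)
```

The sketch deletes the S3 objects before the catalog rows. If the run fails partway, the rows are still present and still expired, so a later run can retry them; deleting the rows first could leave files in S3 that nothing tracks. The order is a design choice open for discussion.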
