Conversation

Contributor

@carloea2 commented Dec 18, 2025

What changes were proposed in this PR?

  • DB / schema

    • Add dataset_upload_session to track multipart upload sessions, including:

      • (uid, did, file_path) as the primary key
      • upload_id (UNIQUE), physical_address
      • num_parts_requested to enforce expected part count
    • Add dataset_upload_session_part to track per-part completion for a multipart upload:

      • (upload_id, part_number) as the primary key
      • etag (TEXT NOT NULL DEFAULT '') to persist per-part ETags for finalize
      • CHECK (part_number > 0) for sanity
      • FOREIGN KEY (upload_id) REFERENCES dataset_upload_session(upload_id) ON DELETE CASCADE
  • Backend (DatasetResource)

    • Multipart upload API (server-side streaming to S3, LakeFS manages multipart state):

      • POST /dataset/multipart-upload?type=init

        • Validates permissions and input.
        • Creates a LakeFS multipart upload session.
        • Inserts a DB session row including num_parts_requested.
        • Pre-creates placeholder rows in dataset_upload_session_part for part numbers 1..num_parts_requested with etag = '' (enables deterministic per-part locking and simple completeness checks).
      • Rejects init if a session already exists for (uid, did, file_path) (409 Conflict). Init races are handled via primary-key/duplicate handling plus a best-effort LakeFS abort for the losing initializer.
      • POST /dataset/multipart-upload/part?filePath=...&partNumber=...

        • Requires dataset write access and an existing upload session.
        • Requires Content-Length for streaming uploads.
        • Enforces partNumber <= num_parts_requested.
        • Per-part locking: locks the (upload_id, part_number) row using SELECT … FOR UPDATE NOWAIT to prevent concurrent uploads of the same part (see the locking sketch after this list).
        • Uploads the part to S3 and persists the returned ETag into dataset_upload_session_part.etag (upsert/overwrite for retries).
        • Implements idempotency for retries by returning success if the ETag is already present for that part.
      • POST /dataset/multipart-upload?type=finish

        • Locks the session row using SELECT … FOR UPDATE NOWAIT to prevent concurrent finalize/abort.

        • Validates completeness using DB state:

          • Confirms the part table has num_parts_requested rows for the upload_id.
          • Confirms all parts have non-empty ETags (no missing parts).
          • Optionally surfaces a bounded list of missing part numbers (without relying on error-message asserts in tests).
        • Fetches (part_number, etag) ordered by part_number from the DB and completes the multipart upload via LakeFS (see the completion sketch after this list).

        • Deletes the DB session row; part rows are cleaned up via ON DELETE CASCADE.

        • NOWAIT lock contention is handled (mapped to “already being finalized/aborted”, 409).

      • POST /dataset/multipart-upload?type=abort

        • Locks the session row using SELECT … FOR UPDATE NOWAIT.
        • Aborts the multipart upload via LakeFS and deletes the DB session row (parts cascade-delete).
        • NOWAIT lock contention is handled similarly to finish.
    • Access control and dataset permissions remain enforced on all endpoints.

  • Frontend service (dataset.service.ts)

    • multipartUpload(...) updated to reflect the server flow and return values (ETag persistence is server-side; frontend does not need to track ETags).
  • Frontend component (dataset-detail.component.ts)

    • Uses the same init/part/finish flow.
    • Abort triggers backend type=abort to clean up the upload session.
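
A minimal sketch of the per-part locking pattern described above, using plain JDBC for illustration and assuming a PostgreSQL backend; the actual resource presumably goes through the project's DAO layer, and the method name, JAX-RS exception mapping, and SQLSTATE check here are assumptions (table and column names follow the DDL in this PR):

    import java.sql.{Connection, SQLException}
    import javax.ws.rs.WebApplicationException
    import javax.ws.rs.core.Response

    // Lock the (upload_id, part_number) placeholder row so only one request can
    // stream that part at a time. NOWAIT makes a concurrent attempt fail fast
    // instead of queueing behind the in-flight upload. Assumes `conn` is inside
    // an open transaction; the row lock is held until commit/rollback.
    def lockPartRow(conn: Connection, uploadId: String, partNumber: Int): String = {
      val sql =
        """SELECT etag
          |  FROM dataset_upload_session_part
          | WHERE upload_id = ? AND part_number = ?
          |   FOR UPDATE NOWAIT""".stripMargin
      val stmt = conn.prepareStatement(sql)
      stmt.setString(1, uploadId)
      stmt.setInt(2, partNumber)
      try {
        val rs = stmt.executeQuery()
        if (!rs.next()) {
          // No placeholder row: the part number is outside 1..num_parts_requested.
          throw new WebApplicationException("unknown part number", Response.Status.BAD_REQUEST)
        }
        rs.getString("etag") // empty string means the part has not been uploaded yet
      } catch {
        // PostgreSQL reports NOWAIT lock contention as SQLSTATE 55P03.
        case e: SQLException if e.getSQLState == "55P03" =>
          throw new WebApplicationException("part is already being uploaded", Response.Status.CONFLICT)
      }
    }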

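Similarly, a sketch of the completion step in the finish path: read the persisted (part_number, etag) pairs in part-number order and complete the multipart upload through the S3-compatible API. The AWS SDK v2 types are real; the method name and bucket/key wiring are assumptions, and the actual PR drives this through LakeFS rather than talking to S3 directly:

    import scala.jdk.CollectionConverters._
    import software.amazon.awssdk.services.s3.S3Client
    import software.amazon.awssdk.services.s3.model.{
      CompleteMultipartUploadRequest, CompletedMultipartUpload, CompletedPart
    }

    // Completeness has already been validated against the DB: there are exactly
    // num_parts_requested rows for this upload_id and every etag is non-empty.
    def completeUpload(
        s3: S3Client,
        bucket: String,
        key: String,
        uploadId: String,
        partsInOrder: Seq[(Int, String)] // (part_number, etag), ordered by part_number
    ): Unit = {
      val completedParts = partsInOrder.map { case (partNumber, etag) =>
        CompletedPart.builder().partNumber(partNumber).eTag(etag).build()
      }
      val request = CompleteMultipartUploadRequest
        .builder()
        .bucket(bucket)
        .key(key)
        .uploadId(uploadId)
        .multipartUpload(CompletedMultipartUpload.builder().parts(completedParts.asJava).build())
        .build()
      s3.completeMultipartUpload(request)
      // The caller then deletes the dataset_upload_session row; the part rows
      // disappear via ON DELETE CASCADE.
    }
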
Any related issues, documentation, discussions?

Closes #4110


How was this PR tested?

  • Unit tests added/updated (multipart upload spec):

    • Init validation (invalid numParts, invalid filePath, permission denied).
    • Upload part validation (missing/invalid Content-Length, partNumber bounds, minimum size enforcement for non-final parts).
    • Per-part lock behavior under contention (no concurrent streams for the same part; deterministic assertions).
    • Finish/abort locking behavior (NOWAIT contention returns 409).
    • Successful end-to-end path (init → upload parts → finish) with DB cleanup assertions.
    • Integrity checks: positive and negative SHA-256 tests that download the finalized object and verify it matches (or does not match) the expected concatenated bytes (see the hashing sketch after this list).
  • Manual testing via the dataset detail page (single and multiple uploads), verified:

    • Progress, speed, and ETA updates.
    • Abort behavior (UI state + DB session cleanup).
    • Successful completion path (all expected parts uploaded, LakeFS object present, dataset version creation works).
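A minimal sketch of the SHA-256 check behind those integrity tests; the byte arrays are illustrative stand-ins, and the real spec presumably downloads the finalized object through the dataset API:

    import java.security.MessageDigest

    def sha256Hex(bytes: Array[Byte]): String =
      MessageDigest.getInstance("SHA-256").digest(bytes).map("%02x".format(_)).mkString

    // Stand-ins for the uploaded part payloads and the downloaded object.
    val part1Bytes: Array[Byte] = Array.fill[Byte](5 * 1024 * 1024)(1) // e.g. a 5 MiB non-final part
    val part2Bytes: Array[Byte] = Array.fill[Byte](123)(2)             // small final part
    val downloadedBytes: Array[Byte] = part1Bytes ++ part2Bytes        // would come from the finalized object

    // Positive case: the finalized object hashes to the same value as the
    // concatenated parts; the negative case tampers with a part and expects a mismatch.
    assert(sha256Hex(downloadedBytes) == sha256Hex(part1Bytes ++ part2Bytes))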

Was this PR authored or co-authored using generative AI tooling?

GPT was partially used.

@github-actions bot added labels ddl-change (Changes to the TexeraDB DDL), refactor (Refactor the code), frontend (Changes related to the frontend GUI), service, and common on Dec 18, 2025
@carloea2 marked this pull request as ready for review December 19, 2025 23:10
FOREIGN KEY (did) REFERENCES dataset(did) ON DELETE CASCADE
);

CREATE TABLE IF NOT EXISTS dataset_upload_session
Contributor

Could you also add a separate DDL update file for the new table? It would make it easier to apply the schema change.

Contributor Author

Do you have an example in another PR that I can follow? Thanks!

Contributor

You can take a look at the files under sql/updates.

Contributor Author

Thanks, got it.

Contributor Author

Done. Thanks

Contributor

@aicam left a comment

LGTM

DeleteObjectRequest.builder().bucket(bucketName).key(objectKey).build()
)
}
def uploadPart(
Contributor

Add a comment and fix the formatting of this function.

Contributor Author

Done

@Path("/multipart-upload/part")
@Consumes(Array(MediaType.APPLICATION_OCTET_STREAM))
def uploadPart(
@QueryParam("ownerEmail") ownerEmail: String,
Contributor

Why do we need ownerEmail here? We already have the user ID from their token, and the email can be fetched from it. Please move these query parameters into the request body as JSON.

Contributor

It's up to you.

Contributor Author

> Why do we need ownerEmail here? We already have the user ID from their token, and the email can be fetched from it. Please move these query parameters into the request body as JSON.

I am unsure whether we can use the email from the user's token directly; maybe @xuang7 can confirm. If so, I will change it as you suggested. I was just thinking of cases where the dataset is shared...

Contributor Author

Also, I would like to keep these as query params and use the request body purely for the stream, since that is a convenient way to handle a streaming application/octet-stream upload. What do you think?

inputStream: InputStream,
contentLength: Option[Long]
): Unit = {
val body: RequestBody = contentLength match {
Contributor

We need streaming here; this just reads all bytes at once.

Contributor Author

Yes, when the user does not specify Content-Length, we read all the bytes. (However, that case is forbidden in the uploadPart endpoint.)

When the user does specify Content-Length, with RequestBody.fromInputStream(inputStream, contentLength /* e.g. 5 GiB */) the SDK does not read and buffer the whole 5 GiB in memory first. For retries (depending on support), the SDK tries to rewind the stream using InputStream.reset() with a read limit of 128 KiB. Roughly, the match looks like the sketch below.
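
A sketch of that match, using the names from the snippet above; the exact code in the PR may differ:

    val body: RequestBody = contentLength match {
      case Some(length) =>
        // Streams directly from the request InputStream; the SDK does not buffer
        // the whole part in memory first.
        RequestBody.fromInputStream(inputStream, length)
      case None =>
        // Fallback only: buffers everything in memory. The uploadPart endpoint
        // rejects requests without Content-Length, so it never hits this branch.
        RequestBody.fromBytes(inputStream.readAllBytes())
    }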

Contributor Author

@aicam, do you agree with this?

Development

Successfully merging this pull request may close these issues.

task(dataset): Redirect multipart upload through File Service
