Skip to content

Conversation

@itslenny
Copy link
Contributor

What kind of change does this PR introduce?

Bug fix

What is the current behavior?

There is a potential race condition in the S3 locker that can cause zombie locks to be created if the lock is unlocked during the renewal operation (Between get and put)

What is the new behavior?

Check that the lock is still locked after the S3 get operation completes and exit if it was already unlocked

Additional context

This issue was causing intermittent test failures which would manifest in the "double unlock" test but was actually caused by the abort test leaving behind zombie objects. Running the S3 locker tests in a loop I was able to reproduce this semi-consistently. Usually it would take only a few test runs, but sometimes it'd go upwards of 50 runs without a failure.

The newly added test expands on the "handle abort signal" test and consistently fails without this fix.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 20110090962

Details

  • 8 of 8 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 75.96%

Totals Coverage Status
Change from base Build 20069848970: 0.03%
Covered Lines: 25433
Relevant Lines: 33205

💛 - Coveralls

}

async renewLock(id: string): Promise<boolean> {
async renewLock(id: string, checkLocked: () => boolean): Promise<boolean> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason why we need to pass a function a not check directly this.isLocked?

Copy link
Contributor Author

@itslenny itslenny Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in renewLock() wich is a part of S3Locker, but isLocked is part of S3Lock. So when S3Lock calls this.locker.renewLock() it passes in a function to allow S3Locker to check the lock state of S3Lock.

Alternatives would be

  • move the lock states to S3Locker with a map<string, boolean> (id -> isLocked) and share it that way
  • pass in the whole S3Lock instance (instead of checkLocked) and expose isLocked from S3Lock (currently private) so the locker can check the lock directly

Copy link
Contributor

@fenos fenos Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fix wouldn't work in a distributed environment, it would only work on a single instance setup.
I think the proper fix here is to use If-Match directive in the PutObject after retrieving the current lock, and only write the file if the etag matches with the current lock. ex:

 // Get current lock to verify ownership
      const response = await this.s3Client.send(
        new GetObjectCommand({
          Bucket: this.bucket,
          Key: lockKey,
        })
      )


// Update the lock with new expiration
      await this.s3Client.send(
        new PutObjectCommand({
          Bucket: this.bucket,
          Key: lockKey,
          IfMatch: response.Etag, // <--- here is the trick
          Body: JSON.stringify(updatedLock),
          ContentType: 'application/json',
          Metadata: {
            lockId: id,
            expiresAt: updatedLock.expiresAt.toString(),
          },
        })
      )

This way it won't create the zombie lock if there isn't a file with the same etag and it should work in a distributed environment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might also want to add a check if (id === currentLock.id) before sending the put

@itslenny itslenny requested a review from fenos December 12, 2025 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants