Open
Labels: bug (Something isn't working)
Description
Problem statement
In a multi-instance HA setup with Redis locks:
- Instance A starts downloading a large NAR file from upstream
- Instance B receives a client request for the same NAR file
- Instance B acquires the Redis download lock and sees the download is in progress
- Instance B cannot stream from Instance A's in-memory download state
- Instance B must wait for Instance A to complete and store the file
- For large files, this wait exceeds client timeouts (HTTP 200 with curl timeout errors)
Extracted from closed #618:
We're seeing this as well, and not even with heavy loads, sometimes just with a single build. It does not appear to be resolved in v0.7.3. I've dropped our ncps deployment back to 1 instance to see if that helps. FYI, we're using:
- PostgreSQL
- NVMe-backed S3 storage via Ceph RGW
- Redis locking via DragonflyDB