Skip to content

Conversation

@shanthanu47
Copy link

@shanthanu47 shanthanu47 commented Dec 12, 2025

Description

This PR addresses a Time-of-Check to Time-of-Use (TOCTOU) race condition in RepoManager.

The Issue:
Concurrent parsing requests for the same repository could overwrite the metadata.json file or result in corrupted JSON, leaving the repository in a broken state.

The Fix:
I added portalocker to the dependencies and implemented a cross-platform file locking mechanism in RepoManager.register_repo. This ensures that writes to the metadata file are serialized.

Test Plan:
I verified the fix using a custom script (reproduce_race.py) that simulated concurrent registration attempts. The script confirmed that the lock prevented data corruption.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed concurrent repository metadata writes to prevent conflicts and data corruption.
    • Improved agent tool error handling to correctly support async/sync tools and return clearer error messages.
  • Chores

    • Added a dependency to support the new concurrency safeguards and improve stability under parallel workloads.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings December 12, 2025 20:42
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 12, 2025

Warning

Rate limit exceeded

@shanthanu47 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 21 minutes and 24 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 4905490 and 1697d65.

📒 Files selected for processing (2)
  • app/modules/intelligence/agents/chat_agents/pydantic_agent.py (1 hunks)
  • app/modules/repo_manager/repo_manager.py (2 hunks)

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Added file-locking around repository metadata writes using portalocker with temp-file atomic replace, added portalocker to dependencies, and updated exception-wrapping in the Pydantic chat agent to produce distinct async/sync wrappers that log full stack traces and return descriptive error strings.

Changes

Cohort / File(s) Summary
Repo metadata write synchronization
app/modules/repo_manager/repo_manager.py
Imported portalocker; ensure metadata directory exists; create a .lock in the metadata dir and acquire portalocker.Lock before writing; write to a temporary file and atomically replace the original while holding the lock; moved the write path inside the locked critical section.
Dependency addition
requirements.txt
Added portalocker==3.1.0.
Agent error-handling behaviour
app/modules/intelligence/agents/chat_agents/pydantic_agent.py
Reworked handle_exception(tool_func) to detect coroutine vs sync functions and return an async_wrapper or sync_wrapper respectively; each wrapper logs full stack traces with logger.exception (mode-specific messages) and returns a descriptive error string including the exception text instead of a generic message.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Inspect repo_manager.py for correct lock file path, creation permissions, and that directory existence is ensured before locking.
  • Verify the temp-file write + atomic replace is safe on target filesystems and that exceptions release locks correctly.
  • Check that lock scope is minimal and does not block unrelated operations.
  • Review pydantic_agent.py wrappers for correct async detection, exception logging, and no behavioral regression in sync/async tool invocation.

Poem

🐇 I nudged a lock into the ground,
Quiet paws in circle round.
Temp-file snug, then swapped with care,
No more races in the air.
Logs hum soft — metadata safe and sound. 🎋

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(repo-manager): add file locking to prevent race conditions' accurately describes the main change—adding file locking with portalocker to the repo_manager module to fix a TOCTOU race condition, which is the primary focus of the PR.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/modules/repo_manager/repo_manager.py (1)

8-13: Understand that portalocker.Lock uses advisory locks on Linux/Unix (cooperative) and mandatory locks on Windows.

portalocker.Lock(lock_file_path, timeout=10) uses:

  • Windows: OS-enforced mandatory file lock; exclusive behavior is guaranteed.
  • Linux/Unix: POSIX advisory lock (fcntl); only prevents cooperating processes that also use portalocker. Non-cooperating writers bypass the lock.

The timeout=10 (in seconds, with 0.25s check interval) is reasonable for most workloads; portalocker retries until timeout or success, then raises an error if the lock cannot be acquired.

If your metadata writers (register_repo, update_last_accessed_write_metadata_entry) do not all use portalocker, concurrent writes to the same *.json.tmp file remain possible on Linux, risking JSON corruption. Ensure all writers use the same locking mechanism.

Also fix the unused locked_file variable at line 326.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c725a6f and 77f1d04.

📒 Files selected for processing (2)
  • app/modules/repo_manager/repo_manager.py (2 hunks)
  • requirements.txt (1 hunks)
🧰 Additional context used
🪛 GitHub Check: SonarCloud Code Analysis
app/modules/repo_manager/repo_manager.py

[warning] 326-326: Remove the unused local variable "locked_file".

See more on https://sonarcloud.io/project/issues?id=getmomentum_momentum-server&issues=AZsUTapgAuPCjZm6vXoE&open=AZsUTapgAuPCjZm6vXoE&pullRequest=528

🪛 Ruff (0.14.8)
app/modules/repo_manager/repo_manager.py

326-326: Local variable locked_file is assigned to but never used

Remove assignment to unused variable locked_file

(F841)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Agent
🔇 Additional comments (1)
requirements.txt (1)

175-177: No actionable concerns found for portalocker==3.1.0.

The version is stable (supports Python 3.9+), has no direct security advisories, and is released under the permissive BSD-3-Clause license. The transitive CVE-2021-32559 (in the pywin32 dependency) does not affect portalocker itself.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a Time-of-Check to Time-of-Use (TOCTOU) race condition in the RepoManager by implementing file locking for metadata writes during repository registration.

  • Adds portalocker==3.1.0 dependency for cross-platform file locking
  • Implements file locking in register_repo method to serialize concurrent metadata writes
  • Creates lock files at the repository directory level to prevent data corruption

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 4 comments.

File Description
requirements.txt Adds portalocker 3.1.0 dependency for file locking functionality
app/modules/repo_manager/repo_manager.py Implements file locking in register_repo method to prevent concurrent write conflicts

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# Use portalocker for cross-platform file locking
# Prevent concurrent registrations for the same repo key
# if another process is currently registering it.
with portalocker.Lock(lock_file_path, timeout=10) as locked_file:
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unused variable 'locked_file' should be replaced with an underscore to indicate it's intentionally unused, following Python conventions. The portalocker.Lock context manager returns a file handle that is not being used in this implementation.

Suggested change
with portalocker.Lock(lock_file_path, timeout=10) as locked_file:
with portalocker.Lock(lock_file_path, timeout=10) as _:

Copilot uses AI. Check for mistakes.
).parent
os.makedirs(repo_dir, exist_ok=True)

lock_file_path = repo_dir / ".lock"
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock_file_path construction mixes PathLib and string concatenation incorrectly. The variable 'repo_dir' is a Path object, but the division operator with a string creates a new Path. However, the code should ensure consistent Path usage. More critically, the lock file is placed at the directory level (one lock per repo_name), which means different branches/commits of the same repository will share the same lock file. This could cause unnecessary serialization of independent operations on different branches or commits.

Copilot uses AI. Check for mistakes.
Comment on lines 326 to 327
with portalocker.Lock(lock_file_path, timeout=10) as locked_file:
self._write_metadata_entry(repo_name, branch, commit_id, data)
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The locking mechanism only protects the register_repo method, but update_last_accessed (line 352-371) also performs a read-modify-write operation on the metadata file without locking. This creates a race condition where concurrent calls to update_last_accessed or between register_repo and update_last_accessed could still result in data corruption or lost updates. Consider applying the same locking mechanism to update_last_accessed and any other methods that modify metadata.

Copilot uses AI. Check for mistakes.
Comment on lines 315 to 319
# Ensure the repository parent directory exists
repo_dir = self._metadata_path(
repo_name=repo_name, branch=branch, commit_id=commit_id
).parent
os.makedirs(repo_dir, exist_ok=True)
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory creation here is redundant. The _write_metadata_entry method already creates parent directories at line 151 with 'path.parent.mkdir(parents=True, exist_ok=True)'. This duplication adds unnecessary complexity and could be removed.

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant