Skip to content

Conversation

@marceloneppel
Copy link
Member

@marceloneppel marceloneppel commented Dec 11, 2025

Issue

The charm fails with "directory already in use as a tablespace" error when running on persistent storage for the temp storage after a reboot in the host machine.

Solution

The issue occurred because the method assumed tmpfs storage and would rename the existing 'temp' tablespace and try to create a new one at the same location. With persistent storage, the directory remains in use by the renamed tablespace, causing PostgreSQL to reject the CREATE TABLESPACE command.

Changes:

  • Add is_tmpfs() utility function to detect filesystem type via /proc/mounts
  • Modify set_up_database() to differentiate between tmpfs and persistent storage:
    • tmpfs: Preserve existing behavior (rename + recreate for empty directory)
    • persistent: Only fix permissions, keep existing tablespace
  • Extract tablespace handling into _handle_temp_tablespace_on_reboot() helper method to reduce cyclomatic complexity
  • Add comprehensive test coverage (7 new tests for is_tmpfs, 1 for persistent storage handling)
  • Update existing tests to mock filesystem detection This ensures the library works correctly with both tmpfs and persistent storage configurations for the temp tablespace.

Bumps version to 16.1.5.

Checklist

  • I have added or updated any relevant documentation.
  • I have cleaned any remaining cloud resources from my accounts.

…Use errors

Fix a bug where set_up_database() fails with "directory already in use as a
tablespace" error when running on persistent storage after a reboot or failover.
The issue occurred because the method assumed tmpfs storage and would rename the
existing 'temp' tablespace and try to create a new one at the same location. With
persistent storage, the directory remains in use by the renamed tablespace, causing
PostgreSQL to reject the CREATE TABLESPACE command.
Changes:
- Add is_tmpfs() utility function to detect filesystem type via /proc/mounts
- Modify set_up_database() to differentiate between tmpfs and persistent storage:
  * tmpfs: Preserve existing behavior (rename + recreate for empty directory)
  * persistent: Only fix permissions, keep existing tablespace
- Extract tablespace handling into _handle_temp_tablespace_on_reboot() helper
  method to reduce cyclomatic complexity
- Add comprehensive test coverage (7 new tests for is_tmpfs, 1 for persistent
  storage handling)
- Update existing tests to mock filesystem detection
This ensures the library works correctly with both tmpfs and persistent storage
configurations for the temp tablespace.
Fixes cluster failures after reboot with persistent storage.
Bumps version to 16.1.5.

Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
@codecov
Copy link

codecov bot commented Dec 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 50.89%. Comparing base (5559a77) to head (5c8e58c).
⚠️ Report is 1 commits behind head on 16/edge.

❌ Your project status has failed because the head coverage (50.89%) is below the target coverage (70.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           16/edge      #54      +/-   ##
===========================================
+ Coverage    49.42%   50.89%   +1.47%     
===========================================
  Files            4        4              
  Lines          864      890      +26     
  Branches       101      105       +4     
===========================================
+ Hits           427      453      +26     
  Misses         417      417              
  Partials        20       20              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@marceloneppel marceloneppel changed the title fix: handle persistent storage in set_up_database to prevent ObjectInUse errors [DPE-8747] Handle persistent storage in set_up_database to prevent ObjectInUse errors Dec 12, 2025
@marceloneppel marceloneppel marked this pull request as ready for review December 12, 2025 19:29
Copy link
Contributor

@onkolahmet onkolahmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just added some comments for clarity.

Comment on lines 1141 to 1145
logger.info(
"Fixed permissions on temp tablespace directory (persistent storage), "
"existing tablespace remains valid"
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small thought, on persistent storage, we're assuming the existing tablespace is still valid. Should we add a sanity check that it's still at temp_location? Or is this guaranteed by design? In other words, could we include the actual temp_location path in the log message? Would make debugging easier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Added the temp_location to the log message on 32416d3.

Regarding "is this guaranteed by design": Yes, the tablespace location is guaranteed to be at temp_location by design. PostgreSQL maintains this in its catalog, and the charm always uses the same path. The edge case where the tablespace doesn't exist yet (reboot during initialization) is already handled by the early return at line 1121.

Comment on lines +1127 to +1128
new_name = f"temp_{datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S')}"
cursor.execute(f"ALTER TABLESPACE temp RENAME TO {new_name};")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we handle if two renames somehow happen in the same second? Probably super rare, but maybe worth documenting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Added documentation explaining why timestamp collision cannot occur: be8f00e.

Two renames cannot happen in the same second because consuming charms ensure leader-only execution (see postgresql-operator example), and PostgreSQL serializes operations through exclusive locks on the tablespace object.

Add temp_location parameter to the log message in _handle_temp_tablespace_on_reboot()
when handling persistent storage scenarios. This provides better observability and
makes debugging easier by showing operators the exact directory path where permissions
were fixed.

Addresses feedback from PR review comment:
#54 (comment)

Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
Add comment documenting why timestamp collision cannot occur during temp
tablespace rename operations on tmpfs storage.

The comment clarifies that consuming charms ensure leader-only execution and
PostgreSQL serializes operations through exclusive locks, making second-precision
timestamps sufficient for uniqueness.

Addresses feedback from PR review comment:
#54 (comment)

Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
…tablespace-management

Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
@marceloneppel marceloneppel merged commit f6cb835 into 16/edge Dec 16, 2025
6 of 7 checks passed
@marceloneppel marceloneppel deleted the dpe-8747-fix-temp-tablespace-management branch December 16, 2025 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants