pg_rewind during failover might have permission issues

I ran AI to perform a test:
```
Summary

   I completed the multi-region Azure deployment workflow:

   ┌──────────────────────────────────────┬────────┬──────────────────────────────────────────────────────────────────────────────────┐
   │ Step                                 │ Status │ Notes                                                                            │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 1. Deploy AKS Fleet                  │ ✅     │ Created 3 AKS clusters (eastus2, uksouth, westus3) with KubeFleet                │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 2. Install cert-manager              │ ✅     │ Installed on all member clusters                                                 │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 3. Install DocumentDB operator       │ ✅     │ Helm chart deployed to all 3 clusters                                            │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 4. Deploy multi-region DocumentDB    │ ✅     │ 3-node cluster with HA deployed (experienced failover during demo)               │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 5. Test connection                   │ ✅     │ Connected successfully, but writes timed out due to cluster failover             │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 6. Clean up demo resources           │ ✅     │ Deleted namespaces on all clusters                                               │
   ├──────────────────────────────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
   │ 7. Clean up operators/infrastructure │ ✅     │ Uninstalled operators, cert-manager, CRDs, and initiated resource group deletion │
   └──────────────────────────────────────┴────────┴──────────────────────────────────────────────────────────────────────────────────┘

   Key observations:

     - The multi-region deployment worked, creating DocumentDB clusters across 3 Azure regions
     - A pod restart triggered a CNPG failover, causing temporary write unavailability (expected HA behavior)
     - The pg_rewind operation failed due to a permission issue with documentdb_core schema - this may be a known issue
     - All Azure resources are being deleted (resource group deletion is async)
```

When chatting with @alaye-ms  about the result he suggested:
_Sometimes it prints that error before the grant runs
It probably latched on to that since once the GRANT does run, it doesn't print anything
At least that would be my guess, could be a real issue_
 
We need to:
* make the output nicer (this error could confuse a real user, too, - not just AI)
* validate it's not a real bug (there was another fialover which seemed to have worked so the misleading printout theory is strong)
* if it's a real ssue, fix it

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pg_rewind during failover might have permission issues #223

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

pg_rewind during failover might have permission issues #223

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions