Durable Object reset loop prevents gateway startup after rapid deployments #238

## Issue
When deploying moltworker to Cloudflare Workers, rapid deployments (multiple deployments within 5-10 minutes) cause a Durable Object reset loop that prevents the OpenClaw gateway from starting.

## Error Messages
```
Failed to start process: Error: Durable Object reset because its code was updated.
[PROXY] Failed to start Moltbot: Error: Durable Object reset because its code was updated.
```

## Environment
- **Platform**: Cloudflare Workers with Durable Objects + Container bindings
- **OpenClaw Version**: 2026.2.3-1
- **Moltworker**: Based on cloudflare/moltworker architecture
- **Container**: Docker with `openclaw gateway` running in Cloudflare Sandbox

## Steps to Reproduce
1. Deploy moltworker to Cloudflare
2. Wait for gateway to start successfully
3. Deploy again within 5 minutes (e.g., bug fix or feature change)
4. Deploy a third time within another 5 minutes
5. Observe: Gateway fails to start with "Durable Object reset" errors in a loop

## Expected Behavior
Gateway should recover gracefully. Instead, startup is interrupted by another DO reset:
- Gateway never becomes ready on port 18789
- Process times out after 90 seconds
- Only resolves after waiting 5-10+ minutes without any deployments

## Impact
- Production downtime during multiple deployments
- Cannot do rapid iteration/bug fixes in production
- Data is safe (R2 backup/restore works correctly), but service is unavailable during reset loop

## Workaround
Wait 5-10 minutes between deployments to allow the Durable Object to fully stabilize before deploying again.
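
Until the reset loop is handled properly, the wait could be enforced by a deploy script rather than by memory. A minimal sketch, assuming the script can look up the timestamp of the last deployment somewhere (where it comes from is left open). `cooldownRemainingMs` and `canDeploy` are hypothetical helpers, not part of moltworker or wrangler:

```typescript
// Hypothetical deploy guard: not part of moltworker or wrangler.
// Uses the upper end of the 5-10 minute window from the workaround.
const DEFAULT_COOLDOWN_MS = 10 * 60 * 1000;

// Milliseconds still to wait before the next deployment (0 when it is safe).
function cooldownRemainingMs(
  lastDeployMs: number,
  nowMs: number,
  cooldownMs: number = DEFAULT_COOLDOWN_MS,
): number {
  const elapsed = nowMs - lastDeployMs;
  return Math.max(0, cooldownMs - elapsed);
}

// True once the full cooldown has elapsed since the last deployment.
function canDeploy(
  lastDeployMs: number,
  nowMs: number,
  cooldownMs: number = DEFAULT_COOLDOWN_MS,
): boolean {
  return cooldownRemainingMs(lastDeployMs, nowMs, cooldownMs) === 0;
}
```

A deploy script would call `canDeploy(lastDeployMs, Date.now())` and refuse (or sleep for `cooldownRemainingMs(...)`) when it returns `false`.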

## Proposed Solutions
1. **Better error handling**: Detect DO reset scenarios and retry with exponential backoff
2. **Startup s…**: … in progress
3. … guide (batch changes, avoid rapid deploys)
4. **Graceful degradation**: Return a "deployment in progress" status instead of timing out
5. **Gradual rollouts**: Consider using Workers `deploy_config.version_id` for canary deployments
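
To make the retry and graceful-degradation ideas above concrete, here is a rough sketch. It assumes a reset can be recognized by the message text quoted under Error Messages, and that the gateway start is an async function that can be called again. All names (`isDurableObjectReset`, `retryOnReset`, `statusForStartupError`) and the 30-second retry hint are hypothetical, not existing moltworker or Cloudflare APIs:

```typescript
// Assumption: a reset is recognized by the message text quoted in this issue.
const RESET_MARKER = "Durable Object reset because its code was updated";

function isDurableObjectReset(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return message.includes(RESET_MARKER);
}

// Delay before the next attempt: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  // Injectable so the loop can be exercised without real waiting.
  sleep?: (ms: number) => Promise<void>;
}

// Retries `start` only when it fails with a DO reset; other errors propagate.
async function retryOnReset<T>(start: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; ; attempt++) {
    try {
      return await start();
    } catch (err) {
      if (!isDurableObjectReset(err) || attempt >= opts.maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
    }
  }
}

// Graceful degradation: describe the failure to the client instead of timing out.
interface GatewayStatus {
  status: number;
  retryAfterSeconds?: number;
  body: string;
}

function statusForStartupError(err: unknown): GatewayStatus {
  if (isDurableObjectReset(err)) {
    // The 30-second hint is an arbitrary placeholder value.
    return { status: 503, retryAfterSeconds: 30, body: "deployment in progress" };
  }
  return { status: 500, body: "gateway failed to start" };
}
```

The proxy could wrap its start call in `retryOnReset(...)` and, if that still fails, map the error through `statusForStartupError` to build the HTTP response. Matching on the message string is brittle, so a more reliable signal from the platform would be preferable if one exists.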

## Additional Context
- Using R2 for persistent storage (config, skills, conversations)
- R2 restore completes successfully before the reset occurs
- The issue is purely with the Durable Object lifecycle during rapid code updates
- This appears to be a Cloudflare platform limitation, but better handling would improve the deployment experience

## Related
This might be related to how Durable Objects handle `alarm()` during code updates - our KeepAlive DO pings the Sandbox DO every 30 seconds, which may interact poorly with deployments.
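
If the keep-alive ping is part of the problem, one thing to try is making the alarm handler tolerant of resets: treat a reset error as a signal to pause, and schedule the next ping further out instead of at the usual 30 seconds. This is only a sketch of that idea. `AlarmScheduler` stands in for the `setAlarm` method on Durable Object storage, and `keepAliveTick`, `RESET_PAUSE_MS`, and the 2-minute pause are assumptions, not values taken from our code or from Cloudflare:

```typescript
// Minimal stand-in for the part of Durable Object storage used here.
interface AlarmScheduler {
  setAlarm(scheduledTimeMs: number): Promise<void>;
}

const PING_INTERVAL_MS = 30_000; // from this issue: ping every 30 seconds
const RESET_PAUSE_MS = 120_000; // assumption: longer pause while a deploy settles

// One alarm tick: ping the Sandbox DO, then schedule the next tick.
// A reset error is swallowed and delays the next ping; other errors are rethrown.
async function keepAliveTick(
  ping: () => Promise<void>,
  scheduler: AlarmScheduler,
  nowMs: number,
): Promise<"ok" | "reset"> {
  let outcome: "ok" | "reset" = "ok";
  let nextDelayMs = PING_INTERVAL_MS;
  try {
    await ping();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Assumption: resets are recognized by the message text quoted in this issue.
    if (!message.includes("Durable Object reset")) throw err;
    outcome = "reset";
    nextDelayMs = RESET_PAUSE_MS;
  }
  await scheduler.setAlarm(nowMs + nextDelayMs);
  return outcome;
}
```

Inside the KeepAlive DO, `alarm()` would call `keepAliveTick(() => pingSandbox(), this.ctx.storage, Date.now())`, where `pingSandbox` is whatever the current ping does. Whether this actually shortens the reset loop is untested.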
