Commit 8c176eb

cp
1 parent 38a8465 commit 8c176eb

92 files changed

+12465
-4795
lines changed

packages/skydeck/CLAUDE.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
# SkyDeck Development Guidelines

## SkyPilot Integration

**IMPORTANT**: Always use the SkyPilot API for accessing job and cluster information. Never query the SkyPilot SQLite databases directly (e.g., `~/.sky/jobs.db`, `~/.sky/state.db`).

### API Access

- API endpoint: `https://skypilot-api.softmax-research.net`
- Authentication: OAuth2 cookies stored in `~/.sky/cookies.txt`
- Configuration: `~/.sky/config.yaml`

### Why Use the API

1. **Centralized data**: The API provides access to managed jobs across all users and clusters
2. **Up-to-date information**: The API reflects the current state of the jobs controller
3. **Proper abstractions**: The API provides structured data with proper types
4. **Security**: Direct database access bypasses authentication and auditing
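
As a minimal sketch of cookie-based access from Python, the snippet below loads the cookie file into a cookie jar and builds an opener that attaches those cookies to requests. It assumes `~/.sky/cookies.txt` is in the Netscape cookie-file format. The helper names `load_cookie_jar` and `build_api_opener` are hypothetical, and no SkyPilot endpoint paths are assumed.

```python
from http.cookiejar import MozillaCookieJar
from pathlib import Path
from urllib.request import HTTPCookieProcessor, OpenerDirector, build_opener

# Values taken from the "API Access" list above.
API_ENDPOINT = "https://skypilot-api.softmax-research.net"
DEFAULT_COOKIE_FILE = Path.home() / ".sky" / "cookies.txt"


def load_cookie_jar(cookie_file: Path = DEFAULT_COOKIE_FILE) -> MozillaCookieJar:
    """Load cookies from a Netscape-format file (the format is an assumption)."""
    jar = MozillaCookieJar(str(cookie_file))
    # Load session cookies and expired entries too, so nothing in the file is dropped.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def build_api_opener(jar: MozillaCookieJar) -> OpenerDirector:
    """Return a urllib opener that sends the jar's cookies with each request."""
    return build_opener(HTTPCookieProcessor(jar))
```

With this in place, `build_api_opener(load_cookie_jar()).open(API_ENDPOINT)` would issue a request carrying the stored cookies. Which paths to call depends on the SkyPilot API server and is deliberately left out.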

## Data Model

### Experiment Groups
- **Created by**: User (via UI)
- **Purpose**: Organize experiments into logical groups
- **Contains**: Ordered list of experiments (many-to-many relationship)
- **Fields**: id, name, flags (columns to display), order, collapsed

### Experiments
- **Created by**: User (via "Create" button or "Duplicate")
- **Purpose**: Configuration template that defines what to run
- **Key fields**:
  - `id`: Auto-increment integer (internal)
  - `name`: Unique string identifier (user-facing, used for matching jobs/checkpoints)
  - `desired_state`: RUNNING or STOPPED
  - `current_state`: Reflects latest job status
  - `flags`: Configuration key-value pairs
  - `nodes`, `gpus`: Resource requirements

### Jobs
- **Created by**: Synced from SkyPilot API (poller)
- **Purpose**: Track actual job executions
- **Matching**: Jobs display under experiments where `job.experiment_id == experiment.name`
- **Key fields**: id, experiment_id (matches experiment.name), status, command, git_ref, nodes, gpus

### Checkpoints
- **Created by**: Synced from S3 (syncer)
- **Purpose**: Track model checkpoints and replay files
- **S3 path**: `s3://softmax-public/policies/{experiment.name}/`
- **Key fields**: experiment_id (references experiment.id), epoch, model_path, replay_paths, policy_version

### Key Relationships
- **Jobs** match to experiments by **name**: `job.experiment_id == experiment.name`
- **Checkpoints** are stored by **id**: `checkpoint.experiment_id == experiment.id`
- **S3 paths** use experiment **name** (e.g., `s3://.../{experiment.name}/`)
- If no experiment exists with matching name, jobs appear as "orphaned"
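
The name-versus-id split is easy to get wrong, so here is a small illustrative sketch of the matching rules. The dataclasses carry only the fields needed for the example, their types are assumptions, and `group_jobs` and `checkpoint_prefix` are hypothetical helpers rather than functions from the SkyDeck codebase.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Experiment:
    id: int    # auto-increment, internal
    name: str  # unique, user-facing


@dataclass
class Job:
    id: int             # type assumed for illustration
    experiment_id: str  # holds an experiment *name*, not Experiment.id
    status: str


def group_jobs(
    experiments: Iterable[Experiment], jobs: Iterable[Job]
) -> Tuple[Dict[str, List[Job]], List[Job]]:
    """Group jobs under experiments by name; unmatched jobs are orphaned."""
    by_name: Dict[str, List[Job]] = {exp.name: [] for exp in experiments}
    orphaned: List[Job] = []
    for job in jobs:
        if job.experiment_id in by_name:
            by_name[job.experiment_id].append(job)
        else:
            orphaned.append(job)
    return by_name, orphaned


def checkpoint_prefix(experiment: Experiment) -> str:
    """S3 prefix for an experiment's checkpoints, keyed by name."""
    return f"s3://softmax-public/policies/{experiment.name}/"
```

Note that checkpoint rows go the other way and reference `experiment.id`, so renaming an experiment would affect job matching and S3 paths but not existing checkpoint rows.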

## Database

**Database Location**: SkyDeck uses SQLite for persistent storage.

- **Default location**: `~/.skydeck/skydeck.db`
- **Configuration**: Can be overridden with `--db-path` flag or `SKYDECK_DB_PATH` environment variable
- **Schema**: Defined in `skydeck/database.py` with automatic migrations on startup
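
The list above does not say which override wins when both are set. The sketch below assumes the common convention that the command-line flag takes precedence over the environment variable, which in turn takes precedence over the default. `resolve_db_path` is a hypothetical helper, not necessarily how SkyDeck resolves the path.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_DB_PATH = Path.home() / ".skydeck" / "skydeck.db"


def resolve_db_path(
    cli_db_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the database path: --db-path, then SKYDECK_DB_PATH, then default.

    The precedence order is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    if cli_db_path:
        return Path(cli_db_path).expanduser()
    env_value = env.get("SKYDECK_DB_PATH")
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_DB_PATH
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.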

### Database Scripts

When working with the database directly:

```bash
# Backfill checkpoint versions (example)
uv run python -c "
import asyncio
from pathlib import Path
from skydeck.backfill_versions import backfill_checkpoint_versions
db_path = str(Path.home() / '.skydeck' / 'skydeck.db')
asyncio.run(backfill_checkpoint_versions(db_path))
"

# Query database directly
sqlite3 ~/.skydeck/skydeck.db "SELECT COUNT(*) FROM experiments;"
```
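
For ad hoc inspection from Python rather than the `sqlite3` CLI, a read-only connection avoids accidental writes to the live database. The following is a sketch using only the standard library. `count_rows` is a hypothetical helper, and only the `experiments` table name comes from the query above.

```python
import sqlite3
from pathlib import Path


def count_rows(db_path: Path, table: str = "experiments") -> int:
    """Count rows in a table using a read-only SQLite connection."""
    # Table names cannot be bound as SQL parameters, so accept identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    # mode=ro makes SQLite refuse any write through this connection.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        return count
    finally:
        conn.close()
```

Calling `count_rows(Path.home() / ".skydeck" / "skydeck.db")` would mirror the `SELECT COUNT(*)` command above.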

### Code Style

- Always use `uv` for pip and python operations
- Imports should go at the top of the file if possible
- Follow existing patterns in the codebase for consistency
- **NEVER add fallbacks** - fix the underlying problem instead
- Backend changes take effect only after the skydeck server restarts, so restart it after every backend change (see Development Workflow)

## Development Workflow

After making backend changes (Python), restart the server to pick up changes:
```bash
# Stop whatever is listening on port 8000 (no error if nothing is)
lsof -ti:8000 | xargs kill -9 2>/dev/null || true
sleep 2
# Start skydeck in the background, logging to /tmp/skydeck.log
nohup uv run skydeck --port 8000 > /tmp/skydeck.log 2>&1 &
sleep 3
curl -s http://localhost:8000/api/health | head -c 100 # Verify it's running
```
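
The fixed `sleep 3` can be too short on a slow start. As an alternative sketch, the helper below polls the health URL until it answers or a deadline passes. It assumes that any 2xx response from `/api/health` means the server is up. `http_probe` and `wait_until_healthy` are hypothetical names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/api/health"  # from the curl command above


def http_probe(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: not healthy yet.
        return False


def wait_until_healthy(
    probe: Callable[[], bool] = http_probe,
    deadline_s: float = 30.0,
    interval_s: float = 0.5,
) -> bool:
    """Call probe repeatedly until it succeeds or deadline_s elapses."""
    end = time.monotonic() + deadline_s
    while True:
        if probe():
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)
```

Taking the probe as a parameter keeps the retry loop independent of HTTP, so it can be checked with a stub instead of a running server.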

After making frontend changes (TypeScript/React):
```bash
cd packages/skydeck/frontend
npm run build # Builds to ../skydeck/static/
```

**Important**: Always restart the backend yourself to test changes. Do not ask the user to restart.

projects/skydeck/IMPLEMENTATION_SUMMARY.md renamed to packages/skydeck/IMPLEMENTATION_SUMMARY.md

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@ Complete web-based dashboard for managing SkyPilot experiments with declarative
 ## Project Structure
 
 ```
-projects/skydeck/
+packages/skydeck/
 ├── skydeck/ # Main package
 │ ├── __init__.py # Package initialization
 │ ├── __main__.py # Entry point for python -m skydeck
@@ -226,7 +226,7 @@ Creates a 2×4 grid over nodes × layers (8 experiments total).
 
 ### Installation
 ```bash
-cd projects/skydeck
+cd packages/skydeck
 uv pip install -e .
 ```

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ Click any row to expand it. The right side shows all past job runs with:
 See `examples/grid_search.py` for creating multiple experiments at once:
 
 ```bash
-uv run python projects/skydeck/examples/grid_search.py
+uv run python packages/skydeck/examples/grid_search.py
 ```
 
 This creates a 2×4 grid over nodes × layers and starts them all!
Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Web-based dashboard and controller for managing SkyPilot experiments with declar
 
 ```bash
 # Install dependencies
-cd projects/skydeck
+cd packages/skydeck
 uv pip install -e .
 
 # Run the dashboard
File renamed without changes.
File renamed without changes.
