110 changes: 110 additions & 0 deletions docs/remote_simulation_workflow.md
@@ -0,0 +1,110 @@
# Remote Simulation & Training Workflow

This guide outlines how to develop, train, and test AlohaMini by running the heavy simulation on a remote cloud GPU (e.g., Lambda Labs, AWS) while controlling everything from your local laptop (e.g., a MacBook).

## Architecture

* **Cloud Server (The "Lab")**: Runs NVIDIA Isaac Sim. Handles physics, rendering, and training.
* **Local Machine (The "Mission Control")**: Runs the Dashboard and Teleoperation scripts. Connects to the cloud via SSH.
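
All traffic flows over two ZMQ sockets tunneled through SSH; the ports below match the simulation script used throughout this guide:

```
Local laptop (dashboard, teleop)               Cloud GPU (Isaac Sim)
  localhost:5555 ──── commands ───► SSH tunnel ───►  :5555 (Cmd)
  localhost:5556 ◄─ observations ── SSH tunnel ◄───  :5556 (Obs)
```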

## Prerequisites

1. **Cloud Instance**: A server with an NVIDIA GPU suitable for Isaac Sim (e.g., an A10, A100, or RTX 3090/4090).
* Recommended: Lambda Labs or Brev.dev (Ubuntu 20.04/22.04).
* Must have **NVIDIA Drivers** and **Isaac Sim** installed (or use the Isaac Sim Docker container).
2. **Local Machine**: Your laptop (Mac/Windows/Linux).
3. **SSH Access**: You must be able to SSH into the cloud instance.

## Setup

### 1. Cloud Server Setup
1. SSH into your cloud instance.
2. Clone this repository:
```bash
git clone https://github.com/blankey1337/AlohaMini.git
cd AlohaMini
```
3. Ensure you are using the Python environment that has access to Isaac Sim (often via `./python.sh` in the Isaac Sim installation folder).

### 2. Local Machine Setup
1. Clone this repository locally.
2. Install dependencies:
```bash
pip install -r software/requirements.txt
```

## The Workflow

### Phase 1: Data Collection

1. **Start the Simulation (Cloud)**
Run the simulation environment script. It listens for commands on port 5555 and publishes observations on port 5556.
```bash
# On Cloud
isaac_sim_python software/examples/alohamini/isaac_sim/isaac_alohamini_env.py
```

2. **Establish Connection (Local)**
Forward the ZMQ ports from the cloud to your localhost.
```bash
# On Local Mac
ssh -L 5555:localhost:5555 -L 5556:localhost:5556 ubuntu@<CLOUD_IP>
```

3. **Launch Dashboard (Local)**
Start the web dashboard to see what the robot sees.
```bash
# On Local Mac
python software/dashboard/app.py
```
Open `http://localhost:5001` in your browser.

4. **Teleoperate & Record**
* Use the Dashboard to see the camera feed.
* Run the teleop script in another terminal to control the robot with your keyboard:
```bash
python software/examples/alohamini/standalone_teleop.py --ip 127.0.0.1
```
* **To Record**: Click the **"Start Recording"** button on the Dashboard.
* Perform the task (e.g., pick up the object).
* Click **"Stop Recording"**.
* Repeat 50-100 times. The data is saved to `AlohaMini/data_sim/` on the **Cloud Server** (see the example episode layout just after this list).
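
Each episode lands in its own timestamped folder of per-frame JSON (state + action) plus JPEG images, written by the `DatasetRecorder` in `isaac_alohamini_env.py`. The episode timestamp and camera name below are illustrative:

```
data_sim/
└── episode_20240101_120000/
    ├── frame_000000.json        # state + commanded action
    ├── frame_000001.json
    └── images/
        ├── head_top_000000.jpg
        └── head_top_000001.jpg
```

To sanity-check a recording before training, here is a minimal loader sketch assuming that layout (`load_episode` is a hypothetical helper, not part of the repo):

```python
import glob
import json
import os

import cv2

def load_episode(episode_dir):
    """Read back one recorded episode: per-frame JSON plus matching JPEGs."""
    frames = []
    for path in sorted(glob.glob(os.path.join(episode_dir, "frame_*.json"))):
        with open(path) as f:
            frame = json.load(f)
        idx = os.path.basename(path)[len("frame_"):-len(".json")]
        # Images are saved as images/<camera>_<frame_idx>.jpg by DatasetRecorder
        frame["images"] = {
            os.path.basename(p)[:-len(f"_{idx}.jpg")]: cv2.imread(p)
            for p in glob.glob(os.path.join(episode_dir, "images", f"*_{idx}.jpg"))
        }
        frames.append(frame)
    return frames
```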

### Phase 2: Training

You train the model directly on the Cloud GPU where the data lives.

1. **Stop the Simulation** (to free up GPU VRAM).
2. **Run Training**:
Use the LeRobot training script (or your custom training script) pointing to the generated dataset.
```bash
# On Cloud
python software/src/lerobot/scripts/train.py \
--dataset data_sim \
--policy act \
--batch_size 8 \
--num_epochs 1000
```
*Note: Exact training command depends on the LeRobot configuration.*

3. **Output**: This produces a model file (e.g., `outputs/train/policy.safetensors`).

### Phase 3: Evaluation

Test the trained model in the simulator to see if it works.

1. **Restart Simulation (Cloud)**:
```bash
isaac_sim_python software/examples/alohamini/isaac_sim/isaac_alohamini_env.py
```
2. **Run Inference Node (Cloud or Local)**:
You need a script that loads the model and closes the loop (reads obs -> runs model -> sends action); a sketch follows this list.
* *Coming Soon: `eval_sim.py`, which loads the safetensors file and drives the ZMQ robot.*

3. **Watch (Local)**:
Use the Dashboard to watch the robot perform the task autonomously.
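
Until `eval_sim.py` exists, here is a minimal sketch of such a closed-loop node. It assumes observations arrive as single-part JSON on port 5556 and that the simulator accepts JSON joint-target commands on port 5555 (the same convention the dashboard uses); `policy` is a placeholder for whatever loads and runs your trained model:

```python
import json

import zmq

def run_eval(policy, ip="127.0.0.1", obs_port=5556, cmd_port=5555):
    """Closed-loop evaluation: read obs -> run model -> send action."""
    ctx = zmq.Context()

    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.CONFLATE, 1)   # keep only the newest observation
    sub.setsockopt(zmq.SUBSCRIBE, b"")
    sub.connect(f"tcp://{ip}:{obs_port}")

    push = ctx.socket(zmq.PUSH)
    push.setsockopt(zmq.CONFLATE, 1)
    push.connect(f"tcp://{ip}:{cmd_port}")

    while True:
        obs = json.loads(sub.recv_string())   # observation dict from the sim
        action = policy(obs)                  # e.g. {"left_joint1.pos": 0.12, ...}
        push.send_string(json.dumps(action))
```

Run it wherever the model weights live: on the cloud against `127.0.0.1`, or locally through the same SSH tunnel used for teleoperation.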

## Troubleshooting

* **Laggy Video**: ZMQ over SSH tunneling is usually fast enough for 640x480 streams; if it lags, check the bandwidth and latency of your connection to the cloud server.
* **"Address already in use"**: Ensure no other python scripts are using ports 5555/5556 on either machine.
42 changes: 39 additions & 3 deletions software/dashboard/app.py
@@ -5,23 +5,32 @@
import zmq
import cv2
import numpy as np
from flask import Flask, render_template, Response, jsonify
from flask import Flask, render_template, Response, jsonify, request

app = Flask(__name__)

# Global state
latest_observation = {}
lock = threading.Lock()
connected = False
recording = False
cmd_socket = None

def zmq_worker(ip='127.0.0.1', port=5556):
global latest_observation, connected
def zmq_worker(ip='127.0.0.1', port=5556, cmd_port=5555):
global latest_observation, connected, cmd_socket
context = zmq.Context()

# Sub Socket (CONFLATE must be set before connect for it to take effect)
socket = context.socket(zmq.SUB)
socket.setsockopt(zmq.CONFLATE, 1)
socket.setsockopt(zmq.SUBSCRIBE, b"")
socket.connect(f"tcp://{ip}:{port}")

# Cmd Socket (Push to Sim)
cmd_socket = context.socket(zmq.PUSH)
cmd_socket.setsockopt(zmq.CONFLATE, 1)
cmd_socket.connect(f"tcp://{ip}:{cmd_port}")

print(f"Connecting to ZMQ Stream at {ip}:{port}...")

while True:
@@ -93,6 +102,33 @@ def video_feed(camera_name):
return Response(generate_frames(camera_name),
mimetype='multipart/x-mixed-replace; boundary=frame')

@app.route('/api/command', methods=['POST'])
def send_command():
global cmd_socket
if not request.json or 'command' not in request.json:
return jsonify({'error': 'No command provided'}), 400

cmd = request.json['command']
print(f"Received command: {cmd}")

# Example handling
if cmd == 'reset_sim':
# Send reset command (Isaac Sim needs to handle this logic)
# For now, we can just zero out velocities or send a special flag
if cmd_socket:
cmd_socket.send_string(json.dumps({"reset": True}))

elif cmd == 'start_recording':
# Trigger recording logic
if cmd_socket:
cmd_socket.send_string(json.dumps({"start_recording": True}))

elif cmd == 'stop_recording':
if cmd_socket:
cmd_socket.send_string(json.dumps({"stop_recording": True}))

return jsonify({'status': 'ok'})

@app.route('/api/status')
def get_status():
with lock:
64 changes: 63 additions & 1 deletion software/dashboard/templates/index.html
@@ -11,14 +11,27 @@
.camera-box img { max-width: 100%; height: auto; display: block; }
.camera-title { text-align: center; font-size: 0.9em; margin-bottom: 5px; }
.status-panel { flex: 1; background: #333; padding: 15px; border-radius: 5px; min-width: 300px; }
.controls-panel { width: 100%; background: #333; padding: 15px; border-radius: 5px; margin-top: 20px; }
table { width: 100%; border-collapse: collapse; }
td, th { padding: 5px; border-bottom: 1px solid #444; font-size: 0.9em; }
th { text-align: left; color: #aaa; }
.value { font-family: monospace; color: #4f4; }
.btn { padding: 10px 20px; font-size: 16px; margin-right: 10px; cursor: pointer; background: #555; color: white; border: none; border-radius: 4px; }
.btn:hover { background: #777; }
.btn-record { background: #c00; }
.btn-record:hover { background: #e00; }
.status-indicator { display: inline-block; width: 10px; height: 10px; border-radius: 50%; margin-right: 5px; }
.status-active { background: #0f0; }
.status-inactive { background: #555; }
</style>
</head>
<body>
<h1>AlohaMini Dashboard</h1>
<div style="display:flex; justify-content:space-between; align-items:center;">
<h1>AlohaMini Dashboard</h1>
<div>
<span id="connectionStatus" class="status-indicator status-inactive"></span> <span id="connectionText">Disconnected</span>
</div>
</div>

<div class="container">
<div class="camera-grid" id="cameraGrid">
@@ -33,14 +46,63 @@ <h3>Robot State</h3>
</div>
</div>

<div class="controls-panel">
<h3>Simulation Controls</h3>
<p>Control the remote simulation directly from here.</p>
<button class="btn" onclick="sendCommand('reset_sim')">Reset Simulation</button>
<button class="btn btn-record" id="recordBtn" onclick="toggleRecording()">Start Recording</button>
<span id="recordingStatus" style="margin-left: 10px; color: #aaa;"></span>
</div>

<script>
const knownCameras = ['head_top', 'head_back', 'head_front', 'wrist_left', 'wrist_right', 'cam_high', 'cam_low', 'cam_left_wrist', 'cam_right_wrist'];
const activeCameras = new Set();

let isRecording = false;

function toggleRecording() {
isRecording = !isRecording;
const btn = document.getElementById('recordBtn');
const status = document.getElementById('recordingStatus');

if (isRecording) {
btn.textContent = "Stop Recording";
btn.style.background = "#555";
status.textContent = "Recording in progress...";
sendCommand('start_recording');
} else {
btn.textContent = "Start Recording";
btn.style.background = "#c00";
status.textContent = "Saved.";
sendCommand('stop_recording');
setTimeout(() => status.textContent = "", 3000);
}
}

function sendCommand(cmd) {
console.log("Sending command:", cmd);
fetch('/api/command', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ command: cmd })
});
}

function updateStatus() {
fetch('/api/status')
.then(response => response.json())
.then(data => {
// Update connection indicator
const connInd = document.getElementById('connectionStatus');
const connText = document.getElementById('connectionText');
if (data.connected) {
connInd.className = 'status-indicator status-active';
connText.textContent = "Connected";
} else {
connInd.className = 'status-indicator status-inactive';
connText.textContent = "Disconnected";
}

const table = document.getElementById('statusTable');
let rows = '';

83 changes: 83 additions & 0 deletions software/examples/alohamini/isaac_sim/isaac_alohamini_env.py
@@ -1,7 +1,10 @@
import base64
import json
import os
import random
import sys
import time
from datetime import datetime

import cv2
import numpy as np
@@ -40,6 +43,54 @@
# Locate URDF
URDF_PATH = os.path.join(repo_root, "software/src/lerobot/robots/alohamini/alohamini.urdf")

class DatasetRecorder:
def __init__(self, root_dir="data"):
self.root_dir = root_dir
self.is_recording = False
self.current_episode_dir = None
self.frame_idx = 0
self.episode_idx = 0

if not os.path.exists(root_dir):
os.makedirs(root_dir)

def start_recording(self):
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
self.current_episode_dir = os.path.join(self.root_dir, f"episode_{timestamp}")
os.makedirs(self.current_episode_dir)
os.makedirs(os.path.join(self.current_episode_dir, "images"))
self.frame_idx = 0
self.is_recording = True
print(f"Started recording to {self.current_episode_dir}")

def stop_recording(self):
if self.is_recording:
print(f"Stopped recording. Saved {self.frame_idx} frames.")
self.is_recording = False
self.current_episode_dir = None

def save_frame(self, obs, action):
if not self.is_recording:
return

# Save JSON data (state + action)
data = {
"timestamp": time.time(),
"observation": {k: v for k, v in obs.items() if not isinstance(v, np.ndarray)},
"action": action
}

with open(os.path.join(self.current_episode_dir, f"frame_{self.frame_idx:06d}.json"), "w") as f:
json.dump(data, f)

# Save Images
for k, v in obs.items():
if isinstance(v, np.ndarray): # Image
img_path = os.path.join(self.current_episode_dir, "images", f"{k}_{self.frame_idx:06d}.jpg")
cv2.imwrite(img_path, v)

self.frame_idx += 1

class IsaacAlohaMini:
def __init__(self, world, urdf_path):
self.world = world
@@ -188,6 +239,8 @@ def main():

print(f"Isaac Sim AlohaMini running. Ports: OBS={PORT_OBS}, CMD={PORT_CMD}")

recorder = DatasetRecorder(root_dir="data_sim")

while simulation_app.is_running():
world.step(render=True)

@@ -203,7 +256,23 @@ def main():
joint_cmds = {}
vx, vy, vth = 0, 0, 0

# Check for system commands
if "start_recording" in cmd:
recorder.start_recording()
continue
if "stop_recording" in cmd:
recorder.stop_recording()
continue

for k, v in cmd.items():
if k == "reset" and v is True:
# Reset
print("Resetting robot...")
world.reset()
joint_cmds = {name: 0.0 for name in aloha.dof_names}
aloha.set_joint_positions(joint_cmds)
continue

if k.endswith(".pos"):
joint_name = k.replace(".pos", "")
joint_cmds[joint_name] = v
@@ -226,6 +295,20 @@

# 2. Get Obs & Publish
obs = aloha.get_observations()

# Save frame if recording
# (Pass current command as action label for now, though it's imperfect as it's the *commanded* not *measured* action)
if recorder.is_recording:
# Reconstruct action dict from parsed values
# This is a simplification; ideally we log exactly what we sent
action_log = {
"x.vel": vx,
"y.vel": vy,
"theta.vel": vth,
# Add arm joint targets if we had them easily accessible here
}
recorder.save_frame(obs, action_log)

encoded_obs = {}

# Process images
Expand Down