5 changes: 5 additions & 0 deletions .gitignore
@@ -7,6 +7,9 @@
*.tfstate.lock.
*.terraform.lock.hcl

# Terraform plan files
*.plan

# logs
*.log

@@ -22,6 +25,8 @@

# Ignored Terraform files
*gitignore*.tf
terraform.tfvars
!terraform.tfvars.example

# Ignore Mac .DS_Store files
.DS_Store
1 change: 1 addition & 0 deletions README.md
@@ -51,6 +51,7 @@ The folder `examples` contains the following Terraform implementation examples :
| Azure | [adb-uc](examples/adb-uc/) | ADB Unity Catalog Process |
| Azure | [adb-unity-catalog-basic-demo](examples/adb-unity-catalog-basic-demo/) | ADB Unity Catalog end-to-end demo including UC metastore setup, Users/groups sync from AAD to databricks account, UC Catalog, External locations, Schemas, & Access Grants |
| Azure | [adb-overwatch](examples/adb-overwatch/) | Overwatch multi-workspace deployment on Azure |
| Azure | [adb-coding-assistants-cluster](examples/adb-coding-assistants-cluster/) | Databricks cluster with Claude Code CLI for AI-assisted development |
| AWS | [aws-workspace-basic](examples/aws-workspace-basic/) | Provisioning AWS Databricks E2 |
| AWS | [aws-workspace-with-firewall](examples/aws-workspace-with-firewall/) | Provisioning AWS Databricks E2 with an AWS Firewall |
| AWS | [aws-exfiltration-protection](examples/aws-exfiltration-protection/) | An implementation of [Data Exfiltration Protection on AWS](https://www.databricks.com/blog/2021/02/02/data-exfiltration-protection-with-databricks-on-aws.html) |
7 changes: 7 additions & 0 deletions examples/adb-coding-assistants-cluster/Makefile
@@ -0,0 +1,7 @@
.PHONY: docs test_docs

docs:
terraform-docs -c ../../.terraform-docs.yml .

test_docs:
terraform-docs -c ../../.terraform-docs.yml --output-check .
363 changes: 363 additions & 0 deletions examples/adb-coding-assistants-cluster/README.md
@@ -0,0 +1,363 @@
# Provisioning Databricks Cluster with Claude Code CLI

This template provides a self-contained deployment of a Databricks cluster pre-configured with Claude Code CLI for AI-assisted development directly on the cluster.

## What Gets Deployed

* Unity Catalog Volume for init script storage
* Databricks cluster with Claude Code CLI auto-installed on startup
* MLflow experiment for tracing Claude Code sessions
* Bash helper functions for easy usage

## How to use

1. Copy `terraform.tfvars.example` to `terraform.tfvars`
2. Update `terraform.tfvars` with your values:
   - `databricks_resource_id`: Your Azure Databricks workspace resource ID
   - `cluster_name`: Name for your cluster
   - `catalog_name`: Unity Catalog name to use
3. (Optional) Customize cluster configuration in `terraform.tfvars` (node type, autoscaling, etc.)
4. (Optional) Configure your [remote backend](https://developer.hashicorp.com/terraform/language/settings/backends/azurerm)
5. Run `terraform init` to initialize Terraform and download the required providers
6. Run `terraform plan` to review the resources that will be created
7. Run `terraform apply` to create the resources
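
Before running `terraform plan`, it can help to sanity-check the `databricks_resource_id` value. The sketch below assumes the format documented in the Inputs table of this README (`/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Databricks/workspaces/{workspace-name}`). The function name `parse_databricks_resource_id` is hypothetical and is not part of this example's Terraform code.

```python
import re

# Pattern derived from the format shown in this README's Inputs table.
# Matching is case-insensitive because Azure treats resource IDs that way.
RESOURCE_ID_PATTERN = re.compile(
    r"^/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Databricks"
    r"/workspaces/(?P<workspace_name>[^/]+)$",
    re.IGNORECASE,
)


def parse_databricks_resource_id(resource_id: str) -> dict:
    """Split a workspace resource ID into its parts, or raise ValueError."""
    match = RESOURCE_ID_PATTERN.match(resource_id.strip())
    if match is None:
        raise ValueError(f"Unexpected resource ID format: {resource_id!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/my-rg/providers/Microsoft.Databricks/workspaces/my-ws"
    )
    print(parse_databricks_resource_id(example))
```

A malformed value raises `ValueError` locally, which is usually quicker to diagnose than a provider error during `terraform plan`.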

## Prerequisites

- Databricks workspace with Unity Catalog enabled
- Unity Catalog with an existing catalog and schema
- **Unity Catalog metastore must have a root storage credential configured** (required for volumes)
- Permission to create clusters
- (For Azure) Authenticated via `az login` or environment variables
- Databricks Runtime 14.3 LTS or higher recommended

> **Note**: If you encounter an error about missing root storage credential, you need to configure the metastore's root storage credential first. See [Databricks documentation](https://docs.databricks.com/api-explorer/workspace/metastores/update) for details.

## Post-Deployment

After the cluster starts, you can connect via SSH to use Claude Code and other development tools.

### 1. Configure SSH Tunnel

Use the Databricks CLI to set up SSH access to your new cluster:

```bash
# Authenticate if needed
databricks auth login --host https://adb-xxx.azuredatabricks.net

# Set up SSH config (replace 'claude-dev' with your preferred alias)
databricks ssh setup --name claude-dev
# Select your cluster from the list when prompted
```

This creates an entry in your `~/.ssh/config` file.
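
To confirm that the alias was written, a small check such as the following may help. It is a minimal sketch that only looks for whitespace-separated `Host` lines and does not handle `Include` directives or the `Host=alias` form. The function name `ssh_config_has_host` is hypothetical.

```python
from pathlib import Path


def ssh_config_has_host(config_text: str, alias: str) -> bool:
    """Return True if any `Host` line in the config text lists the alias."""
    for raw_line in config_text.splitlines():
        parts = raw_line.strip().split()
        # A Host line may list several patterns separated by whitespace.
        if len(parts) >= 2 and parts[0].lower() == "host" and alias in parts[1:]:
            return True
    return False


if __name__ == "__main__":
    config_path = Path.home() / ".ssh" / "config"
    text = config_path.read_text() if config_path.exists() else ""
    print("claude-dev configured:", ssh_config_has_host(text, "claude-dev"))
```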

### 2. Connect via VSCode or Cursor

1. Install the **Remote - SSH** extension in VSCode or Cursor.
2. Open the Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`).
3. Select **Remote-SSH: Connect to Host**.
4. Choose `claude-dev` (or the alias you created).
5. Select **Linux** as the platform.
6. Once connected, open your persistent workspace folder: `/Workspace/Users/<your-email>/`.

> **Important: Work Storage Location**
> ⚠️ **DO NOT use Databricks Repos (`/Repos/...`) for active development work.** Repos folders can be unreliable for persistent storage and may lose uncommitted changes during cluster restarts or sync operations.
>
> ✅ **Use `/Workspace/Users/<your-email>/` instead.** This location provides reliable persistent storage. You can use regular git commands to manage version control (see "Using Git in /Workspace" section below).

### 3. Launch Claude Code

Open the terminal in your remote VSCode/Cursor session and run:

```bash
# 1. Load environment variables and helpers
source ~/.bashrc

# 2. Enable MLflow tracing (optional but recommended)
claude-tracing-enable

# 3. Start Claude Code
claude
```

**First-time setup tips:**
- Claude will ask for file permissions; use `Shift+Tab` to auto-allow edits in the current directory.
- If you need to refresh credentials, run `claude-refresh-token`.

### 4. Remote Web App Development (Port Forwarding)

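VSCode and Cursor automatically forward ports that applications open on the remote host, so a web app running on the cluster is reachable from your local browser. For example, to run a Streamlit app: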

1. Create `app.py`:
```python
import streamlit as st
st.title("Databricks Remote App")
st.write("Running on cluster!")
```
2. Run it:
```bash
streamlit run app.py --server.port 8501
```
3. Click "Open in Browser" in the popup notification to view it at `localhost:8501`.

### 5. Using the Databricks Python Interpreter

You don't need to configure a virtual environment. Databricks manages it for you.

1. In the remote terminal, find the Python interpreter path:
```bash
echo $DATABRICKS_VIRTUAL_ENV
# Output example: /local_disk0/.ephemeral_nfs/envs/pythonEnv-xxxx/bin/python
```
2. In VSCode/Cursor, open the Command Palette and select **Python: Select Interpreter**.
3. Paste the path from above.
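
To verify that the IDE picked up the cluster's interpreter, the following sketch can be run from the selected interpreter. It assumes `DATABRICKS_VIRTUAL_ENV` holds either the interpreter path itself or the environment directory that contains it, and accepts both forms. The helper name `interpreter_in_env` is hypothetical.

```python
import os
import sys
from pathlib import PurePosixPath
from typing import Optional


def interpreter_in_env(executable: str, env_value: Optional[str]) -> bool:
    """Return True if the interpreter matches, or lives under, the env path."""
    if not env_value:
        return False
    env_path = PurePosixPath(env_value)
    exe_path = PurePosixPath(executable)
    # Accept both an exact interpreter path and an enclosing directory.
    return exe_path == env_path or env_path in exe_path.parents


if __name__ == "__main__":
    env_value = os.environ.get("DATABRICKS_VIRTUAL_ENV")
    print("Interpreter:", sys.executable)
    print("Matches DATABRICKS_VIRTUAL_ENV:", interpreter_in_env(sys.executable, env_value))
```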

### 6. Persistent Sessions with tmux

To keep your agent running even if you disconnect:

```bash
# Start a new session
tmux new -s claude-session

# Detach (Ctrl+B, then D)
# Reattach later
tmux attach -t claude-session
```

This allows you to leave long-running tasks (like "Build a data pipeline") executing on the cluster while you are offline.

### 7. Using Git in /Workspace

Since `/Workspace` doesn't have native Repos integration, use standard git commands:

```bash
# Navigate to your workspace directory
cd /Workspace/Users/<your-email>/

# Option 1: Clone an existing repository
git clone https://github.com/your-org/your-repo.git
cd your-repo

# Option 2: Initialize a new repository
mkdir my-project && cd my-project
git init
git remote add origin https://github.com/your-org/your-repo.git

# Configure git (first time only)
git config user.name "Your Name"
git config user.email "your.email@company.com"

# Regular git workflow
git add .
git commit -m "Your commit message"
git push origin main
```
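
Because this README advises against doing active work under `/Repos`, a small guard can flag the current directory before you start. This is a sketch and not one of the helper commands listed in this README. The function name `classify_work_dir` is hypothetical, and treating `/Workspace/Repos` as a Repos location is an assumption.

```python
import os
from pathlib import PurePosixPath


def classify_work_dir(path: str) -> str:
    """Classify a directory as 'repos', 'workspace', or 'other'."""
    parts = PurePosixPath(path).parts
    # For absolute paths parts[0] is "/", so the top-level folder is parts[1].
    if len(parts) < 2 or parts[0] != "/":
        return "other"
    if parts[1] == "Repos":
        return "repos"
    if parts[1] == "Workspace":
        # Assumption: Repos folders may also appear under /Workspace/Repos.
        if len(parts) > 2 and parts[2] == "Repos":
            return "repos"
        return "workspace"
    return "other"


if __name__ == "__main__":
    location = classify_work_dir(os.getcwd())
    if location == "repos":
        print("Warning: working under Repos; consider /Workspace/Users/<your-email>/")
    else:
        print(f"Working directory type: {location}")
```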

**Git Authentication Options:**

1. **Personal Access Token (PAT)** - Recommended:
```bash
# GitHub: Create at https://github.com/settings/tokens
# Use token as password when prompted
git clone https://github.com/your-org/repo.git
```

2. **SSH Keys**:
```bash
# Generate SSH key on the cluster
ssh-keygen -t ed25519 -C "your.email@company.com"

# Add to GitHub: Copy output and add at https://github.com/settings/keys
cat ~/.ssh/id_ed25519.pub

# Clone using SSH
git clone git@github.com:your-org/repo.git
```

3. **Git credential helper (`store`)**:
```bash
# Store credentials to avoid repeated prompts.
# Note: the `store` helper saves them unencrypted in ~/.git-credentials.
git config --global credential.helper store
```

## Helper Commands

### Claude CLI Commands

| Command | Purpose |
|---------|---------|
| `check-claude` | Verify Claude CLI installation and configuration |
| `claude-debug` | Show detailed Claude configuration |
| `claude-refresh-token` | Regenerate Claude settings from environment |
| `claude-token-status` | Check token freshness and auto-refresh status |
| `claude-tracing-enable` | Enable MLflow tracing for Claude sessions |
| `claude-tracing-status` | Check tracing status |
| `claude-tracing-disable` | Disable tracing |

### Git Workspace Commands

| Command | Purpose |
|---------|---------|
| `git-workspace-init` | Interactive setup for git in /Workspace (clone or init) |
| `git-workspace-check` | Verify location and check for uncommitted/unpushed changes |
| `git-workspace-setup-auth` | Configure git authentication (PAT, SSH, or credential helper) |

These helpers warn you when you are working in `/Repos` and help you confirm that your work is committed and pushed.

Comment on lines +199 to +208
Copilot AI Feb 10, 2026

This example README documents git-workspace-* helper commands, but the provided init scripts don’t install them (they’re not present in install-claude.sh). Either add these helpers to the init script(s) or remove this section to avoid broken guidance.

Suggested change
### Git Workspace Commands
| Command | Purpose |
|---------|---------|
| `git-workspace-init` | Interactive setup for git in /Workspace (clone or init) |
| `git-workspace-check` | Verify location and check for uncommitted/unpushed changes |
| `git-workspace-setup-auth` | Configure git authentication (PAT, SSH, or credential helper) |
These helpers warn you if working in `/Repos` and ensure your work is backed up in git.

### VS Code/Cursor Remote Commands

| Command | Purpose |
|---------|---------|
| `claude-vscode-setup` | Show Remote SSH setup instructions |
| `claude-vscode-env` | Get Python interpreter path for IDE |
| `claude-vscode-check` | Verify Remote SSH configuration |
| `claude-vscode-config` | Generate settings.json snippet |

## Offline Installation

For air-gapped or restricted network environments, use the separate offline module: [`adb-coding-assistants-cluster-offline`](../../modules/adb-coding-assistants-cluster-offline/README.md). See the [Offline Installation Guide](../../modules/adb-coding-assistants-cluster-offline/scripts/OFFLINE-INSTALLATION.md) for detailed instructions.

Comment on lines +218 to +221
Copilot AI Feb 10, 2026

The example README references an adb-coding-assistants-cluster-offline module and an offline installation guide path that do not exist in this repo. Update or remove these links unless the offline module is added in the same PR.

Suggested change
## Offline Installation
For air-gapped or restricted network environments, use the separate offline module: [`adb-coding-assistants-cluster-offline`](../../modules/adb-coding-assistants-cluster-offline/README.md). See the [Offline Installation Guide](../../modules/adb-coding-assistants-cluster-offline/scripts/OFFLINE-INSTALLATION.md) for detailed instructions.

## Configuration Examples

### Single-Node Development Cluster

```hcl
cluster_mode = "SINGLE_NODE"
num_workers = 0
node_type_id = "Standard_D8pds_v6"
```

### Autoscaling Production Cluster

```hcl
cluster_mode = "STANDARD"
num_workers = null # Enable autoscaling
min_workers = 2
max_workers = 8
node_type_id = "Standard_D8pds_v6"
```
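
The two configurations above imply a few consistency rules between `cluster_mode`, `num_workers`, `min_workers`, and `max_workers`. The sketch below encodes them as read from these examples and from the defaults in the Inputs table. The Terraform code in this example may enforce different rules, and the function name `classify_cluster_sizing` is hypothetical.

```python
from typing import Optional


def classify_cluster_sizing(
    cluster_mode: str,
    num_workers: Optional[int],
    min_workers: int = 1,
    max_workers: int = 3,
) -> str:
    """Return 'single_node', 'fixed', or 'autoscaling'; raise ValueError otherwise."""
    if cluster_mode == "SINGLE_NODE":
        # The single-node example sets num_workers = 0.
        if num_workers != 0:
            raise ValueError("SINGLE_NODE expects num_workers = 0")
        return "single_node"
    if cluster_mode != "STANDARD":
        raise ValueError(f"Unknown cluster_mode: {cluster_mode!r}")
    if num_workers is None:
        # num_workers = null switches to autoscaling between min and max.
        if min_workers < 1 or min_workers > max_workers:
            raise ValueError("Autoscaling expects 1 <= min_workers <= max_workers")
        return "autoscaling"
    # Assumption of this sketch: a fixed-size STANDARD cluster has workers.
    if num_workers < 1:
        raise ValueError("STANDARD with a fixed size expects num_workers >= 1")
    return "fixed"


if __name__ == "__main__":
    print(classify_cluster_sizing("SINGLE_NODE", 0))
    print(classify_cluster_sizing("STANDARD", None, min_workers=2, max_workers=8))
```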

## Authentication

This example uses Databricks unified authentication. Authentication can be provided via:

1. **Azure CLI** (recommended for local development):
```bash
az login
terraform apply
```

2. **Environment Variables** (recommended for CI/CD):
```bash
export DATABRICKS_HOST="https://adb-xxx.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi..."
terraform apply
```

3. **Configuration Profile**:
```bash
export DATABRICKS_CONFIG_PROFILE="my-profile"
terraform apply
```

For more details on authentication, see the [Databricks unified authentication documentation](https://docs.databricks.com/dev-tools/auth/unified-auth.html).
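
When it is unclear which option is active in a shell or CI job, listing the relevant environment variables can narrow things down. The sketch below only inspects the variables named in this section. It does not reproduce the provider's actual credential resolution order, and the function name `configured_auth_hints` is hypothetical.

```python
import os
from typing import List, Mapping


def configured_auth_hints(env: Mapping[str, str]) -> List[str]:
    """List which of this README's authentication options appear configured."""
    hints = []
    if env.get("DATABRICKS_HOST") and env.get("DATABRICKS_TOKEN"):
        hints.append("environment variables (DATABRICKS_HOST + DATABRICKS_TOKEN)")
    if env.get("DATABRICKS_CONFIG_PROFILE"):
        hints.append("configuration profile (DATABRICKS_CONFIG_PROFILE)")
    if not hints:
        # Nothing set: the Azure CLI option relies on a prior `az login`.
        hints.append("none detected; Azure CLI login may be required")
    return hints


if __name__ == "__main__":
    for hint in configured_auth_hints(os.environ):
        print(hint)
```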

## Troubleshooting

### Init Script Fails

Check cluster event logs in the Databricks UI under **Compute** → **Your Cluster** → **Event Log**.

Common issues:
- Network connectivity to download packages
- Unity Catalog volume permissions
- Insufficient cluster permissions

### Claude Not Found After Login

```bash
# Reload bashrc
source ~/.bashrc

# Verify PATH
check-claude
```

### Authentication Issues

```bash
# Check environment variables
check-claude

# Regenerate configuration
claude-refresh-token
```

## Additional Resources

- [Scripts Documentation](scripts/README.md)
- [Databricks Init Scripts Documentation](https://docs.databricks.com/clusters/init-scripts.html)
- [Unity Catalog Volumes Documentation](https://docs.databricks.com/data-governance/unity-catalog/volumes.html)

<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0 |
| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >=4.31.0 |
| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >=1.81.1 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | 4.57.0 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [azurerm_client_config.current](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/client_config) | data source |
| [azurerm_databricks_workspace.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/databricks_workspace) | data source |
| [azurerm_resource_group.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/resource_group) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_catalog_name"></a> [catalog\_name](#input\_catalog\_name) | Unity Catalog name for the volume | `string` | n/a | yes |
| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | Name of the Databricks cluster | `string` | n/a | yes |
| <a name="input_databricks_resource_id"></a> [databricks\_resource\_id](#input\_databricks\_resource\_id) | The Azure resource ID for the Databricks workspace. Format: /subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.Databricks/workspaces/{workspace-name} | `string` | n/a | yes |
| <a name="input_autotermination_minutes"></a> [autotermination\_minutes](#input\_autotermination\_minutes) | Minutes of inactivity before cluster auto-terminates | `number` | `30` | no |
| <a name="input_cluster_mode"></a> [cluster\_mode](#input\_cluster\_mode) | Cluster mode: STANDARD or SINGLE\_NODE | `string` | `"STANDARD"` | no |
| <a name="input_init_script_source_path"></a> [init\_script\_source\_path](#input\_init\_script\_source\_path) | Local path to the init script | `string` | `null` | no |
| <a name="input_max_workers"></a> [max\_workers](#input\_max\_workers) | Maximum number of workers for autoscaling | `number` | `3` | no |
| <a name="input_min_workers"></a> [min\_workers](#input\_min\_workers) | Minimum number of workers for autoscaling | `number` | `1` | no |
| <a name="input_mlflow_experiment_name"></a> [mlflow\_experiment\_name](#input\_mlflow\_experiment\_name) | MLflow experiment name for Claude Code tracing | `string` | `"/Workspace/Shared/claude-code-tracing"` | no |
| <a name="input_node_type_id"></a> [node\_type\_id](#input\_node\_type\_id) | Node type for the cluster. Default is Standard_D8pds_v6 (modern, premium SSD + local NVMe). If unavailable in your region, consider Standard_DS13_v2 as fallback. | `string` | `"Standard_D8pds_v6"` | no |
| <a name="input_num_workers"></a> [num\_workers](#input\_num\_workers) | Number of worker nodes (null for autoscaling) | `number` | `null` | no |
| <a name="input_schema_name"></a> [schema\_name](#input\_schema\_name) | Schema name for the volume | `string` | `"default"` | no |
| <a name="input_spark_version"></a> [spark\_version](#input\_spark\_version) | Databricks Runtime version | `string` | `"17.3.x-cpu-ml-scala2.13"` | no |
| <a name="input_tags"></a> [tags](#input\_tags) | Custom tags for the cluster | `map(string)` | <pre>{<br/> "Environment": "dev",<br/> "Purpose": "coding-assistants"<br/>}</pre> | no |
| <a name="input_volume_name"></a> [volume\_name](#input\_volume\_name) | Volume name to store init scripts | `string` | `"coding_assistants"` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | The ID of the created cluster |
| <a name="output_cluster_name"></a> [cluster\_name](#output\_cluster\_name) | Name of the created cluster |
| <a name="output_cluster_url"></a> [cluster\_url](#output\_cluster\_url) | URL to access the cluster in Databricks UI |
| <a name="output_init_script_path"></a> [init\_script\_path](#output\_init\_script\_path) | Path to the init script in the volume |
| <a name="output_mlflow_experiment_name"></a> [mlflow\_experiment\_name](#output\_mlflow\_experiment\_name) | MLflow experiment name for tracing |
| <a name="output_setup_instructions"></a> [setup\_instructions](#output\_setup\_instructions) | Instructions for using the cluster |
| <a name="output_volume_full_name"></a> [volume\_full\_name](#output\_volume\_full\_name) | Full name of the volume |
| <a name="output_volume_path"></a> [volume\_path](#output\_volume\_path) | Path to the volume containing init scripts |
<!-- END_TF_DOCS -->