feat: Add adb-coding-assistants-cluster module #227
Open: dgokeeffe wants to merge 3 commits into `databricks:main` from `dgokeeffe:pr/add-coding-assistants-cluster`
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| .PHONY: docs test_docs | ||
|
|
||
| docs: | ||
| terraform-docs -c ../../.terraform-docs.yml . | ||
|
|
||
| test_docs: | ||
| terraform-docs -c ../../.terraform-docs.yml --output-check . |
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,363 @@ | ||||||||
| # Provisioning Databricks Cluster with Claude Code CLI | ||||||||
|
|
||||||||
| This template provides a self-contained deployment of a Databricks cluster pre-configured with Claude Code CLI for AI-assisted development directly on the cluster. | ||||||||
|
|
||||||||
| ## What Gets Deployed | ||||||||
|
|
||||||||
| * Unity Catalog Volume for init script storage | ||||||||
| * Databricks cluster with Claude Code CLI auto-installed on startup | ||||||||
| * MLflow experiment for tracing Claude Code sessions | ||||||||
| * Bash helper functions for easy usage | ||||||||
|
|
||||||||
| ## How to use | ||||||||
|
|
||||||||
| 1. Copy `terraform.tfvars.example` to `terraform.tfvars` | ||||||||
| 2. Update `terraform.tfvars` with your values: | ||||||||
| - `databricks_resource_id`: Your Azure Databricks workspace resource ID | ||||||||
| - `cluster_name`: Name for your cluster | ||||||||
| - `catalog_name`: Unity Catalog name to use | ||||||||
| 3. (Optional) Customize cluster configuration in `terraform.tfvars` (node type, autoscaling, etc.) | ||||||||
| 4. (Optional) Configure your [remote backend](https://developer.hashicorp.com/terraform/language/settings/backends/azurerm) | ||||||||
| 5. Run `terraform init` to initialize Terraform and download the required providers | ||||||||
| 6. Run `terraform plan` to review the resources that will be created | ||||||||
| 7. Run `terraform apply` to create the resources | ||||||||
|
|
||||||||
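Before step 6, it can help to confirm that the three variables from step 2 are actually set. The following is a minimal sketch, not part of the module: `check_tfvars` is a hypothetical helper name, and it only looks for uncommented assignments in the file.

```bash
# Hypothetical pre-flight check (not shipped with this template):
# verify that the variables named in step 2 are assigned in a tfvars file.
check_tfvars() {
  local file="${1:-terraform.tfvars}"
  local missing=0
  local name
  for name in databricks_resource_id cluster_name catalog_name; do
    # Match an uncommented assignment such as: cluster_name = "demo"
    if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$file" 2>/dev/null; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is `check_tfvars terraform.tfvars && terraform plan`, so the plan only runs when all three values are present.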
| ## Prerequisites | ||||||||
|
|
||||||||
| - Databricks workspace with Unity Catalog enabled | ||||||||
| - Unity Catalog with an existing catalog and schema | ||||||||
| - **Unity Catalog metastore must have a root storage credential configured** (required for volumes) | ||||||||
| - Permission to create clusters | ||||||||
| - (For Azure) Authenticated via `az login` or environment variables | ||||||||
| - Databricks Runtime 14.3 LTS or higher recommended | ||||||||
|
|
||||||||
| > **Note**: If you encounter an error about missing root storage credential, you need to configure the metastore's root storage credential first. See [Databricks documentation](https://docs.databricks.com/api-explorer/workspace/metastores/update) for details. | ||||||||
|
|
||||||||
| ## Post-Deployment | ||||||||
|
|
||||||||
| After the cluster starts, you can connect via SSH to use Claude Code and other development tools. | ||||||||
|
|
||||||||
| ### 1. Configure SSH Tunnel | ||||||||
|
|
||||||||
| Use the Databricks CLI to set up SSH access to your new cluster: | ||||||||
|
|
||||||||
| ```bash | ||||||||
| # Authenticate if needed | ||||||||
| databricks auth login --host https://adb-<workspace-id>.<random-number>.azuredatabricks.net | ||||||||
|
|
||||||||
| # Set up SSH config (replace 'claude-dev' with your preferred alias) | ||||||||
| databricks ssh setup --name claude-dev | ||||||||
| # Select your cluster from the list when prompted | ||||||||
| ``` | ||||||||
|
|
||||||||
| This creates an entry in your `~/.ssh/config` file. | ||||||||
|
|
||||||||
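To confirm the alias was written, you can search the SSH config for it. This is a sketch under one assumption: that the entry created by `databricks ssh setup` uses a standard `Host <alias>` line (the exact contents of the entry are not shown here). `ssh_alias_exists` is a hypothetical helper, not part of the module.

```bash
# Hypothetical check: succeed if an SSH config file has a "Host" line
# that lists the given alias exactly (wildcard patterns are not expanded).
ssh_alias_exists() {
  local alias_name="$1"
  local config="${2:-$HOME/.ssh/config}"
  awk -v a="$alias_name" '
    tolower($1) == "host" { for (i = 2; i <= NF; i++) if ($i == a) found = 1 }
    END { exit !found }
  ' "$config"
}
```

For example, `ssh_alias_exists claude-dev && echo "alias found"` prints a message only when the entry is present.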
| ### 2. Connect via VSCode or Cursor | ||||||||
|
|
||||||||
| 1. Install the **Remote - SSH** extension in VSCode or Cursor. | ||||||||
| 2. Open the Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`). | ||||||||
| 3. Select **Remote-SSH: Connect to Host**. | ||||||||
| 4. Choose `claude-dev` (or the alias you created). | ||||||||
| 5. Select **Linux** as the platform. | ||||||||
| 6. Once connected, open your persistent workspace folder: `/Workspace/Users/<your-email>/`. | ||||||||
|
|
||||||||
| > **Important: Work Storage Location** | ||||||||
| > ⚠️ **DO NOT use Databricks Repos (`/Repos/...`) for active development work.** Repos folders can be unreliable for persistent storage and may lose uncommitted changes during cluster restarts or sync operations. | ||||||||
| > | ||||||||
| > ✅ **Use `/Workspace/Users/<your-email>/` instead.** This location provides reliable persistent storage. You can use regular git commands to manage version control (see "Using Git in /Workspace" section below). | ||||||||
|
|
||||||||
| ### 3. Launch Claude Code | ||||||||
|
|
||||||||
| Open the terminal in your remote VSCode/Cursor session and run: | ||||||||
|
|
||||||||
| ```bash | ||||||||
| # 1. Load environment variables and helpers | ||||||||
| source ~/.bashrc | ||||||||
|
|
||||||||
| # 2. Enable MLflow tracing (optional but recommended) | ||||||||
| claude-tracing-enable | ||||||||
|
|
||||||||
| # 3. Start Claude Code | ||||||||
| claude | ||||||||
| ``` | ||||||||
|
|
||||||||
| **First-time setup tips:** | ||||||||
| - Claude will ask for file permissions; use `Shift+Tab` to auto-allow edits in the current directory. | ||||||||
| - If you need to refresh credentials, run `claude-refresh-token`. | ||||||||
|
|
||||||||
| ### 4. Remote Web App Development (Port Forwarding) | ||||||||
|
|
||||||||
| VSCode and Cursor automatically forward ports. For example, to run a Streamlit app: | ||||||||
|
|
||||||||
| 1. Create `app.py`: | ||||||||
| ```python | ||||||||
| import streamlit as st | ||||||||
| st.title("Databricks Remote App") | ||||||||
| st.write("Running on cluster!") | ||||||||
| ``` | ||||||||
| 2. Run it: | ||||||||
| ```bash | ||||||||
| streamlit run app.py --server.port 8501 | ||||||||
| ``` | ||||||||
| 3. Click "Open in Browser" in the popup notification to view it at `localhost:8501`. | ||||||||
|
|
||||||||
| ### 5. Using the Databricks Python Interpreter | ||||||||
|
|
||||||||
| You don't need to configure a virtual environment. Databricks manages it for you. | ||||||||
|
|
||||||||
| 1. In the remote terminal, find the python path: | ||||||||
| ```bash | ||||||||
| echo $DATABRICKS_VIRTUAL_ENV | ||||||||
| # Output example: /local_disk0/.ephemeral_nfs/envs/pythonEnv-xxxx/bin/python | ||||||||
| ``` | ||||||||
| 2. In VSCode/Cursor, open the Command Palette and select **Python: Select Interpreter**. | ||||||||
| 3. Paste the path from above. | ||||||||
|
|
||||||||
| ### 6. Persistent Sessions with tmux | ||||||||
|
|
||||||||
| To keep your agent running even if you disconnect: | ||||||||
|
|
||||||||
| ```bash | ||||||||
| # Start a new session | ||||||||
| tmux new -s claude-session | ||||||||
|
|
||||||||
| # Detach (Ctrl+B, then D) | ||||||||
| # Reattach later | ||||||||
| tmux attach -t claude-session | ||||||||
| ``` | ||||||||
|
|
||||||||
| This allows you to leave long-running tasks (like "Build a data pipeline") executing on the cluster while you are offline. | ||||||||
|
|
||||||||
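The create and reattach steps can also be folded into a single command. A minimal sketch, assuming a tmux version that supports `new-session -A`; the `claude_tmux` wrapper name is hypothetical and not installed by the module.

```bash
# Hypothetical wrapper: attach to the named session if it exists,
# otherwise create it. Defaults to the session name used above.
claude_tmux() {
  local name="${1:-claude-session}"
  # -A makes new-session behave like attach-session when the session exists.
  tmux new-session -A -s "$name"
}
```

Running `claude_tmux` after every reconnect then lands you in the same session regardless of whether it was already started.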
| ### 7. Using Git in /Workspace | ||||||||
|
|
||||||||
| Since `/Workspace` doesn't have native Repos integration, use standard git commands: | ||||||||
|
|
||||||||
| ```bash | ||||||||
| # Navigate to your workspace directory | ||||||||
| cd /Workspace/Users/<your-email>/ | ||||||||
|
|
||||||||
| # Option 1: Clone an existing repository | ||||||||
| git clone https://github.com/your-org/your-repo.git | ||||||||
| cd your-repo | ||||||||
|
|
||||||||
| # Option 2: Initialize a new repository | ||||||||
| mkdir my-project && cd my-project | ||||||||
| git init | ||||||||
| git remote add origin https://github.com/your-org/your-repo.git | ||||||||
|
|
||||||||
| # Configure git (first time only) | ||||||||
| git config user.name "Your Name" | ||||||||
| git config user.email "your.email@company.com" | ||||||||
|
|
||||||||
| # Regular git workflow | ||||||||
| git add . | ||||||||
| git commit -m "Your commit message" | ||||||||
| git push origin main | ||||||||
| ``` | ||||||||
|
|
||||||||
| **Git Authentication Options:** | ||||||||
|
|
||||||||
| 1. **Personal Access Token (PAT)** - Recommended: | ||||||||
| ```bash | ||||||||
| # GitHub: Create at https://github.com/settings/tokens | ||||||||
| # Use token as password when prompted | ||||||||
| git clone https://github.com/your-org/repo.git | ||||||||
| ``` | ||||||||
|
|
||||||||
| 2. **SSH Keys**: | ||||||||
| ```bash | ||||||||
| # Generate SSH key on the cluster | ||||||||
| ssh-keygen -t ed25519 -C "your.email@company.com" | ||||||||
|
|
||||||||
| # Add to GitHub: Copy output and add at https://github.com/settings/keys | ||||||||
| cat ~/.ssh/id_ed25519.pub | ||||||||
|
|
||||||||
| # Clone using SSH | ||||||||
| git clone git@github.com:your-org/repo.git | ||||||||
| ``` | ||||||||
|
|
||||||||
| 3. **Git credential helper (`store`)**: | ||||||||
| ```bash | ||||||||
| # Store credentials to avoid repeated prompts | ||||||||
| # (the `store` helper saves them unencrypted in ~/.git-credentials) | ||||||||
| git config --global credential.helper store | ||||||||
| ``` | ||||||||
|
|
||||||||
| ## Helper Commands | ||||||||
|
|
||||||||
| ### Claude CLI Commands | ||||||||
|
|
||||||||
| | Command | Purpose | | ||||||||
| |---------|---------| | ||||||||
| | `check-claude` | Verify Claude CLI installation and configuration | | ||||||||
| | `claude-debug` | Show detailed Claude configuration | | ||||||||
| | `claude-refresh-token` | Regenerate Claude settings from environment | | ||||||||
| | `claude-token-status` | Check token freshness and auto-refresh status | | ||||||||
| | `claude-tracing-enable` | Enable MLflow tracing for Claude sessions | | ||||||||
| | `claude-tracing-status` | Check tracing status | | ||||||||
| | `claude-tracing-disable` | Disable tracing | | ||||||||
|
|
||||||||
| ### Git Workspace Commands | ||||||||
|
|
||||||||
| | Command | Purpose | | ||||||||
| |---------|---------| | ||||||||
| | `git-workspace-init` | Interactive setup for git in /Workspace (clone or init) | | ||||||||
| | `git-workspace-check` | Verify location and check for uncommitted/unpushed changes | | ||||||||
| | `git-workspace-setup-auth` | Configure git authentication (PAT, SSH, or credential helper) | | ||||||||
|
|
||||||||
| These helpers warn you if working in `/Repos` and ensure your work is backed up in git. | ||||||||
|
|
||||||||
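The `/Repos` warning described above could look roughly like the following. This is an illustrative sketch only, not the implementation of `git-workspace-check`; the function names `in_repos_path` and `warn_if_repos` are hypothetical.

```bash
# Hypothetical sketch of a location check (not the module's actual helper).
# Succeeds when the given directory (default: current directory) is under /Repos.
in_repos_path() {
  case "${1:-$PWD}" in
    /Repos|/Repos/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a warning and return non-zero when working under /Repos.
warn_if_repos() {
  if in_repos_path "${1:-}"; then
    echo "warning: ${1:-$PWD} is under /Repos; keep active work in /Workspace/Users/<your-email>/" >&2
    return 1
  fi
}
```

Calling `warn_if_repos` with no argument checks the current directory; passing a path checks that path instead.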
| ### VS Code/Cursor Remote Commands | ||||||||
|
|
||||||||
| | Command | Purpose | | ||||||||
| |---------|---------| | ||||||||
| | `claude-vscode-setup` | Show Remote SSH setup instructions | | ||||||||
| | `claude-vscode-env` | Get Python interpreter path for IDE | | ||||||||
| | `claude-vscode-check` | Verify Remote SSH configuration | | ||||||||
| | `claude-vscode-config` | Generate settings.json snippet | | ||||||||
|
|
||||||||
| ## Offline Installation | ||||||||
|
|
||||||||
| For air-gapped or restricted network environments, use the separate offline module: [`adb-coding-assistants-cluster-offline`](../../modules/adb-coding-assistants-cluster-offline/README.md). See the [Offline Installation Guide](../../modules/adb-coding-assistants-cluster-offline/scripts/OFFLINE-INSTALLATION.md) for detailed instructions. | ||||||||
|
|
||||||||
|
Comment on lines +218 to +221
| ## Offline Installation | |
| For air-gapped or restricted network environments, use the separate offline module: [`adb-coding-assistants-cluster-offline`](../../modules/adb-coding-assistants-cluster-offline/README.md). See the [Offline Installation Guide](../../modules/adb-coding-assistants-cluster-offline/scripts/OFFLINE-INSTALLATION.md) for detailed instructions. |
This example README documents `git-workspace-*` helper commands, but the provided init scripts don’t install them (they’re not present in `install-claude.sh`). Either add these helpers to the init script(s) or remove this section to avoid broken guidance.
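One way to confirm which documented helpers are actually available in a shell on the cluster is a small check like the sketch below. `missing_helpers` is a hypothetical name, not something the PR provides; it relies only on the shell's `type` builtin.

```bash
# Hypothetical check: print every given command name that the current shell
# cannot resolve as a function, alias, builtin, or executable on PATH.
missing_helpers() {
  local name
  for name in "$@"; do
    type "$name" >/dev/null 2>&1 || echo "$name"
  done
}
```

For instance, after `source ~/.bashrc`, running `missing_helpers git-workspace-init git-workspace-check git-workspace-setup-auth` lists any of the three commands that were not installed.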