A Rust-based tool for deploying Talos Linux Kubernetes clusters with the Cilium CNI. Currently supports Hetzner Cloud, with more cloud providers coming soon. Similar to terraform-hcloud-talos but built entirely in Rust without Terraform dependencies.
Warning
This project is under active development and is considered experimental. Features may change, and not all functionality is production-ready yet. If you encounter bugs or have feature requests, please open an issue on GitHub.
- Automated Cluster Deployment: Create production-ready Kubernetes clusters on Hetzner Cloud
- Talos Linux: Immutable, minimal, and secure Kubernetes operating system
- Cilium CNI: High-performance networking with eBPF
- Web Dashboard: Modern web UI for cluster management, monitoring, and operations
- LoadBalancer Support: Cilium Node IPAM for LoadBalancer services using node IPs
- Prometheus Monitoring: Built-in support for Prometheus stack (Prometheus, Grafana, AlertManager)
- Metrics Server: Kubernetes resource metrics for HPA and kubectl top commands
- Cluster Autoscaler: Automatic worker node scaling based on pod resource demands (official Kubernetes autoscaler with Hetzner support)
- Private Networking: Automatic setup of Hetzner Cloud private networks
- Security First:
- Firewall with Talos/Kubernetes API ports pre-configured
- IP allowlisting (restricts access to your IP only)
- Flexible Configuration: YAML-based cluster configuration
- Multiple Node Types: Support for control plane and worker nodes with different specifications
- Health Checks: Built-in validation and cluster readiness checks
Before using this tool, you need to install the following CLI tools:
- talosctl - Talos Linux CLI tool (installation guide)
- kubectl - Kubernetes CLI tool (installation guide)
- helm - Kubernetes package manager (installation guide)
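If you do not have these yet, the following is a rough sketch of one way to install all three on a Linux x86_64 machine; the linked installation guides remain the authoritative reference for other platforms and pinned versions.

```bash
# Sketch only -- verify against each project's installation guide.

# talosctl (official install script)
curl -sL https://talos.dev/install | sh

# kubectl (latest stable release for linux/amd64)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# helm (official install script)
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```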
Download the latest release from the GitHub Releases page:
# Replace PLATFORM with: linux-x86_64, linux-aarch64, macos-x86_64, or macos-aarch64
curl -LO https://github.com/dihmeetree/oxide/releases/latest/download/oxide-PLATFORM.tar.gz
tar xzf oxide-PLATFORM.tar.gz
sudo mv oxide /usr/local/bin/

Or build from source:

git clone https://github.com/dihmeetree/oxide
cd oxide
cargo build --release
cargo install --path .

The binary will be available as oxide.
Before deploying clusters, you need to create a Hetzner Cloud snapshot containing the Talos image. Choose one of the following methods:
Note: Check the latest Talos version at https://github.com/siderolabs/talos/releases and update the version in the commands below accordingly.
# 1. Create a temporary server
hcloud server create --type cx11 --name talos-snapshot --image ubuntu-22.04 --location nbg1
# 2. Enable rescue mode and reboot
hcloud server enable-rescue talos-snapshot
hcloud server reboot talos-snapshot
# 3. SSH into the rescue system
ssh root@<server-ip>
# 4. Download and write the Talos image
cd /tmp
wget -O /tmp/talos.raw.xz https://factory.talos.dev/image/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba/v1.11.0/hcloud-amd64.raw.xz
xz -d -c /tmp/talos.raw.xz | dd of=/dev/sda && sync
# 5. Shutdown the server
shutdown -h now
# 6. Wait a moment, then create snapshot from Hetzner Console or CLI
hcloud server create-image --type snapshot --description "Talos v1.11.0" talos-snapshot
# 7. Note the snapshot ID (you'll need this for configuration)
hcloud image list
# 8. Delete the temporary server
hcloud server delete talos-snapshot

For automated image creation, use HashiCorp Packer:
# See the official Talos documentation for Packer configuration:
# https://www.talos.dev/v1.11/talos-guides/install/cloud-platforms/hetzner/
# Example: Use the terraform-hcloud-talos/_packer directory in this repo
cd terraform-hcloud-talos/_packer
export HCLOUD_TOKEN=your-token-here
packer init .
packer build .

Create an example configuration file:
oxide init

This creates a cluster.yaml file with default settings that you can customize.
Edit the cluster.yaml file to match your requirements:
cluster_name: my-talos-cluster

hcloud:
  # Get your token from https://console.hetzner.cloud/
  # Or set HCLOUD_TOKEN environment variable
  location: nbg1
  network:
    cidr: 10.0.0.0/16
    subnet_cidr: 10.0.1.0/24
    zone: eu-central

talos:
  version: v1.11.3
  kubernetes_version: 1.34.1
  hcloud_snapshot_id: "123456789" # Your snapshot ID from step 1

cilium:
  version: 1.17.8
  enable_hubble: true
  enable_ipv6: false

prometheus:
  version: 77.13.0
  enabled: true
  namespace: monitoring
  enable_grafana: true
  enable_alertmanager: true
  retention: 30d
  storage_size: 50Gi
  enable_persistent_storage: false

metrics_server:
  enabled: true

control_planes:
  - name: control-plane
    server_type: cpx21 # 3 vCPUs, 4GB RAM
    count: 3

workers:
  - name: worker
    server_type: cpx31 # 4 vCPUs, 8GB RAM
    count: 3

Set your Hetzner Cloud API token, then create the cluster:

export HCLOUD_TOKEN=your-hetzner-cloud-api-token
oxide create

This will:
- Detect your public IP and create firewall rules
- Create a private network
- Provision control plane and worker servers with firewall applied
- Generate and apply Talos configurations
- Bootstrap the Kubernetes cluster
- Install Cilium CNI
- Generate kubeconfig file
- Install optional components (Metrics Server, Prometheus, Autoscaler) based on configuration
Security Notes:
- Firewall restricts Talos and Kubernetes API access to your current IP address only
- All inter-cluster communication uses private network
- Talos provides secure API-only access (no SSH)
The process typically takes 5-10 minutes.
export KUBECONFIG=./output/kubeconfig
kubectl get nodes

Oxide includes a comprehensive web dashboard for managing and monitoring your Kubernetes clusters through a modern, responsive UI:
# Start the dashboard server (default port: 3000)
oxide dashboard
# Use a custom port
oxide dashboard --port 8080
# Use a custom configuration file
oxide --config my-cluster.yaml dashboard

Once started, open your browser to http://localhost:3000 to access the dashboard.
- Home Page: Overview with total clusters, nodes, system status, and alert counts
- Cluster List: View all clusters with their status, node count, and Talos version
- Cluster Details: Detailed view with nodes, control plane/worker counts, and metrics charts
- Create Cluster: Web form to deploy new clusters without CLI
- Cluster Operations:
- Scale worker or control plane nodes
- Upgrade Talos version
- Delete clusters with confirmation
- Nodes List: View all nodes across clusters with CPU/Memory usage and status
- Node Details:
- Resource metrics (CPU/Memory usage with historical charts)
- All pods running on the node
- Node specifications and role
- Real-time status monitoring
- Pods List: View all pods across the cluster with filtering and sorting
- Sort by CPU usage (highest to lowest)
- Status indicators (Running, Pending, Failed)
- Resource usage metrics
- Pod Details:
- CPU and Memory usage with percentage and limits
- Container information (image, resources, restart counts)
- Pod labels and configuration
- Status and restart history
- Pod Logs:
- Real-time log viewing with syntax highlighting
- Log level detection (Error, Warning, Info, Debug)
- Container selection for multi-container pods
- Configurable tail lines
- Services List: View all Kubernetes services
- Service types (ClusterIP, NodePort, LoadBalancer)
- Port configurations
- Endpoint counts
- Service Details:
- Cluster and External IPs
- Port mappings with protocols
- Selector labels and session affinity
- Active endpoints
- Deployments List: View all deployments with replica status
- Available, progressing, and unavailable counts
- Update strategy information
- Deployment Details:
- Replica status and scaling information
- Update strategy (RollingUpdate, Recreate)
- Deployment conditions
- Pod list with status
- Labels and selectors
- Events Page: Real-time Kubernetes events
- Warning and Normal event counts
- Event timeline with filtering
- Object details (Pod, Node, Service, etc.)
- Event messages and reasons
- Occurrence counts and timestamps
Production-ready cluster insights with actionable recommendations:
Resource Management:
- Pods without resource limits or requests
- Over-provisioned pods (using <20% of requests)
- Under-provisioned pods (using >90% of limits)
- Pods with high restart counts
Reliability:
- Deployments with single replicas (no HA)
- Pods missing liveness or readiness probes
- Frequent pod restarts indicating instability
Security:
- Pods running as root user
- Privileged containers
- Containers using hostPath volumes
- Pods using 'latest' image tags
Configuration:
- Services without endpoints (no backing pods)
- Namespaces without resource quotas
Each insight includes:
- Severity level (High, Medium, Low)
- Clear description of the issue
- Actionable recommendations
- List of affected resources
- Category classification
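For reference, a workload that would pass most of the checks above might look like the sketch below. It is not something Oxide generates for you; the name, image, and resource numbers are purely illustrative.

```bash
kubectl apply --kubeconfig=./output/kubeconfig -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app              # hypothetical workload
spec:
  replicas: 2                    # more than one replica for HA
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      securityContext:
        runAsNonRoot: true       # avoid running as root
      containers:
        - name: app
          image: nginxinc/nginx-unprivileged:1.27-alpine  # pinned tag, not 'latest'
          ports:
            - containerPort: 8080
          resources:             # explicit requests and limits
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /
              port: 8080
          livenessProbe:
            httpGet:
              path: /
              port: 8080
EOF
```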
Cilium Page:
- Cilium version and configuration status
- Hubble and IPv6 enablement status
- Per-pod CPU and Memory usage charts
- Cilium pod details with resource metrics
- Historical performance data
Envoy Page (if using Envoy with Cilium):
- Envoy version and pod status
- CPU and Memory usage per pod
- RPS (Requests Per Second) metrics
- HTTP status code distribution (2xx, 3xx, 4xx, 5xx)
- Historical metrics and trends
- Prometheus Integration: Real-time metrics collection and visualization
- Historical Charts: CPU and Memory usage over time
- Multi-Pod Views: Compare metrics across pods
- Interactive Legends: Click to show/hide specific pods
- Automatic Refresh: Data updates every 30 seconds
- Alert Integration: Shows firing alerts count in navigation
- Modern Design: Clean, responsive interface with dark mode
- Fast Navigation: Quick access to all resources
- Search & Filter: Find resources quickly
- Status Indicators: Color-coded status badges
- Real-time Updates: Automatic data refresh
- Responsive Layout: Works on desktop, tablet, and mobile
- Breadcrumb Navigation: Easy navigation through resource hierarchy
# Using default cluster.yaml
oxide create
# Using a custom configuration file
oxide --config my-cluster.yaml create

# Using default cluster.yaml
oxide status
# Using a custom configuration file
oxide --config my-cluster.yaml status

Shows information about all servers organized by node pools, including current node counts and server specifications.
Scale the number of nodes in your cluster up or down:
# Scale workers to 5 nodes (uses first worker pool by default)
oxide scale worker --count 5
# Scale control plane nodes to 3
oxide scale control-plane --count 3
# Scale a specific node pool
oxide scale worker --count 10 --pool worker-large

Scaling Behavior:
- Scale Up: Creates new nodes with the same configuration as the existing pool, automatically configures them with Talos, and applies firewall rules
- Scale Down: Removes the newest nodes first (highest index numbers)
- Pool-specific: Can target specific node pools if you have multiple worker or control plane pools configured
Example Use Cases:
# Increase workers for higher workload
oxide scale worker --count 10
# Scale down to save costs during low-usage periods
oxide scale worker --count 2
# Add more control plane nodes for HA
oxide scale control-plane --count 3

Important Notes:
- Scaling is idempotent - if already at target count, no changes are made
- New nodes are automatically joined to the cluster
- When scaling down, ensure your workloads can handle node removals
- Control plane scaling: maintaining odd numbers (1, 3, 5) is recommended for etcd quorum
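Because scale-down removes the newest nodes first, it can help to cordon and drain those nodes before reducing the count. A rough sketch (the node name is hypothetical; check `kubectl get nodes` for the real names):

```bash
export KUBECONFIG=./output/kubeconfig

# Hypothetical highest-index worker -- substitute a real node name
kubectl cordon my-talos-cluster-worker-5
kubectl drain my-talos-cluster-worker-5 --ignore-daemonsets --delete-emptydir-data

# Then let Oxide remove the node
oxide scale worker --count 4
```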
# Using default cluster.yaml
oxide destroy
# Using a custom configuration file
oxide --config my-cluster.yaml destroy

Warning: This permanently deletes all servers, networks, and SSH keys.
Upgrade your cluster nodes to a new Talos version:
# Upgrade control plane nodes only
oxide upgrade --version v1.11.3 --control-plane
# Upgrade worker nodes only
oxide upgrade --version v1.11.3 --workers
# Upgrade both control plane and worker nodes
oxide upgrade --version v1.11.3 --control-plane --workers
# Upgrade without preserving node data (default is to preserve)
oxide upgrade --version v1.11.3 --control-plane --workers --preserve false
# Wait and observe each node upgrade (shows live progress)
oxide upgrade --version v1.11.3 --control-plane --wait
# Stage the upgrade (applies on next reboot, useful if upgrade fails due to open files)
oxide upgrade --version v1.11.3 --workers --stage

Upgrade Behavior:
- Sequential Upgrade: Nodes are upgraded one at a time to maintain cluster availability
- Automatic Image Selection: Installer image is automatically constructed from version (e.g., ghcr.io/siderolabs/installer:v1.11.3)
- Data Preservation: By default, node data is preserved during upgrade (`--preserve true`)
- Granular Control: Can upgrade control plane and workers independently
- Progress Logging: Shows detailed progress for each node upgrade
- etcd Quorum Protection: Talos automatically refuses control plane upgrades that would break etcd quorum
Upgrade Options:
- `--version`: Talos version to upgrade to (required)
- `--control-plane`: Upgrade control plane nodes
- `--workers`: Upgrade worker nodes
- `--preserve`: Preserve node data (default: true)
- `--wait`: Wait and observe the upgrade process for each node (shows live output)
- `--stage`: Stage the upgrade to apply on next reboot (useful if upgrade fails due to open files)
Example Upgrade Workflow:
# 1. Upgrade control plane nodes first
oxide upgrade --version v1.11.3 --control-plane
# 2. Wait for control plane to stabilize
kubectl get nodes
# 3. Upgrade worker nodes
oxide upgrade --version v1.11.3 --workers

Important Notes:
- At least one of `--control-plane` or `--workers` must be specified
- Upgrade Path: Always upgrade through adjacent minor releases sequentially (e.g., 1.10 → 1.11 → 1.12)
- Control Plane First: Recommended to upgrade control plane nodes before worker nodes
- One at a Time: Nodes are upgraded sequentially to maintain cluster availability - avoid upgrading all nodes simultaneously
- Kubernetes Version: Talos upgrade does NOT automatically upgrade Kubernetes version
- Automatic Rollback: If new version fails to boot, Talos will automatically rollback
- Version Compatibility: Check Talos upgrade documentation for version compatibility
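Since a Talos upgrade leaves the Kubernetes version untouched, upgrading Kubernetes is a separate step performed with talosctl. A hedged sketch, with the node IP and target version as placeholders:

```bash
# Run against one control plane node; Talos rolls the new Kubernetes version
# out across the whole cluster.
talosctl --talosconfig ./output/talosconfig \
  --nodes <control-plane-ip> \
  upgrade-k8s --to 1.34.1
```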
Install the Prometheus monitoring stack (Prometheus, Grafana, AlertManager):
oxide install-prometheus

This installs the kube-prometheus-stack Helm chart with:
- Prometheus server with persistent storage
- Grafana dashboards (default login: admin/admin)
- AlertManager for notifications
- Service monitors for Cilium and Kubernetes components
oxide prometheus-status

Shows the status of all Prometheus components and provides Grafana access instructions.
To access Grafana locally, use port-forwarding:
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80 --kubeconfig=./output/kubeconfig

Then open http://localhost:3000 in your browser:
- Username: admin
- Password: admin (change after first login)
To access Prometheus UI locally:
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 --kubeconfig=./output/kubeconfig

Then open http://localhost:9090 in your browser.
To access AlertManager UI locally:
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-alertmanager 9093:9093 --kubeconfig=./output/kubeconfig

Then open http://localhost:9093 in your browser.
oxide uninstall-prometheus

Install the Kubernetes Cluster Autoscaler with Hetzner support to automatically scale worker nodes based on pod resource requests:
oxide install-autoscaler

This installs the official Kubernetes Cluster Autoscaler configured for the Hetzner Cloud provider. The autoscaler will:
- Automatically add worker nodes when pods cannot be scheduled due to insufficient resources
- Remove underutilized worker nodes to save costs
- Respect min/max node limits configured per worker pool
Configuration Example:
autoscaler:
  enabled: true
  worker_pools:
    - name: worker-pool
      server_type: cpx11 # Hetzner server type
      location: fsn1     # Hetzner location
      min_nodes: 1
      max_nodes: 10

Monitor Autoscaler Logs:
kubectl logs -n oxide-system -l app=cluster-autoscaler -f --kubeconfig=./output/kubeconfig

Important Notes:
- The autoscaler only scales worker nodes, not control plane nodes
- Scaling decisions are based on pod resource requests (CPU/memory), not actual usage
- Nodes are created with the same Talos configuration as your initial worker nodes
- The autoscaler respects PodDisruptionBudgets when scaling down
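One way to watch the autoscaler react (a sketch only; the deployment name and resource numbers are illustrative and should exceed the free capacity of your current workers):

```bash
export KUBECONFIG=./output/kubeconfig

# Request more capacity than the existing workers can hold
kubectl create deployment scale-test --image=registry.k8s.io/pause:3.9
kubectl set resources deployment scale-test --requests=cpu=1,memory=1Gi
kubectl scale deployment scale-test --replicas=20

# Pending pods should trigger a scale-up; watch new nodes join
kubectl get pods -l app=scale-test
kubectl get nodes -w

# Clean up; underutilized autoscaled nodes are removed again later
kubectl delete deployment scale-test
```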
oxide uninstall-autoscaler

Install the Kubernetes Metrics Server for resource metrics and HPA support:
oxide install-metrics-server

The Metrics Server enables:
- `kubectl top nodes` and `kubectl top pods` commands
- HorizontalPodAutoscaler (HPA) to scale pods based on CPU/memory usage
- Resource-based autoscaling decisions
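As an illustration, once the Metrics Server is running you can attach an HPA to one of your own deployments (`my-app` below is a placeholder):

```bash
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10 \
  --kubeconfig=./output/kubeconfig

# Inspect current utilization and scaling targets
kubectl get hpa my-app --kubeconfig=./output/kubeconfig
```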
Verify Installation:
kubectl top nodes --kubeconfig=./output/kubeconfig

Note: Metrics Server is automatically installed during cluster creation if enabled in the configuration.
oxide uninstall-metrics-server

Cluster Configuration Fields:

| Field | Description | Required |
|---|---|---|
| `cluster_name` | Unique name for your cluster | Yes |
| `hcloud` | Hetzner Cloud settings | Yes |
| `talos` | Talos Linux configuration | Yes |
| `cilium` | Cilium CNI settings | Yes |
| `prometheus` | Prometheus monitoring settings | No |
| `metrics_server` | Metrics Server settings | No |
| `autoscaler` | Cluster autoscaler settings | No |
| `control_planes` | Control plane node specs | Yes |
| `workers` | Worker node specs | No |
Hetzner Cloud Configuration (`hcloud`):

| Field | Description | Default |
|---|---|---|
| `token` | API token (or use `HCLOUD_TOKEN` env var) | - |
| `location` | Data center location (nbg1, fsn1, hel1, etc.) | nbg1 |
| `network.cidr` | Private network CIDR | 10.0.0.0/16 |
| `network.subnet_cidr` | Subnet CIDR | 10.0.1.0/24 |
| `network.zone` | Network zone | eu-central |
Prometheus Configuration (`prometheus`):

| Field | Description | Default |
|---|---|---|
| `version` | kube-prometheus-stack chart version | 77.13.0 |
| `enabled` | Enable Prometheus installation | true |
| `namespace` | Kubernetes namespace for Prometheus | monitoring |
| `enable_grafana` | Enable Grafana dashboards | true |
| `enable_alertmanager` | Enable AlertManager | true |
| `retention` | Prometheus data retention period | 30d |
| `storage_size` | Prometheus persistent storage size | 50Gi |
| `enable_persistent_storage` | Enable persistent storage for Prometheus | false |
Metrics Server Configuration (`metrics_server`):

| Field | Description | Default |
|---|---|---|
| `enabled` | Enable metrics server | true |
Note: Metrics Server is automatically installed during cluster creation when enabled.
Autoscaler Configuration (`autoscaler`):

| Field | Description | Required |
|---|---|---|
| `enabled` | Enable cluster autoscaler | Yes |
| `worker_pools` | List of worker pools to autoscale | Yes |
Worker Pool Configuration:
| Field | Description | Required | Default |
|---|---|---|---|
| `name` | Worker pool name | Yes | - |
| `server_type` | Hetzner server type (cpx11, cpx21...) | Yes | - |
| `location` | Hetzner location (fsn1, nbg1...) | Yes | - |
| `min_nodes` | Minimum autoscaled nodes (set to 0 to preserve initial worker nodes) | No | 0 |
| `max_nodes` | Maximum autoscaled nodes | Yes | - |
Important: Set `min_nodes: 0` to ensure the autoscaler only manages nodes it creates dynamically, leaving your initial worker nodes (defined in `workers.count`) untouched. This way:
- Your base worker nodes always remain in the cluster
- The autoscaler only creates/deletes additional nodes above this baseline
- Pods will be consolidated back to original nodes when autoscaled nodes are no longer needed
Node Pool Configuration (`control_planes`, `workers`):

| Field | Description | Default |
|---|---|---|
| `name` | Node name prefix | - |
| `server_type` | Hetzner server type (cx21, cpx31, etc.) | - |
| `count` | Number of nodes to create | 1 |
| `labels` | Additional Kubernetes labels | {} |
Shared vCPU (AMD EPYC):
| Type | vCPUs | RAM | Storage | Price/Month |
|---|---|---|---|---|
| cpx11 | 2 | 2GB | 40GB | ~€4.49 |
| cpx21 | 3 | 4GB | 80GB | ~€8.99 |
| cpx31 | 4 | 8GB | 160GB | ~€15.99 |
| cpx41 | 8 | 16GB | 240GB | ~€29.99 |
| cpx51 | 16 | 32GB | 360GB | ~€59.99 |
Dedicated vCPU (AMD EPYC):
| Type | vCPUs | RAM | Storage | Price/Month |
|---|---|---|---|---|
| ccx13 | 2 | 8GB | 80GB | ~€12.99 |
| ccx23 | 4 | 16GB | 160GB | ~€25.99 |
| ccx33 | 8 | 32GB | 240GB | ~€49.99 |
| ccx43 | 16 | 64GB | 360GB | ~€99.99 |
| ccx53 | 32 | 128GB | 600GB | ~€199.99 |
| ccx63 | 48 | 192GB | 960GB | ~€299.99 |
See Hetzner Cloud pricing for all available types.
The tool creates:
- Firewall: Hetzner Cloud firewall with restricted access to Talos and Kubernetes APIs
- Private Network: A Hetzner Cloud private network for inter-node communication
- Control Plane Nodes: Run the Kubernetes control plane (etcd, API server, scheduler, controller manager)
- Worker Nodes: Run your application workloads
- Cilium: Provides networking, load balancing, and network policies
Your IP (Firewall Allowed)
↓
┌──────────────────────────────────────────────┐
│ Hetzner Cloud Firewall │
│ - Talos API (50000): Your IP only │
│ - Kubernetes API (6443): Your IP only │
│ - HTTP (80): Public access │
│ - HTTPS (443): Public access │
└──────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────┐
│ Hetzner Cloud Private Network │
│ 10.0.0.0/16 │
│ Node Subnet: 10.0.1.0/24 │
│ Pod CIDR: 10.0.16.0/20 │
│ Service CIDR: 10.0.8.0/21 │
│ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Control │ │ Control │ │
│ │ Plane 1 │ │ Plane 2 │ ... │
│ └────────────┘ └────────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Worker 1 │ │ Worker 2 │ ... │
│ └────────────┘ └────────────┘ │
└──────────────────────────────────────────────┘
The automatically configured firewall includes:
| Port | Protocol | Source | Purpose |
|---|---|---|---|
| 6443 | TCP | Your IP | Kubernetes API |
| 50000 | TCP | Your IP | Talos API |
| 80 | TCP | 0.0.0.0/0 | HTTP Traffic |
| 443 | TCP | 0.0.0.0/0 | HTTPS Traffic |
Note: Internal cluster communication on the private network (10.0.0.0/16) is not restricted by Hetzner Cloud firewalls.
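If your public IP changes later, the allowlisted rules may no longer match your connection; the generated rules can be reviewed with the hcloud CLI (the firewall name below is a placeholder):

```bash
hcloud firewall list
hcloud firewall describe <firewall-name>
```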
After cluster creation, the following files are generated in the output/ directory:
- `controlplane.yaml` - Talos configuration for control plane nodes
- `worker.yaml` - Talos configuration for worker nodes
- `talosconfig` - Talos client configuration
- `kubeconfig` - Kubernetes client configuration
- `secrets.yaml` - Talos secrets (keep secure!)
Important: The secrets.yaml file contains sensitive information. Keep it secure and never commit to version control.
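If the output/ directory lives inside a git repository, a simple guard (a sketch, assuming a standard .gitignore setup) keeps these files out of version control:

```bash
echo "output/" >> .gitignore
git check-ignore -v output/secrets.yaml   # should print the matching rule
```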
Troubleshooting:

- Check API token: Ensure `HCLOUD_TOKEN` is set correctly
- Verify prerequisites: Make sure talosctl, kubectl, and helm are installed
- Check logs: Run with the `--verbose` flag for detailed output
- Resource limits: Verify your Hetzner account has sufficient resources
# Check Talos node status
talosctl --talosconfig ./output/talosconfig --nodes <node-ip> health
# Check Kubernetes pods
kubectl get pods -A

# Check Cilium status
kubectl get pods -n kube-system -l k8s-app=cilium
# View Cilium logs
kubectl logs -n kube-system -l k8s-app=cilium

Example monthly costs for a 3 control plane + 3 worker cluster:
- Control Planes (3x cpx21): ~€27/month (3 × €8.99)
- Workers (3x cpx31): ~€48/month (3 × €15.99)
- IPv4 Addresses (6 servers): ~€3/month (6 × €0.50)
- Network: Free
- Snapshot: ~€0.50/month
- Traffic: 1-5TB free per server (depending on type)
Total: ~€78.50/month
Costs are approximate. See Hetzner pricing for exact rates.
Compared to terraform-hcloud-talos, Oxide offers:

- Single Binary: No Terraform or provider management
- Type Safety: Rust's type system catches errors at compile time
- Performance: Fast Rust implementation
- Native Integration: Direct API calls, no intermediate layers
Consider terraform-hcloud-talos instead if:

- You need to manage other infrastructure beyond Hetzner
- Your team has existing Terraform expertise
- You require Terraform's extensive module ecosystem
Build and test locally:

cargo build
cargo test --release
cargo clippy -- -D warnings
cargo fmt

Contributions are welcome! Please ensure your code:
- Compiles without warnings
- Passes all tests
- Follows Rust formatting conventions
- Includes documentation for public APIs
[Add your license here]
- Talos Linux - Secure Kubernetes OS
- Cilium - eBPF-based networking
- Hetzner Cloud - Affordable cloud hosting
- terraform-hcloud-talos - Inspiration for this project
- Never commit your `HCLOUD_TOKEN` or API credentials
- Store kubeconfig files securely
- Use private networks for inter-node communication
- Enable Cilium network policies for pod-to-pod security
- Regularly update Talos and Kubernetes versions
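As a starting point for the network-policy recommendation, a minimal CiliumNetworkPolicy might look like the sketch below; the namespace, labels, and port are illustrative.

```bash
kubectl apply --kubeconfig=./output/kubeconfig -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: backend               # policy applies to backend pods
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend        # only frontend pods may connect
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
EOF
```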
For issues and questions:
- Check the Troubleshooting section
- Review Talos documentation
- Check Cilium documentation
- Open an issue on GitHub


