cluster_stack is a security engineering platform designed to model, observe, and validate identity, configuration, and workload security controls across a hybrid environment.
This repository documents a deliberately staged build: a stable on‑prem foundation first, followed by controlled introduction of Kubernetes, cloud identity, misconfiguration scenarios, adversary activity, and detection logic.
The emphasis is on engineering correctness, telemetry integrity, and reproducible security failure modes—not dashboards, vendor demos, or SOC simulations.
This environment is intentionally not:
- A production cloud replica
- A SOC‑in‑a‑box
- A vendor feature showcase
It is:
A security engineering environment focused on identity, configuration, and workload security, using local infrastructure, infrastructure‑as‑code patterns, centralized logging, and adversarial testing as proof.
All components serve that goal. Anything that does not reinforce it is deferred or excluded.
- Virtualization layer: Proxmox (bare metal)
- Telemetry spine: Elastic Stack with Fleet‑managed agents
- Baseline workloads: Hardened Ubuntu hosts
- Future workload domain: Isolated local Kubernetes
- Hybrid intent: AWS integration for identity and IaC scenarios (Phase 2+)
Elastic is treated as the authoritative telemetry backbone. All workloads—VMs, containers, and future cloud integrations—are observable through it.
docs/
├── 00_architecture/          # Platform doctrine, threat models, design intent
│
├── phase-1/                  # Proven foundations and baselines
│   ├── virtualization-foundation.md
│   ├── elastic-stack-foundation.md
│   └── proxmox-backup-nfs.md
│
├── phase-2/                  # Planned expansion (controlled introduction)
│   ├── phase-2-build-plan.md
│   └── kubernetes-security-objectives.md
│
└── reference/                # Canonical operational references
    ├── kibana-configuration.md
    ├── elastic-agent-enrollment.md
    ├── elastic-fleet-control-invariants.md
    ├── disk-watermarks.md
    └── operational-command-reference.md
Each document has a single responsibility:
- Architecture explains why
- Phase docs explain what exists and is proven (Phase 1) or what is planned next (Phase 2)
- Reference docs explain how things work
By design, no content is duplicated across documents.
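The layout above can be checked mechanically. The following is a minimal sketch, not part of the repository: it assumes the script runs from the repository root, and the names `EXPECTED_DOCS` and `missing_docs` are hypothetical helpers introduced here for illustration. The file names are copied from the tree.

```python
from pathlib import Path

# Expected layout, copied from the documentation tree in this README.
# The dictionary itself is an illustrative helper, not something the repository ships.
EXPECTED_DOCS = {
    "00_architecture": [],
    "phase-1": [
        "virtualization-foundation.md",
        "elastic-stack-foundation.md",
        "proxmox-backup-nfs.md",
    ],
    "phase-2": [
        "phase-2-build-plan.md",
        "kubernetes-security-objectives.md",
    ],
    "reference": [
        "kibana-configuration.md",
        "elastic-agent-enrollment.md",
        "elastic-fleet-control-invariants.md",
        "disk-watermarks.md",
        "operational-command-reference.md",
    ],
}


def missing_docs(docs_root: Path) -> list[str]:
    """Return the expected relative paths that are absent under docs_root."""
    missing = []
    for directory, files in EXPECTED_DOCS.items():
        if not (docs_root / directory).is_dir():
            # A missing directory is reported once; its files are not listed separately.
            missing.append(f"{directory}/")
            continue
        for name in files:
            if not (docs_root / directory / name).is_file():
                missing.append(f"{directory}/{name}")
    return missing


if __name__ == "__main__":
    for path in missing_docs(Path("docs")):
        print(f"missing: {path}")
```

A check like this could run before a documentation change is merged, so that a renamed or deleted reference document is noticed early.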
Phase 1 establishes a stable, verifiable foundation:
- Proxmox installed on bare metal
- Ubuntu gold image created and sanitized
- Elastic Stack deployed with TLS and ILM
- Fleet Server operational
- Elastic Agents enrolled and reporting
- Osquery telemetry validated end‑to‑end
- Snapshot and backup strategy enforced
Phase 1 is considered complete and stable. No additional capabilities are layered onto the platform until the telemetry and control plane are proven reliable.
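One way to spot-check that an enrolled agent is still reporting is to count its recent events in Elasticsearch. The sketch below only builds the query body and makes no network call. It assumes events carry the ECS fields `host.name` and `@timestamp`; the function name `recent_events_query` and the host name in the example are hypothetical.

```python
import json


def recent_events_query(host_name: str, window: str = "15m") -> dict:
    """Build a query body matching events from one host within the last `window`.

    Assumes ECS field names (`host.name`, `@timestamp`); adjust if the
    deployment maps these fields differently.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"host.name": host_name}},
                    # Elasticsearch date math: "now-15m" means 15 minutes before now.
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # "ubuntu-base-01" is a made-up host name used only for illustration.
    print(json.dumps(recent_events_query("ubuntu-base-01"), indent=2))
```

The resulting body could be sent to the Elasticsearch `_count` API against whichever index pattern or data stream the agents write to. A count of zero for a host that should be active would suggest the telemetry path needs attention before anything else is built on it.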
Phase 2 introduces controlled complexity, not scale:
- Local Kubernetes as a workload domain
- Kubernetes RBAC and service account abuse scenarios
- Container hardening and runtime security
- Identity and IaC misconfiguration modeling
- Hybrid AWS integration for CIEM/CSPM scenarios
All Phase 2 work is gated on:
- Clean isolation from baseline hosts
- Confirmed telemetry ingestion
- Reproducible evidence for each scenario
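The three gates above can be expressed as an explicit check. This is a minimal sketch under stated assumptions: `GateStatus` and `unmet_gates` are hypothetical names, and how each flag gets set (manual review, a script, a query) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateStatus:
    """One flag per Phase 2 gate listed in this README."""
    isolated_from_baseline: bool
    telemetry_ingestion_confirmed: bool
    evidence_reproducible: bool


# Human-readable labels for each gate, in the order the README lists them.
GATE_LABELS = {
    "isolated_from_baseline": "Clean isolation from baseline hosts",
    "telemetry_ingestion_confirmed": "Confirmed telemetry ingestion",
    "evidence_reproducible": "Reproducible evidence for each scenario",
}


def unmet_gates(status: GateStatus) -> list[str]:
    """Return the labels of gates that are not yet satisfied."""
    return [
        label
        for field_name, label in GATE_LABELS.items()
        if not getattr(status, field_name)
    ]


if __name__ == "__main__":
    status = GateStatus(
        isolated_from_baseline=True,
        telemetry_ingestion_confirmed=False,
        evidence_reproducible=True,
    )
    for label in unmet_gates(status):
        print(f"blocked by: {label}")
```

Phase 2 work would proceed only when `unmet_gates` returns an empty list.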
For every capability added, the platform produces:
- Build steps — commands and configuration
- Intentional misconfiguration — what is broken
- Evidence — logs, queries, artifacts
- Remediation — what fixes it and why
If a change cannot be observed, measured, or reversed, it does not belong in the platform.
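The four artifacts can be modeled as a record that reports what is still missing. The sketch below is illustrative only: `ScenarioRecord`, its field names, and the example scenario name are assumptions, not a format the repository defines.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioRecord:
    """The four artifacts the platform produces for each capability."""
    name: str
    build_steps: list[str] = field(default_factory=list)   # commands and configuration
    misconfiguration: str = ""                              # what is broken
    evidence: list[str] = field(default_factory=list)       # logs, queries, artifacts
    remediation: str = ""                                   # what fixes it and why

    def missing_artifacts(self) -> list[str]:
        """Return the names of artifacts that are still empty."""
        artifacts = {
            "build_steps": self.build_steps,
            "misconfiguration": self.misconfiguration,
            "evidence": self.evidence,
            "remediation": self.remediation,
        }
        return [key for key, value in artifacts.items() if not value]

    def is_complete(self) -> bool:
        return not self.missing_artifacts()


if __name__ == "__main__":
    # Placeholder values; a real record would reference actual commands and logs.
    record = ScenarioRecord(
        name="example-scenario",
        build_steps=["<command or config reference>"],
        misconfiguration="<what is intentionally broken>",
    )
    print(record.missing_artifacts())
```

A scenario whose record is incomplete has not yet met the bar of being observed, measured, and reversible.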
This repository is designed for:
- Security engineers
- Platform and cloud engineers
- Detection and identity specialists
- Interviewers evaluating hands‑on security engineering depth
It assumes familiarity with Linux, virtualization, and modern security tooling.