Using the files in this repository it is possible to easily set up a multi-node Kubernetes cluster for development/testing/learning purposes on a single Linux machine.
They have been tested with Fedora 42 and Ubuntu 24.04, but should also work on other recent Linux distributions.
- Requirements
- Configuration
- Cluster creation
- Test deployment
- Add/remove worker nodes
- Nodes configuration updates
- Control the cluster from the host machine
- Deploy and access the Dashboard
- Destroy the cluster
- References
On the host Linux machine the following software must be installed. The required commands are split by distribution because the package names and preferred tooling can differ.
Fedora:
- libvirt library: install the mandatory, default and optional virtualization packages with the command below (see https://docs.fedoraproject.org/en-US/quick-docs/getting-started-with-virtualization/ for more details); a quick host check is sketched right after this list.
sudo dnf group install --with-optional virtualization
- Terraform: install it following HashiCorp's Fedora guidance (see https://developer.hashicorp.com/terraform/downloads?product_intent=terraform).
- Butane: install the native binary so Terraform can convert .bu files locally:
sudo dnf install -y butane
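Once the Fedora packages are installed, you can optionally sanity-check the virtualization host before moving on. A minimal sketch (virt-host-validate ships with the libvirt tooling; the second command simply confirms that the system libvirt connection answers):
virt-host-validate qemu
sudo virsh -c qemu:///system list --all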
Ubuntu:
- libvirt library: use Ubuntu's packages for QEMU/libvirt and helpers, then enable the daemon.
sudo apt update
sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager
sudo systemctl enable --now libvirtd
sudo apt install -y curl gnupg software-properties-common
- Terraform: install it following HashiCorp's Ubuntu guidance (see https://developer.hashicorp.com/terraform/downloads?product_intent=terraform).
- Butane: for Ubuntu, the preferred workflow is to run the official container image via Docker so you avoid host dependencies. Make sure Docker is installed as described at https://docs.docker.com/engine/install/.
The repository's k8s.tf still prefers a native butane binary but will fall back to the Docker image when needed; the containerized approach avoids dependency issues on Ubuntu.
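For reference, this is roughly what the containerized conversion looks like if you ever want to run it by hand, based on the official Butane container image (the file names are just examples):
docker run --rm --interactive quay.io/coreos/butane:release --pretty --strict < config/control-plane.bu > config/control-plane.ign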
With the above software installed, clone this repository:
git clone https://github.com/luigiqtt/dev-multinode-k8s.git
This will create a folder named dev-multinode-k8s containing the following files:
- k8s.tf: the Terraform file describing the K8s cluster;
- variables.tf: contains the declaration of the variables used;
- k8s.auto.tfvars: contains the values of the variables used to create the cluster (see the Configuration chapter);
- k8s.secret.auto.tfvars: contains the password that will be used to access the nodes of the cluster via SSH (see the Configuration chapter). Note that this file is included only as an example, but, since it contains potentially sensitive information, it should not be stored in version control systems;
- Butane (.bu) files in the config folder (see the Configuration chapter);
- create.sh: simple example script to execute all the steps required for the creation of the cluster (see the Cluster creation chapter).
Another prerequisite is an extracted qcow2 image of Fedora CoreOS, that can be downloaded from the Fedora CoreOS official website. Download the QEMU version and uncompress the .xz file. The uncompressed file must be put in the images directory that is present in the cloned repository.
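For example, assuming the compressed image has been downloaded into the cloned repository folder (the exact file name depends on the release you picked):
cd dev-multinode-k8s
xz -d fedora-coreos-*-qemu.x86_64.qcow2.xz
mv fedora-coreos-*-qemu.x86_64.qcow2 images/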
Tip: before creating a new cluster, download the latest version of Fedora CoreOS. In fact, when Fedora CoreOS starts up it searches for updates and, if any are available, tries to install them. This could cause problems in the cluster creation process described below.
The files in the repository can be used as they are: you only need to update the name of the downloaded Fedora CoreOS image in the k8s.auto.tfvars file. The created cluster will have one Control Plane node and two worker nodes. The Container Runtime Interface (CRI) implementation is cri-o, while the Container Network Interface (CNI) provider is Kube-Router.
If you want to modify the example configuration you can act on the Terraform variables and/or the Butane files.
Most of the configuration parameters are contained in the k8s.auto.tfvars and k8s.secret.auto.tfvars files. Modify them according to your requirements (see the variables.tf file for a description of the declared variables).
The k8s.secret.auto.tfvars file contains only the password of the admin user of all the nodes. It is used by Terraform to install the required software on the nodes and must match the password_hash configured in the Butane files of the nodes (see below).
If you want to change the number of worker nodes of the cluster see the Add/remove worker nodes chapter.
The configuration of the virtual machines (nodes of the cluster) that is applied at the first boot is defined using Butane YAML-formatted files. For each VM there must be a Butane file in the config directory. Each file will be automatically converted by Terraform into an Ignition JSON file (.ign).
Important: Ignition is applied by Fedora CoreOS at first boot. Updating a .bu file and regenerating the .ign file does not change the configuration of an already-running VM. To apply Butane/Ignition changes you typically need to recreate the affected node, so it's probably better to plan ahead and get the configuration right before creating the cluster.
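Although Terraform performs the conversion automatically, it can be handy to validate a Butane file by hand after editing it. A minimal sketch, assuming the native butane binary is installed (the output written to /tmp is just a throwaway):
butane --strict --pretty config/control-plane.bu > /tmp/control-plane.ign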
The main parameters in the Butane files that can be safely changed before applying the configuration with Terraform are the following:
- hostname of the node: if desired, change the content of the file /etc/hostname;
- password_hash: hash of the password that can be used to access the cluster nodes via SSH. The content of this field can be computed using the following command (the password must be the one configured in the k8s.secret.auto.tfvars file):
mkpasswd --method=yescrypt
- ssh_authorized_keys (optional): list of the SSH keys that can be used to access the cluster nodes via SSH. Although optional, it is advisable to set it in order to speed up access to the nodes of the cluster (see: https://www.ssh.com/academy/ssh/authorized-key); a key-generation sketch is included at the end of this chapter;
- podSubnet (only in the clusterconfig.yml definition present in the control-plane.bu file): subnet used by pods. Modify it if another set of IP addresses is desired for the pods.
Of course, you can also change the rest of the files in order, for example, to add other useful files/scripts to the nodes, configure services, modify the initialization scripts, etc.
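If you do not already have an SSH key to put in ssh_authorized_keys, the following is a minimal sketch for generating a dedicated key pair on the host (the file name k8s-lab is just an example):
ssh-keygen -t ed25519 -f ~/.ssh/k8s-lab -C "k8s-lab"
cat ~/.ssh/k8s-lab.pub   # paste this public key into ssh_authorized_keys in each .bu file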
To create a new multi-node K8s cluster on your Linux machine the required steps are the following:
- Initialize Terraform:
cd dev-multinode-k8s
terraform init
- Modify the configuration files if needed (see the previous chapter);
- Apply the configuration with Terraform:
terraform apply
If you want to preview the actions Terraform would take before applying the current configuration, you can do so with the following command:
terraform plan
- Wait a few seconds after the apply command completes in order to give the virtual machines time to restart;
- Initialize the Control Plane node: log in to the Control Plane node as the admin user (default IP: 192.168.40.162) and execute the script setup/init.sh:
ssh admin@192.168.40.162
cd setup
./init.sh
- Copy the join command from the script execution log on the Control Plane node. The following is an example of a join command:
kubeadm join 192.168.40.162:6443 --token dixcvq.c5l1ogpz2ttymfs2 \
  --discovery-token-ca-cert-hash sha256:ae1f5bdd5b8521f8ee842d3efc6796a83c755276d88dd340ab5f8f98a41fb968
- Initialize the worker nodes and add them to the cluster: log in to each worker node as the admin user and execute the script setup/init.sh:
ssh admin@192.168.40.xxx
cd setup
./init.sh
Then execute with sudo the join command copied at the previous step to add the worker to the cluster. For example:
sudo kubeadm join 192.168.40.162:6443 --token dixcvq.c5l1ogpz2ttymfs2 \
  --discovery-token-ca-cert-hash sha256:ae1f5bdd5b8521f8ee842d3efc6796a83c755276d88dd340ab5f8f98a41fb968
- That's it! Your multi-node K8s cluster is up and running! You can check the list of the nodes by logging in to the Control Plane node and executing the command:
kubectl get nodes
NAME                STATUS   ROLES           AGE     VERSION
k8s-control-plane   Ready    control-plane   2m23s   v1.34.3
k8s-worker0         Ready    <none>          112s    v1.34.3
k8s-worker1         Ready    <none>          81s     v1.34.3
(or kubectl get nodes -o wide to get more details)
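If you prefer not to poll manually, kubectl can also wait until all nodes report Ready (run it on the Control Plane node; the timeout value is arbitrary):
kubectl wait --for=condition=Ready nodes --all --timeout=300s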
If you have installed the Virtual Machine Manager on your host machine, you can use it to see the running nodes (virtual machines) of your new cluster.
Alternatively you can use the virsh command. For example, the command:
virsh list
prints the information about the existing virtual machines (referred to as domains in libvirt):
 Id   Name                State
------------------------------------
 1    k8s-control-plane   running
 2    k8s-worker0         running
 3    k8s-worker1         running
Note 1: if necessary, the join command required to add a node to the cluster can be obtained on the Control Plane node using the following command:
kubeadm token create --print-join-command
Note 2: the files in the repository include a simple example script named create.sh that executes all the steps described above automatically (except for the Terraform initialization command). Note that, to work correctly, this script requires that the latest available version of Fedora CoreOS is used for the deployment. Moreover, if you have not set up SSH keys for the admin user in the Butane files, the script will prompt you for the admin user's password several times.
Note 3: if the DNS doesn't work correctly, try to restart all the nodes of the cluster.
Note 4: if the created virtual machines do not have Internet access, here’s a checklist to troubleshoot common issues related to libvirt networking and firewall settings on the host machine:
- Enable IP Forwarding: check if IP forwarding is enabled:
sysctl net.ipv4.ip_forward
If it returns 0, enable it temporarily:
sudo sysctl -w net.ipv4.ip_forward=1
To make it permanent, edit /etc/sysctl.conf and add:
net.ipv4.ip_forward = 1
Then apply the changes:
sudo sysctl -p
- Add NAT rules:
Allow forwarding in the libvirt zone:
sudo firewall-cmd --zone=libvirt --add-forward --permanent
Allow traffic from the custom k8snet network (defined in the k8s.auto.tfvars file) to be forwarded through the zone:
sudo firewall-cmd --zone=libvirt --add-source=192.168.40.160/27 --permanent
Ensure that NAT (masquerading) is enabled on the outgoing traffic to the Internet:
sudo firewall-cmd --add-masquerade --permanent
Reload firewalld to apply the settings:
sudo firewall-cmd --reload
To ensure everything is correctly configured, you can check the status of your zones and rules:
sudo firewall-cmd --zone=libvirt --list-all sudo firewall-cmd --list-all
Ensure that forwarding is enabled for the libvirt zone and that masquerading is applied to the default zone.
- Docker Interference:
If Docker is installed, it sets the default iptables forwarding policy to DROP, which blocks libvirt traffic. Fix it by running:
sudo iptables -P FORWARD ACCEPT
- DNS Issues:
If ping 8.8.8.8 works but ping google.com fails, it's a DNS problem. Check if systemd-resolved is interfering or try forcing a DNS server in the Terraform configuration.
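After applying the fixes above, a quick way to re-check the host side in one pass (a sketch; it assumes firewalld is in use, as in the steps above, and uses the example worker0 address):
sysctl net.ipv4.ip_forward                         # expect: 1
sudo firewall-cmd --zone=libvirt --query-forward   # expect: yes
sudo firewall-cmd --query-masquerade               # expect: yes
sudo iptables -S FORWARD | head -n 1               # expect: -P FORWARD ACCEPT
ssh admin@192.168.40.163 'ping -c 1 8.8.8.8 && ping -c 1 google.com'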
Note 5: depending on how libvirt is configured on your host, Terraform may need elevated privileges (e.g., sudo) or your user must be allowed to talk to the system libvirt daemon (qemu:///system). If Terraform fails with libvirt permission/connection errors, check the items below.
- Ensure your user can access qemu:///system
Terraform uses the system libvirt connection (qemu:///system), not the per-user one (qemu:///session). To avoid running Terraform as root, add your account to the libvirt group and refresh your session:
sudo usermod -aG libvirt $USER
newgrp libvirt   # or log out and back in to refresh group membership
- Ensure QEMU can read the generated ./config/*.ign files
If a VM fails to start with errors like Failed to open file ... Permission denied while reading ./config/*.ign, remember that the QEMU process must be able to traverse the full directory path to those files (it needs execute permission (x) on each parent directory).
A common pitfall: if your repo lives under /home/<user>/... and /home/<user> is 700, QEMU cannot reach the .ign files even when the files themselves are 644.
To find which parent directory blocks access, use:
namei -l <path-to-ign-file>
A minimal fix is to grant traverse permission via ACL on your home directory (adjust if the blocking directory is deeper than /home/$USER):
sudo setfacl -m u:qemu:--x /home/$USER
- Ensure the default libvirt storage pool exists (Fedora)
On Fedora, also ensure the default libvirt storage pool exists. Since Terraform uses the system connection, check the pool on qemu:///system:
virsh -c qemu:///system pool-list --all
If the default pool is missing, create it with the following commands:
sudo virsh pool-define-as --name default --type dir --target /var/lib/libvirt/images
sudo virsh pool-build default
sudo virsh pool-start default
sudo virsh pool-autostart default
(Tip: you can set export LIBVIRT_DEFAULT_URI=qemu:///system in your shell to make virsh use the system connection by default.)
Running virsh pool-list --all afterwards should show the pool as active. This step ensures Fedora hosts without a pre-existing default pool still work with the example Terraform configuration.
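As a quick recap of the checks above, the following should all succeed before (re)running terraform apply (a sketch; the .ign path assumes the example configuration and that Terraform has already generated the Ignition files):
id -nG "$USER" | grep -qw libvirt && echo "user is in the libvirt group"
virsh -c qemu:///system pool-list --all
namei -l ./config/control-plane.ign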
After the cluster is created and initialized, you can deploy a simple test workload.
Run the following kubectl commands on the Control Plane node (where kubectl is installed and configured by the setup scripts). Alternatively, you can run them from the host machine if you have configured kubectl there (see Control the cluster from the host machine).
- Verify the nodes are Ready:
kubectl get nodes
- Create an NGINX deployment (2 replicas):
kubectl create deployment nginx-test --image=nginx:latest --replicas=2 --port=80
- Expose it via a NodePort Service:
kubectl expose deployment nginx-test --type=NodePort --port=80
kubectl get svc nginx-test
Take note of the allocated port in the PORT(S) column (e.g. 80:3xxxx/TCP).
If you prefer, you can extract the NodePort with:
kubectl get svc nginx-test -o jsonpath='{.spec.ports[0].nodePort}'
- Test from the host (or from any machine that can reach the node IPs):
- Pick any node IP (control-plane or worker). With the default example config, worker0 is 192.168.40.163.
- Then run:
NODE_IP=192.168.40.163
NODE_PORT=$(kubectl get svc nginx-test -o jsonpath='{.spec.ports[0].nodePort}')
curl -I "http://${NODE_IP}:${NODE_PORT}"
You should see an HTTP 200 OK response.
Alternatively, open your browser at http://192.168.40.163:<NODE_PORT> and you should get the default NGINX welcome page.
To clean up the test resources:
kubectl delete svc nginx-test
kubectl delete deployment nginx-test
Adding or removing worker nodes is straightforward and can be done by following the steps described below.
To remove worker nodes, modify the workers_count parameter in the k8s.auto.tfvars file to a smaller value, then run the Terraform apply command:
terraform apply
Note that in this way the virtual machines corresponding to the removed nodes will be destroyed, which could have unwanted impacts on the running pods. See Safely Drain a Node for recommendations on how to properly remove a node from the cluster.
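For example, assuming the node being removed is k8s-worker1, a minimal drain-then-scale-down sketch looks like this (the kubectl commands run on the Control Plane node, the Terraform command on the host):
kubectl drain k8s-worker1 --ignore-daemonsets --delete-emptydir-data
kubectl delete node k8s-worker1
terraform apply   # on the host, after lowering workers_count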
To add worker nodes, modify the workers_count parameter in the k8s.auto.tfvars file to a higher value. If the value is greater than 3, you also need to add the required node configurations in the same file (e.g.: worker3, worker4, etc.) and create the Butane files in the config folder for each worker with index greater than 2. In fact, the repository contains only 3 Butane files, for a maximum of 3 workers. If, for example, you want to create a cluster with 5 worker nodes, you will have to create the files worker3.bu and worker4.bu (note that the names of the files must have the format workerN.bu). Such files can have the same content as the others, with only the hostname changed.
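For example, a minimal sketch for generating worker3.bu and worker4.bu from worker2.bu, assuming the hostname appears in that file as k8s-worker2:
cd config
for i in 3 4; do
  sed "s/k8s-worker2/k8s-worker${i}/g" worker2.bu > "worker${i}.bu"
done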
Now you can run the Terraform apply command:
terraform apply
When the command execution terminates, follow the same steps described in the Cluster creation chapter to initialize and add the new worker nodes. To get the join command required to add the new nodes to the cluster, execute the following command on the Control Plane node:
kubeadm token create --print-join-command
There are two ways to modify the node configuration parameters (e.g. vCPU number, memory, etc.): the first is using Terraform and the second is using libvirt (via the Virtual Machine Manager or the virsh command).
The preferred method is using Terraform to keep the configuration files and the deployed infrastructure aligned. Simply modify the parameters in the configuration files, then execute the apply command (terraform apply).
However, for some modifications, applying the new configuration with Terraform may replace the existing virtual machines with new ones. This can disrupt your cluster, requiring you to reinitialize one or more nodes. This is, at least in some cases, probably due to a limitation of the current Terraform provider for libvirt that may change in future versions.
Therefore, it is recommended to first check the planned modifications with:
terraform plan
If the plan shows 0 resources to destroy, it is safe to proceed and apply the new configuration. If the number of resources to destroy is greater than 0, use libvirt instead.
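A quick way to look only at the plan summary line (a sketch; if there are no changes at all, Terraform prints "No changes." and grep returns nothing):
terraform plan -no-color | grep "Plan:"   # e.g. "Plan: 0 to add, 2 to change, 0 to destroy."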
If you want to control the cluster from the host machine, install the command-line tool kubectl and then copy the .kube/config file of the Control Plane node into the .kube directory (create it if not already present) in your home directory. The installation of the kubectl tool can be easily done using snap.
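A minimal sketch of the whole procedure, assuming the init script left the admin user's kubeconfig at ~/.kube/config on the Control Plane node and that the default example IP is used:
sudo snap install kubectl --classic
mkdir -p ~/.kube
scp admin@192.168.40.162:.kube/config ~/.kube/config
kubectl get nodes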
To deploy and access the Kubernetes Dashboard see: https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/.
To destroy the cluster removing all the created virtual machines, execute the following command:
terraform destroy
