Multi-node K8s cluster on a single Linux machine with Terraform, libvirt and Fedora CoreOS

Using the files in this repository it is possible to easily set up a multi-node Kubernetes cluster for development, testing, or learning purposes on a single Linux machine.

They have been tested with Fedora 42 and Ubuntu 24.04, but should also work on other recent Linux distributions.

Requirements

On the host Linux machine the following software must be installed. The required commands are split by distribution because the package names and preferred tooling can differ.

Fedora
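
On Fedora, the same tools are available from the stock repositories. The commands below are a minimal sketch; check the Fedora documentation if package or service names differ on your release.

  • libvirt library: install QEMU/libvirt and the related tools, then enable the daemon.

    sudo dnf install -y qemu-kvm libvirt virt-install virt-manager
    sudo systemctl enable --now libvirtd
  • Terraform: install it following HashiCorp's Linux guidance (see https://developer.hashicorp.com/terraform/downloads?product_intent=terraform).

  • Butane: Fedora ships a native butane package.

    sudo dnf install -y butane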

Ubuntu

  • libvirt library: use Ubuntu's packages for QEMU/libvirt and the related tools, then enable the daemon.

    sudo apt update
    sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virt-manager
    sudo systemctl enable --now libvirtd
  • Terraform: install it following HashiCorp's Ubuntu guidance (see https://developer.hashicorp.com/terraform/downloads?product_intent=terraform). A few common utilities are required when adding third-party APT repositories:

    sudo apt install -y curl gnupg software-properties-common

  • Butane: on Ubuntu, the preferred workflow is to run the official Butane container image via Docker, which avoids installing extra dependencies on the host. Make sure Docker is installed as described at https://docs.docker.com/engine/install/.

    The repository's k8s.tf prefers a native butane binary when one is available and falls back to the Docker image otherwise.
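
    For reference, the Butane-to-Ignition conversion that Terraform performs can also be run manually with the official container image; a sketch, using one of the repository's Butane files:

    docker run --rm -i quay.io/coreos/butane:release --pretty --strict < config/control-plane.bu > config/control-plane.ign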

With the above software installed, clone this repository:

git clone https://github.com/luigiqtt/dev-multinode-k8s.git

This will create a folder named dev-multinode-k8s containing the following files:

  • k8s.tf: the Terraform file describing the K8s cluster;
  • variables.tf: contains the declaration of the variables used;
  • k8s.auto.tfvars: contains the values of the variables used to create the cluster (see the Configuration chapter);
  • k8s.secret.auto.tfvars: contains the password used to access the cluster nodes via SSH (see the Configuration chapter). This file is included only as an example; since it contains potentially sensitive information, it should not be stored in version control systems;
  • Butane (.bu) files in the config folder (see the Configuration chapter);
  • create.sh: a simple example script that executes all the steps required to create the cluster (see the Cluster creation chapter).

Another prerequisite is an extracted qcow2 image of Fedora CoreOS, which can be downloaded from the Fedora CoreOS official website. Download the QEMU variant and uncompress the .xz file. The uncompressed file must be placed in the images directory of the cloned repository.
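
For example, assuming the image was downloaded to the current directory (the file name below is a placeholder; use the actual release you downloaded):

xz -d fedora-coreos-<version>-qemu.x86_64.qcow2.xz
mv fedora-coreos-<version>-qemu.x86_64.qcow2 dev-multinode-k8s/images/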

Tip: before creating a new cluster, download the latest version of Fedora CoreOS. When Fedora CoreOS starts up, it searches for updates and, if any are available, tries to install them. This could cause problems in the cluster creation process described below.

Configuration

The files in the repository can be used directly; only the name of the downloaded Fedora CoreOS image must be updated in the k8s.auto.tfvars file. The created cluster will have one Control Plane node and two worker nodes. The Container Runtime Interface (CRI) implementation is cri-o, while the Container Network Interface (CNI) provider is Kube-Router.

If you want to modify the example configuration you can act on the Terraform variables and/or the Butane files.

Terraform variables

Most of the configuration parameters are contained in the k8s.auto.tfvars and k8s.secret.auto.tfvars files. Modify them according to your requirements (see the variables.tf file for a description of the declared variables).

The k8s.secret.auto.tfvars file contains only the password of the admin user of all the nodes. It is used by Terraform to install the required software on the nodes and must match the password_hash configured in the Butane files of the nodes (see below).

If you want to change the number of worker nodes of the cluster see the Add/remove worker nodes chapter.

Butane files

The configuration of the virtual machines (the nodes of the cluster) that is applied at first boot is defined using Butane YAML-formatted files. For each VM there must be a Butane file in the config directory. Each file will be automatically converted by Terraform into an Ignition JSON file (.ign).

Important: Ignition is applied by Fedora CoreOS at first boot. Updating a .bu file and regenerating the .ign file does not change the configuration of an already-running VM. To apply Butane/Ignition changes you typically need to recreate the affected node, so it's probably better to plan ahead and get the configuration right before creating the cluster.
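
If you do need to re-apply a changed Butane configuration to an existing node, one option is to let Terraform replace just that VM and then initialize and join it again as described in the Cluster creation chapter. A sketch (the resource address below is hypothetical; run terraform state list to find the actual one):

# Find the address of the node's domain among the managed resources
terraform state list
# Recreate only that VM on the next apply (hypothetical address; adjust to the real one)
terraform apply -replace='libvirt_domain.worker[1]'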

The main parameters in the Butane files that can be safely changed before applying the configuration with Terraform are the following:

  • hostname of the node: if desired, change the content of the file /etc/hostname;

  • password_hash: hash of the password that can be used to access the cluster nodes via SSH. The content of this field can be computed using the following command (the password must be the one configured in the k8s.secret.auto.tfvars file):

    mkpasswd --method=yescrypt
  • ssh_authorized_keys (optional): list of the SSH keys that can be used to access the cluster nodes via SSH. This parameter is optional, but it is advisable to set it to speed up access to the nodes of the cluster (see: https://www.ssh.com/academy/ssh/authorized-key and the key-generation sketch after this list);

  • podSubnet (only in the clusterconfig.yml definition embedded in the control-plane.bu file): the subnet used by pods. Modify it if a different IP range is desired for the pods.
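
If you do not already have an SSH key pair, a typical way to generate one and print the public key to paste into ssh_authorized_keys (a sketch using the OpenSSH defaults):

ssh-keygen -t ed25519 -C "k8s-admin"
# Copy the printed line into ssh_authorized_keys
cat ~/.ssh/id_ed25519.pub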

Of course, you can also change the rest of the files, for example to add other useful files/scripts to the nodes, configure services, modify the initialization scripts, etc.

Cluster creation

To create a new multi-node K8s cluster on your Linux machine the required steps are the following:

  1. Initialize Terraform:

    cd dev-multinode-k8s
    terraform init
  2. Modify the configuration files if needed (see the previous chapter);

  3. Apply the configuration with Terraform:

    terraform apply

    If you want to preview the actions Terraform would take before applying the configuration, you can do so with the following command:

    terraform plan
  4. Wait a few seconds after the apply command completes to give the virtual machines time to restart;

  5. Initialize the Control Plane node: log in to the Control Plane node as the admin user (default IP: 192.168.40.162) and execute the script setup/init.sh:

    ssh admin@192.168.40.162
    cd setup
    ./init.sh
  6. Copy the join command from the script execution log on the Control Plane node. The following is an example of a join command:

    kubeadm join 192.168.40.162:6443 --token dixcvq.c5l1ogpz2ttymfs2 \
        --discovery-token-ca-cert-hash sha256:ae1f5bdd5b8521f8ee842d3efc6796a83c755276d88dd340ab5f8f98a41fb968
  7. Initialize the worker nodes and add them to the cluster: log in to each worker node as the admin user and execute the script setup/init.sh:

    ssh admin@192.168.40.xxx
    cd setup
    ./init.sh

    Then execute with sudo the join command copied at the previous step to add the worker to the cluster. For example:

    sudo kubeadm join 192.168.40.162:6443 --token dixcvq.c5l1ogpz2ttymfs2 \
        --discovery-token-ca-cert-hash sha256:ae1f5bdd5b8521f8ee842d3efc6796a83c755276d88dd340ab5f8f98a41fb968
  8. That's it! Your multi-node K8s cluster is up and running! You can check the list of the nodes by logging in to the Control Plane node and executing the command:

    kubectl get nodes
    
    NAME                STATUS   ROLES           AGE     VERSION
    k8s-control-plane   Ready    control-plane   2m23s   v1.34.3
    k8s-worker0         Ready    <none>          112s    v1.34.3
    k8s-worker1         Ready    <none>          81s     v1.34.3
    
    (or kubectl get nodes -o wide to get more details)

    If you have installed the Virtual Machine Manager on your host machine, you can also see the running nodes (virtual machines) of your new cluster in its graphical interface.

    Alternatively, you can use the virsh command. For example, the command:

    virsh list

    prints the information about the existing virtual machines (referred to as domains in libvirt):

    Id    Name                State
    ----------------------------------
    1    k8s-control-plane   running
    2    k8s-worker0         running
    3    k8s-worker1         running

Note 1: if necessary, the join command required to add a node to the cluster can be obtained on the Control Plane node using the following command:

kubeadm token create --print-join-command

Note 2: the files in the repository include a simple example script named create.sh that executes all the steps described above automatically (except for the Terraform initialization command). Note that, to work correctly, the script requires the latest available version of Fedora CoreOS to be used for the deployment. Moreover, if you have not set up SSH keys for the admin user in the Butane files, the script will prompt you for the admin user's password several times.

Note 3: if DNS doesn't work correctly, try restarting all the nodes of the cluster.
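
One way to restart every node from the host is via virsh; a sketch that reboots all running libvirt domains on the system connection (on a dedicated development machine these are just the cluster nodes):

for d in $(virsh -c qemu:///system list --name); do
    virsh -c qemu:///system reboot "$d"
done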

Note 4: if the created virtual machines do not have Internet access, here’s a checklist to troubleshoot common issues related to libvirt networking and firewall settings on the host machine:

  • Enable IP Forwarding: check if IP forwarding is enabled:

    sysctl net.ipv4.ip_forward

    If it returns 0, enable it temporarily:

    sudo sysctl -w net.ipv4.ip_forward=1

    To make it permanent, edit /etc/sysctl.conf and add:

    net.ipv4.ip_forward = 1
    

    Then apply the changes:

    sudo sysctl -p
  • Add NAT rules:

    Allow forwarding in the libvirt zone:

    sudo firewall-cmd --zone=libvirt --add-forward --permanent

    Allow traffic from the custom k8snet network (defined in the k8s.auto.tfvars file) to be forwarded through the zone:

    sudo firewall-cmd --zone=libvirt --add-source=192.168.40.160/27 --permanent

    Ensure that NAT (masquerading) is enabled on the outgoing traffic to the Internet:

    sudo firewall-cmd --add-masquerade --permanent

    Reload firewalld to apply the settings:

    sudo firewall-cmd --reload

    To ensure everything is correctly configured, you can check the status of your zones and rules:

    sudo firewall-cmd --zone=libvirt --list-all
    sudo firewall-cmd --list-all

    Ensure that forwarding is enabled for the libvirt zone and that masquerading is applied to the default zone.

  • Docker Interference:

    If Docker is installed, it sets the default policy of the iptables FORWARD chain to DROP, which blocks forwarded libvirt traffic. Fix it by running:

    sudo iptables -P FORWARD ACCEPT

    Note that this change is not persistent: it has to be repeated after a reboot or after the Docker daemon restarts.
  • DNS Issues:

    If ping 8.8.8.8 works but ping google.com fails, it's a DNS problem. Check whether systemd-resolved is interfering, or try forcing a DNS server in the Terraform configuration.
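
    A few quick checks can help narrow the problem down (a sketch; resolvectl is available where systemd-resolved is in use):

    ping -c 3 8.8.8.8        # raw IP connectivity
    ping -c 3 google.com     # name resolution
    resolvectl status        # DNS servers currently in use
    cat /etc/resolv.conf     # resolver configuration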

Note 5: depending on how libvirt is configured on your host, Terraform may need elevated privileges (e.g., sudo) or your user must be allowed to talk to the system libvirt daemon (qemu:///system). If Terraform fails with libvirt permission/connection errors, check the items below.

  • Ensure your user can access qemu:///system

    Terraform uses the system libvirt connection (qemu:///system), not the per-user one (qemu:///session). To avoid running Terraform as root, add your account to the libvirt group and refresh your session:

    sudo usermod -aG libvirt $USER
    newgrp libvirt  # or log out and back in to refresh group membership
  • Ensure QEMU can read the generated ./config/*.ign files

    If a VM fails to start with errors like Failed to open file ... Permission denied while reading ./config/*.ign, remember that the QEMU process must be able to traverse the full directory path to those files: it needs execute (x) permission on each parent directory.

    A common pitfall: if your repo lives under /home/<user>/... and /home/<user> is 700, QEMU cannot reach the .ign files even when the files themselves are 644.

    To find which parent directory blocks access, use:

    namei -l <path-to-ign-file>

    A minimal fix is to grant traverse permission via ACL on your home directory (adjust if the blocking directory is deeper than /home/$USER):

    sudo setfacl -m u:qemu:--x /home/$USER
  • Ensure the default libvirt storage pool exists (Fedora)

    On Fedora, also ensure the default libvirt storage pool exists. Since Terraform uses the system connection, check the pool on qemu:///system:

    virsh -c qemu:///system pool-list --all

    If the default pool is missing, create it with the following commands:

    sudo virsh pool-define-as --name default --type dir --target /var/lib/libvirt/images
    sudo virsh pool-build default
    sudo virsh pool-start default
    sudo virsh pool-autostart default

    (Tip: you can set export LIBVIRT_DEFAULT_URI=qemu:///system in your shell to make virsh use the system connection by default).

    Running virsh pool-list --all afterwards should show the pool as active. This step ensures Fedora hosts without a pre-existing default pool still work with the example Terraform configuration.

Test deployment

After the cluster is created and initialized, you can deploy a simple test workload.

Run the following kubectl commands on the Control Plane node (where kubectl is installed and configured by the setup scripts). Alternatively, you can run them from the host machine if you have configured kubectl there (see Control the cluster from the host machine).

  1. Verify the nodes are Ready:

    kubectl get nodes
  2. Create an NGINX deployment (2 replicas):

    kubectl create deployment nginx-test --image=nginx:latest --replicas=2 --port=80
  3. Expose it via a NodePort Service:

    kubectl expose deployment nginx-test --type=NodePort --port=80
    kubectl get svc nginx-test

    Take note of the allocated port in the PORT(S) column (e.g. 80:3xxxx/TCP).

    If you prefer, you can extract the NodePort with:

    kubectl get svc nginx-test -o jsonpath='{.spec.ports[0].nodePort}'
  4. Test from the host (or from any machine that can reach the node IPs):

    • Pick any node IP (control-plane or worker). With the default example config, worker0 is 192.168.40.163.
    • Then run:

      NODE_IP=192.168.40.163
      NODE_PORT=$(kubectl get svc nginx-test -o jsonpath='{.spec.ports[0].nodePort}')
      curl -I "http://${NODE_IP}:${NODE_PORT}"

      You should see an HTTP 200 OK response.

    Alternatively, open your browser at:

    http://192.168.40.163:<NODE_PORT>
    

    and you should get the default NGINX welcome page.

To clean up the test resources:

kubectl delete svc nginx-test
kubectl delete deployment nginx-test

Add/remove worker nodes

Adding or removing worker nodes is straightforward and can be done by following the steps described below.

Remove worker nodes

Modify the workers_count parameter in the k8s.auto.tfvars file to a smaller value, then run the Terraform apply command:

terraform apply

Note that in this way the virtual machines corresponding to the removed nodes will be destroyed, which could have unwanted impacts on the running pods. See Safely Drain a Node (in the Kubernetes documentation) for recommendations on how to properly remove a node from the cluster.
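
Before reducing workers_count, it is advisable to drain the node and remove it from Kubernetes first. A sketch, run from the Control Plane node and assuming the node to remove is k8s-worker1:

kubectl drain k8s-worker1 --ignore-daemonsets --delete-emptydir-data
kubectl delete node k8s-worker1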

Add new worker nodes

Modify the workers_count parameter in the k8s.auto.tfvars file to a higher value. If the value is greater than 3, you also need to add the required node configurations in the same file (e.g.: worker3, worker4, etc.) and create a Butane file in the config folder for each worker with an index greater than 2; the repository ships with only 3 Butane files, for a maximum of 3 workers. If, for example, you want to create a cluster with 5 worker nodes, you will have to create the files worker3.bu and worker4.bu (note that the file names must follow the format workerN.bu). Such files can have the same content as the others, with only the hostname changed, as in the sketch below.
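
For example, a sketch of creating the Butane file for a fourth worker, assuming the existing files follow the k8s-workerN hostname convention shown earlier:

cp config/worker2.bu config/worker3.bu
# Update the hostname inside the new file
sed -i 's/k8s-worker2/k8s-worker3/' config/worker3.bu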

Now you can run the Terraform apply command:

terraform apply

When the command execution terminates, follow the same steps described in the Cluster creation chapter to initialize and add the new worker nodes. To get the join command required to add the new nodes to the cluster, execute the following command on the Control Plane node:

kubeadm token create --print-join-command

Nodes configuration updates

There are two ways to modify the node configuration parameters (e.g. number of vCPUs, memory, etc.). The first is using Terraform and the second is using libvirt (through the Virtual Machine Manager or the virsh command).

The preferred method is using Terraform to keep the configuration files and the deployed infrastructure aligned. Simply modify the parameters in the configuration files, then execute the apply command (terraform apply).

However, for some modifications, applying the new configuration with Terraform may replace the existing virtual machines with new ones. This can disrupt your cluster, requiring you to reinitialize one or more nodes. This is, at least in some cases, probably due to a limitation of the current Terraform provider for libvirt that may change in future versions.

Therefore, it is recommended to first check the planned modifications with:

terraform plan

If the plan reports 0 resources to destroy, it is safe to proceed and apply the new configuration. If the number of resources to destroy is greater than 0, use libvirt instead.

Control the cluster from the host machine

If you want to control the cluster from the host machine, you should install the command-line tool kubectl and then copy the .kube/config file of the Control Plane node into the .kube directory of your home directory (create it if it does not already exist). The installation of the kubectl tool can easily be done using snap.
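
A minimal sketch of those steps, assuming the default Control Plane IP and admin user from the example configuration:

sudo snap install kubectl --classic
mkdir -p ~/.kube
scp admin@192.168.40.162:.kube/config ~/.kube/config
# The cluster should now be reachable from the host
kubectl get nodes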

Deploy and access the Dashboard

To deploy and access the Kubernetes Dashboard see: https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/.

Destroy the cluster

To destroy the cluster, removing all the created virtual machines, execute the following command:

terraform destroy
