Contributor

wnagele commented Nov 25, 2025

I've started this work to allow for the use of K8S (Kubernetes) nodes in topologies.

Containerlab supports this, but it is not obvious how best to add it here.

The main challenge I have come across is that you cannot influence the naming of the K8S nodes. This looks to be by design and won't change any time soon, so a certain naming pattern has to be followed for this to work.

For now topology configs based on this PR would look like this:

groups:
  unprovisioned:
    members: [ k01-control-plane, k01-worker, k01-worker2 ]
    module: []
    device: linux

nodes:
  k01-control-plane:
    clab:
      kind: k8s-control-plane
  k01-worker:
    clab:
      kind: k8s-worker
  k01-worker2:
    clab:
      kind: k8s-worker

Along with this, a k01-config.yml would be needed:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
  - role: worker

I am sure there are better ways to do this, but I am not familiar enough with the concepts here to suggest any. Feel free to criticise or change as you see fit.
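
To make the naming pattern concrete, here is a minimal Python sketch of how the node names in the topology line up with the roles in the cluster config. It assumes that the first node of each role gets no numeric suffix and that later ones are numbered from 2, which matches the `k01-worker` / `k01-worker2` names above. The `kind_node_names` helper is hypothetical and not part of netlab or containerlab.

```python
from collections import Counter


def kind_node_names(cluster: str, roles: list[str]) -> list[str]:
    """Return the assumed node names for a cluster, in the order the roles are listed."""
    seen: Counter = Counter()
    names = []
    for role in roles:
        seen[role] += 1
        # Assumption: the first node of a role has no suffix, the second gets "2", and so on.
        suffix = "" if seen[role] == 1 else str(seen[role])
        names.append(f"{cluster}-{role}{suffix}")
    return names


print(kind_node_names("k01", ["control-plane", "worker", "worker"]))
# ['k01-control-plane', 'k01-worker', 'k01-worker2']
```

The node names in the netlab topology would have to match this list exactly for the links to attach to the right containers.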

Collaborator

jbemmel commented Nov 25, 2025

Hi, thanks for your contribution!

Conceptually, what you're looking for is a (meta) device for k8s (like the one described as "description: Generic Cisco IOS device (meta device, used only as parent)") and then child devices for control plane and worker nodes.

We may be able to derive k8s from the linux device - @ipspace what do you think?

The config YAML would come from a Jinja2 template rendered for each node.

Owner

ipspace commented Nov 26, 2025

> Conceptually, what you're looking for is a (meta) device for k8s

No. The "meta" device is just a collection of settings, so the "real" devices can inherit from a clean slate (all IOS devices from ios.yml, all Junos devices from junos.yml) and not from another device which might have incompatible settings (for example, iol.yml from csr.yml).

> We may be able to derive k8s from the linux device - @ipspace what do you think?

Long-term, another device might be a convenient way to replace a bunch of settings. Short-term, we have other fish to fry...

Owner

ipspace commented Nov 26, 2025

> I am sure there are better ways to do this, but I am not familiar enough with the concepts here to suggest any. Feel free to criticise or change as you see fit.

Hard-coding edge cases into clab.j2 is not exactly the most scalable approach ;)

I would suggest adding "cluster" nodes (k01 in your case) to the lab topology. They will get a node ID and might get IP addresses if you attach them to a link, but who cares. Make them unprovisioned and add a config_template to them (it has to be stored in templates/linux/somename.j2 because they're Linux devices) to create the cluster config based on the other nodes in your topology.
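
As a rough illustration of the "create the cluster config based on the other nodes" idea, the sketch below derives the config from node names. In netlab this logic would live in a Jinja2 template; plain Python is used here only to show the logic. The `kind_cluster_config` function and the name-matching rule (`<cluster>-<role>` with an optional numeric suffix) are assumptions made for this example.

```python
def kind_cluster_config(cluster: str, node_names: list[str]) -> str:
    """Build a cluster config listing one role entry per matching topology node."""
    lines = ["apiVersion: kind.x-k8s.io/v1alpha4", "kind: Cluster", "nodes:"]
    prefix = cluster + "-"
    for name in node_names:
        if not name.startswith(prefix):
            continue  # the cluster node itself, or a node of another cluster
        # Assumption: strip a trailing counter, so "worker2" becomes "worker".
        role = name[len(prefix):].rstrip("0123456789")
        if role in ("control-plane", "worker"):
            lines.append(f"  - role: {role}")
    return "\n".join(lines) + "\n"


topology_nodes = ["k01", "k01-control-plane", "k01-worker", "k01-worker2", "r1"]
print(kind_cluster_config("k01", topology_nodes), end="")
```

With the node names from the example topology, this prints the same three-node config that was posted at the top of the thread.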

FWIW, I think you will need IP addresses on the "external" interfaces of the worker nodes, so in the end they can't be "unprovisioned" anymore, or you'll need some other mechanism to configure them (I just added "exec" to the list of valid clab keywords to support exactly this use case).
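
One possible shape for the exec-based approach is to turn interface data into a list of shell commands. This is only a sketch: the `exec_commands` helper is hypothetical, and the `ifname` and `ipv4` keys are assumed to be present in the interface data passed in.

```python
def exec_commands(interfaces: list[dict]) -> list[str]:
    """Return iproute2 commands that address and enable each interface."""
    commands = []
    for intf in interfaces:
        # Assumption: every interface dictionary carries 'ifname' and 'ipv4' (in CIDR form).
        commands.append(f"ip addr add {intf['ipv4']} dev {intf['ifname']}")
        commands.append(f"ip link set {intf['ifname']} up")
    return commands


for cmd in exec_commands([{"ifname": "eth1", "ipv4": "10.0.0.2/24"}]):
    print(cmd)
```

The resulting list could then be placed under the node's clab exec setting, assuming that setting takes a list of commands.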

Long-term, we could create a "kind" plugin that would create the cluster- and CP/worker nodes automatically.

Owner

ipspace commented Nov 26, 2025

I tried to create a proof-of-concept and crashed into another show-stopper: netlab doesn't like dashes in node names. How did you get around that @wnagele?

Contributor Author

wnagele commented Dec 1, 2025

@ipspace have you tried my PR with the example config I posted up here? Basically, clab handles these as external containers. So, as you said, clab/netlab can add interfaces to them, but the rest of the configuration inside them has to be done using exec directives.

Owner

ipspace commented Dec 1, 2025

> @ipspace have you tried my PR with the example config I posted up here?

Of course I did:

┌──────────────────────────────────────────────────────────────────────────────────┐
│ CREATING configuration files                                                     │
└──────────────────────────────────────────────────────────────────────────────────┘
[ERRORS]  Errors found in topology.yml
[TYPE]    groups: attribute 'topology.groups.unprovisioned.members[1]' must be a 16-character identifier,
          found str
[HINT]    FYI: a 16-character identifier is a string starting with a letter or
          an underscore and containing up to 16 letters, numbers, or underscores
          use 'netlab show attributes node_group' to display valid attributes
[FATAL]   Cannot proceed beyond this point due to errors, exiting

What am I missing?

Contributor Author

wnagele commented Dec 1, 2025

Ah, I used this to get around that:

defaults:
  const:
    MAX_NODE_ID_LENGTH: 32
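
A quick length check suggests why raising the limit helps. The sketch below assumes the limit is a simple upper bound on the name length and that the default is 16, as the error message implies. It does not reproduce netlab's actual identifier validation.

```python
names = ["k01-control-plane", "k01-worker", "k01-worker2"]


def too_long(names: list[str], limit: int) -> list[str]:
    """Return the names that exceed the given maximum length."""
    return [name for name in names if len(name) > limit]


print(too_long(names, 16))  # ['k01-control-plane'] (17 characters)
print(too_long(names, 32))  # []
```

Only the control-plane name exceeds 16 characters, so the length limit rather than the dashes appears to be what triggered the error.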

Owner

ipspace commented Dec 1, 2025

> Ah, I used this to get around that:

🤦‍♂️

Of course. I should have realized that. Will try to improve the error message.

Owner

ipspace commented Dec 1, 2025

OK, I got the cluster up and running with this topology (using unmodified clab.j2):

provider: clab

defaults:
  const:
    MAX_NODE_ID_LENGTH: 32

groups:
  unprovisioned:
    members: [ k01 ]
    module: []
    device: linux
  k-nodes:
    device: linux
    members: [ k01-control-plane, k01-worker, k01-worker2 ]

nodes:
  k01:
    clab:
      kind: k8s-kind
      startup-config: k01-config.yaml
      image: kindest/node:v1.34.0
  k01-control-plane:
    clab:
      kind: ext-container
  k01-worker:
    clab:
      kind: ext-container
  k01-worker2:
    clab:
      kind: ext-container

We could generate the startup-config with clab.config_template and then use the known file name in the clab.startup-config.

The real problem is the container name, which does not follow the standard containerlab container naming convention -- it contains no prefix -- so all netlab commands crash. We could add a clab.hostname parameter to allow users to specify their own container names, or hard-code new rules into netsim/provider/clab.py to use different names when clab.kind is set to k8s-kind or ext-container.
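
The two options could be combined roughly as in the sketch below. This is an illustration, not the code in netsim/provider/clab.py: the `container_name` function and the `hostname` key are hypothetical, and the `clab-<lab>-<node>` default assumes containerlab's standard prefix has not been changed.

```python
UNPREFIXED_KINDS = {"k8s-kind", "ext-container"}


def container_name(lab: str, node: str, clab: dict) -> str:
    """Pick the container name netlab commands should use for a node."""
    if "hostname" in clab:
        # Hypothetical clab.hostname override supplied by the user.
        return clab["hostname"]
    if clab.get("kind") in UNPREFIXED_KINDS:
        # These containers are created outside the usual naming scheme, so no prefix.
        return node
    return f"clab-{lab}-{node}"


print(container_name("k8s", "r1", {}))
print(container_name("k8s", "k01-worker", {"kind": "ext-container"}))
print(container_name("k8s", "x1", {"hostname": "custom-name"}))
```

An explicit override takes precedence over the kind-based rule here, so users could still correct a name the rule gets wrong.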

Contributor Author

wnagele commented Dec 8, 2025

I've done a bit more digging and testing around your suggestion. At this point, it looks to me like chasing a rabbit's tail.

If I modify the behaviour around the hostname, the next issue is that these containers are not using netns the way other containers from containerlab would (read - they don't have a netns).

Taking a step back, I think we have to live with the limitations outlined for K8S KinD clusters in the containerlab docs. This explains that the way to go for configuring things inside those containers is exec statements for now. Until this changes, anything we do here is likely to be brittle and prone to break in the future. So my suggestion is to go with this as a first pass; if enough folks use it, further improvements could be made.

wnagele force-pushed the dev branch 2 times, most recently from ca02ac5 to aebb735 (December 8, 2025 16:48)

Contributor Author

wnagele commented Dec 8, 2025

Rebased against your recent changes.

Owner

ipspace commented Dec 12, 2025

> If I modify the behaviour around the hostname, the next issue is that these containers are not using netns the way other containers from containerlab would (read - they don't have a netns).

🤦‍♂️

> Taking a step back, I think we have to live with the limitations outlined for K8S KinD clusters in the containerlab docs. This explains that the way to go for configuring things inside those containers is exec statements for now.

Well, with #2905/#2907 I have a mechanism to execute bash scripts (rendered from configuration templates) on container nodes.

Considering everything you discovered, I'd go with a different device ("kind") or maybe even role, for example "cluster", to get hooks that could modify stuff in the background, and work from there. An alternate idea would be to give devices the ability to request plugins (because device modules don't have the hooks necessary to modify the topology).

Give me a week or two ;)

Contributor Author

wnagele commented Dec 15, 2025

Sure, no rush - let me know if there is something I can help with.
