@travisbcotton
Contributor

Currently every repo, package, and package group install happens outside the container. This forces you to use the same package manager as the build container (e.g. if you want to use dnf then you must use dnf in the build container).

This change should make it possible to build images with zypper or dnf (and make it easier to add other package managers) regardless of the build container's package manager. The only restriction applies when the parent is "scratch": in that case the scratch-based installs are called, and they use the build container's package manager to bootstrap the image.
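
The rule described above can be sketched roughly as follows. This is only an illustration of the description, not the code in this PR: `InstallPlan`, `choose_install_mode`, and the "host"/"container" labels are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstallPlan:
    """Hypothetical record of where package installs run."""
    where: str        # "host" = from the build container, "container" = inside the image
    pkg_manager: str  # package manager that would be invoked


def choose_install_mode(parent: str, pkg_manager: str, build_pkg_manager: str) -> InstallPlan:
    """Pick where installs happen, following the rule in the PR description.

    parent            -- value of options.parent in the image config
    pkg_manager       -- value of options.pkg_manager in the image config
    build_pkg_manager -- package manager available in the build container (assumed known)
    """
    if parent == "scratch":
        # Nothing exists in the image yet, so bootstrap from the build container.
        return InstallPlan(where="host", pkg_manager=build_pkg_manager)
    # A real parent image: use the package manager requested for that image.
    return InstallPlan(where="container", pkg_manager=pkg_manager)


plan = choose_install_mode("registry.suse.com/bci/bci-base:15.7", "zypper", "dnf")
print(plan.where, plan.pkg_manager)
```

With a dnf-based build container, a zypper image built on a SUSE parent would take the "container" branch, while any `parent: 'scratch'` config would fall back to the build container's own package manager.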

@njones-lanl
Collaborator

lgtm

@davidallendj
Contributor

davidallendj commented May 21, 2025

Do you have a way to test this?

@travisbcotton
Contributor Author

travisbcotton commented May 21, 2025

Sure...
clone it:

git clone -b trcotton/layer-refactor https://github.com/OpenCHAMI/image-builder.git

change to image-builder:

cd image-builder/

build it:

podman build -t image-builder:test -f dockerfiles/dnf/Dockerfile .

create a test folder:

mkdir test

Make a test config in ./test called suse.yaml that looks like:

options:
  layer_type: 'base'
  name: 'suse-base'
  publish_tags:
    - '15.7'
  pkg_manager: 'zypper'
  parent: 'registry.suse.com/bci/bci-base:15.7'
  publish_local: true

packages:
  - cloud-init
  - python3
  - vim
  - chrony

Run it:

podman run \
  --device /dev/fuse \
  -it \
  --name image-builder \
  --rm -v $PWD/test:/data image-builder:test \
  image-build --log-level INFO --config /data/suse.yaml

@alexlovelltroy
Member

alexlovelltroy commented May 22, 2025

Not working for me.

Error:

ERROR - Error installing packages: Command '['dnf', '--setopt=reposdir=/home/builder/.local/share/containers/storage/overlay/af5c17cb3072571c3a62c5968fa673310d3341295d81c5b8dd635437183420ab/merged/etc/yum.repos.d', '--setopt=logdir=/var/tmp/image-build-_l3abmur/dnf/log', '--setopt=cachedir=/var/tmp/image-build-_l3abmur/dnf/cache', 'groupinstall', '-y', '--nogpgcheck', '--installroot', '/home/builder/.local/share/containers/storage/overlay/af5c17cb3072571c3a62c5968fa673310d3341295d81c5b8dd635437183420ab/merged', 'Minimal Install', 'Development Tools']' returned non-zero exit status 1.
INFO - 4059a424d071151a5369f3dd6d9aef0aef1d6ec986c9bcd02e9f4caccf339e4b

Image definition:

options:
  layer_type: 'base'
  name: 'rocky-base'
  publish_tags: '9.5'
  pkg_manager: 'dnf'
  parent: 'scratch'
  publish_registry: 'demo.openchami.cluster:5000/demo'
  registry_opts_push:
    - '--tls-verify=false'

repos:
  - alias: 'Rocky_9_BaseOS'
    url: 'https://dl.rockylinux.org/pub/rocky/9/BaseOS/x86_64/os/'
    gpg: 'https://dl.rockylinux.org/pub/rocky/RPM-GPG-KEY-Rocky-9'
  - alias: 'Rocky_9_AppStream'
    url: 'https://dl.rockylinux.org/pub/rocky/9/AppStream/x86_64/os/'
    gpg: 'https://dl.rockylinux.org/pub/rocky/RPM-GPG-KEY-Rocky-9'

package_groups:
  - 'Minimal Install'
  - 'Development Tools'

packages:
  - kernel
  - wget
  - dracut-live

cmds:
  - cmd: 'dracut --add "dmsquash-live livenet network-manager" --kver $(basename /lib/modules/*) -N -f --logfile /tmp/dracut.log 2>/dev/null'
    loglevel: INFO
  - cmd: 'echo DRACUT LOG:; cat /tmp/dracut.log'
    loglevel: INFO

@davidallendj
Contributor

I think I got a different error running...

podman run \
  --device /dev/fuse \
  -it \
  --name image-builder \
  --rm -v $PWD/test:/data image-builder:test \
  image-build --log-level INFO --config /data/suse.yaml

The error:

-------------------BUILD LAYER--------------------
ERROR - Trying to pull registry.suse.com/bci/bci-base:15.7...
ERROR - Getting image source signatures
ERROR - Copying blob sha256:974449b21a84067bddbf51f286c6ba10084622e04303a6933c31d6af3f7eb475
ERROR - Error: copying system image from manifest list: writing blob: adding layer with blob "sha256:974449b21a84067bddbf51f286c6ba10084622e04303a6933c31d6af3f7eb475": processing tar file(potentially insufficient UIDs or GIDs available in user namespace (requested 0:15 for /etc/shadow): Check /etc/subuid and /etc/subgid if configured locally and run "podman system migrate": lchown /etc/shadow: invalid argument): exit status 1
Error building layer: Command '['buildah', 'from', '--name', 'suse-base20250522222318', 'registry.suse.com/bci/bci-base:15.7']' returned non-zero exit status 1.

@travisbcotton travisbcotton marked this pull request as draft June 3, 2025 13:03
@alexlovelltroy
Member

Testing fails for me with ghcr.io/openchami/image-build:pr-17

chown: changing ownership of '/home/builder/config.yaml': Permission denied
Error: the file 'config.yaml' does not exist.

Confirmed working with ghcr.io/openchami/image-build:latest

@travisbcotton
Contributor Author

How are you running it?

@travisbcotton
Contributor Author

Commit ae53d30 is not going to work. You need a subuid/subgid range for the builder user in the container.
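
One way to check for such a range is to look for the user's entry in `/etc/subuid` (and likewise `/etc/subgid`), whose lines have the form `name:start:count`. The sketch below is a minimal helper under that assumption; `find_subid_range` and the sample entry are hypothetical and are not part of image-builder.

```python
from typing import Optional, Tuple


def find_subid_range(subid_text: str, user: str) -> Optional[Tuple[int, int]]:
    """Return (start, count) for `user` from /etc/subuid-style text, or None."""
    for line in subid_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, start, count = fields
        if name == user and start.isdigit() and count.isdigit():
            return int(start), int(count)
    return None


# Sample content is an assumption for illustration, not taken from the image.
sample = "builder:100000:65536\n"
print(find_subid_range(sample, "builder"))
```

Inside the container the same function could be pointed at the real files, for example `find_subid_range(open("/etc/subuid").read(), "builder")`; a `None` result would suggest the range is missing.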

@travisbcotton travisbcotton force-pushed the trcotton/layer-refactor branch 4 times, most recently from 20236bf to b5651bf Compare July 13, 2025 16:30
--cap-add=SETGID \
--security-opt seccomp=unconfined \
--security-opt label=disable \
--userns=keep-id \
Contributor

Suggested change
- --userns=keep-id \
+ --userns=keep-id:uid=1000,gid=1000 \

I've been using this to ensure my outside UID is mapped to builder so that builder can access things like 0600 files.

@treydock
Contributor

treydock commented Aug 5, 2025

Tried testing and got this:

Error: crun: the path `/entrypoint.sh` exists but it is not executable: Operation not permitted: OCI permission denied
make: *** [Makefile:43: rhel-9-base] Error 126

This was my command:

podman run --rm --device /dev/fuse --network host --userns keep-id:uid=1000,gid=1000 \
-v /home/tdockendorf/image-builder-images:/data -v /var/tmp:/var/tmp -v ~/.config/containers/auth.json:/auth.json \
ghcr.io/openchami/image-build:pr-17 image-build --config /data/rhel-9-base.yaml --log-level DEBUG

…image (not scratch) then use the package manager installed in the image

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
travisbcotton and others added 13 commits August 28, 2025 14:49
Signed-off-by: Travis Cotton <trcotton@lanl.gov>
… build container to work better with subuid/subgid mapping

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
This reverts commit 3b6813d.

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
This reverts commit 5e67b00.

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
This reverts commit eba24b2.

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
Co-authored-by: treydock <treydock@gmail.com>
Signed-off-by: Travis Cotton <trcotton@lanl.gov>
@travisbcotton travisbcotton force-pushed the trcotton/layer-refactor branch from 9de355d to 2d960dc Compare August 28, 2025 20:49
Signed-off-by: Travis Cotton <trcotton@lanl.gov>
@travisbcotton travisbcotton marked this pull request as ready for review September 16, 2025 14:27
@synackd
Contributor

synackd commented Sep 16, 2025

I built this with:

buildah bud -f dockerfiles/dnf/Dockerfile.el9 -t ghcr.io/openchami/image-build:pr17 .

and it builds. But there's a Python import error when I try running the container:

$ podman run --rm --device /dev/fuse -e S3_ACCESS=<user> -e S3_SECRET=<pass> -v ./rocky-base-9.5.yaml:/home/builder/config.yaml:Z ghcr.io/openchami/image-build:pr17 image-build --config config.yaml --log-level DEBUG
Traceback (most recent call last):
  File "/usr/local/bin/image-build", line 8, in <module>
    from layer import Layer
  File "/usr/local/bin/layer.py", line 6, in <module>
    from utils import cmd, run_playbook
  File "/usr/local/bin/utils.py", line 9, in <module>
    from ansible.config.manager import ConfigManager, Setting
ImportError: cannot import name 'Setting' from 'ansible.config.manager' (/usr/local/lib/python3.11/site-packages/ansible/config/manager.py)
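
A quick way to confirm this kind of failure, independent of the image-build entrypoint, is to ask whether the module actually exposes the name. The helper below is a hedged diagnostic sketch; `module_exports` is a hypothetical name, and nothing here asserts which ansible releases provide `Setting`.

```python
import importlib


def module_exports(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports and has an attribute named `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or failed to import.
        return False
    return hasattr(module, attr)


# Demonstrated with the standard library so the snippet runs anywhere.
print(module_exports("collections", "namedtuple"))
print(module_exports("collections", "NoSuchName"))
```

Run inside the container, `module_exports("ansible.config.manager", "Setting")` should return `False` when the installed ansible matches the traceback above.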

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
@synackd
Contributor

synackd commented Sep 17, 2025

Tests

✅ Building Rocky 9 image from scratch

options:
  layer_type: 'base'
  name: 'rocky-base'
  publish_tags: '9.5'
  pkg_manager: 'dnf'
  parent: 'scratch'
  publish_registry: '172.16.0.254:5050/test'
  registry_opts_push:
    - '--tls-verify=false'

repos:
  - alias: 'Rocky_9_BaseOS'
    url: 'http://<dist_server>/repo/pub/rocky/9/BaseOS/x86_64/os'
    gpg: 'http://<dist_server>/gpg/RPM-GPG-KEY-Rocky-9'
  - alias: 'Rocky_9_AppStream'
    url: 'http://<dist_server>/repo/pub/rocky/9/AppStream/x86_64/os'
    gpg: 'http://<dist_server>/gpg/RPM-GPG-KEY-Rocky-9'
  - alias: 'Rocky_9_CRB'
    url: 'http://<dist_server>/repo/pub/rocky/9/CRB/x86_64/os'
    gpg: 'http://<dist_server>/gpg/RPM-GPG-KEY-Rocky-9'
  - alias: 'Epel'
    url: 'http://<dist_server>/repo/pub/rocky/epel/9/Everything/x86_64'
    gpg: 'http://<dist_server>/gpg/RPM-GPG-KEY-EPEL-9'

package_groups:
  - 'Minimal Install'
  - 'Development Tools'

packages:
  - kernel
  - wget
  - dracut-live
  - kitty-terminfo

cmds:
  - cmd: 'dracut --add "dmsquash-live livenet network-manager" --kver $(basename /lib/modules/*) -N -f --logfile /tmp/dracut.log 2>/dev/null'
  - cmd: 'echo DRACUT LOG:; cat /tmp/dracut.log'

❌ Building from parent image

options:
  layer_type: 'base'
  name: 'compute-base'
  publish_tags:
    - '9.5'
  pkg_manager: 'dnf'
  parent: '172.16.0.254:5050/test/rocky-base:9.5'
  registry_opts_pull:
    - '--tls-verify=false'

  # Publish to local S3
  publish_s3: 'http://172.16.0.254:9090'
  s3_prefix: 'compute/base/'
  s3_bucket: 'boot-images'

  # Publish to SI registry
  publish_registry: '172.16.0.254:5050/test'
  registry_opts_push:
    - '--tls-verify=false'

packages:
  - cloud-init
  - python3
  - vim
  - nfs-utils
  - chrony
  - cmake3
  - dmidecode
  - dnf
  - efibootmgr
  - golang
  - ipmitool
  - jq
  - make
  - perf
  - rsyslog
  - sqlite
  - sudo
  - tcpdump
  - traceroute
  - nss_db
  - lua-posix
  - tcl
  - git
  - fortune-mod

Error:

ERROR - Error installing packages: Installer.install_package_groups() takes 2 positional arguments but 4 were given

@synackd
Contributor

synackd commented Sep 18, 2025

The call to install_package_groups here:

inst.install_package_groups(package_groups, repo_dest, proxy)

includes repo_dest and proxy as arguments. Does the definition here:

def install_package_groups(self, package_groups):

need to be updated with those?
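
The mismatch can be reproduced in isolation. In the sketch below, `OldInstaller` mirrors the two-parameter definition quoted above and `Installer` shows one possible widened signature. Both classes are stand-ins, and the `None` defaults and the returned dictionary are assumptions for illustration, not the fix that was actually committed.

```python
class OldInstaller:
    """Stand-in with the definition quoted above."""

    def install_package_groups(self, package_groups):
        return {"groups": list(package_groups)}


class Installer:
    """Stand-in with a signature that accepts the extra arguments."""

    def install_package_groups(self, package_groups, repo_dest=None, proxy=None):
        # repo_dest and proxy are accepted so the three-argument call works.
        # Returning the arguments is purely illustrative.
        return {"groups": list(package_groups), "repo_dest": repo_dest, "proxy": proxy}


package_groups = ["Minimal Install", "Development Tools"]
repo_dest = "/etc/yum.repos.d"  # hypothetical value
proxy = None

try:
    OldInstaller().install_package_groups(package_groups, repo_dest, proxy)
except TypeError as exc:
    # Counting self, four positional arguments reach a two-parameter method.
    print(exc)

print(Installer().install_package_groups(package_groups, repo_dest, proxy))
```

Keeping the new parameters optional would let existing one-argument callers continue to work, though whether that matters depends on how the method is used elsewhere in the code.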

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
Contributor

@synackd synackd left a comment

Tests above pass for DNF. LGTM.

@travisbcotton travisbcotton merged commit dac6547 into main Sep 22, 2025
2 checks passed
@synackd synackd deleted the trcotton/layer-refactor branch September 26, 2025 15:56
7 participants