Support image tar, without accessing Docker daemon #256
@lyon-v I've fixed the runners and rebased. However I think you need to run
@rnc Hi there! Apologies for the delayed response. I've fixed the code formatting and it has passed the checks now.
rnc left a comment
In general I think this is a great idea. However, I have questions and comments on some areas, and tests are also required, please. Thanks very much for the PR!
README.rst (outdated)

::

    $ python -m docker_squash.cli --input-tar source.tar --tag jboss/wildfly:squashed -f 8 --output-path squashed.tar --load-image false
I think this should be run without the -f parameter, as the log output below shows the squashed image larger than the original, which is a confusing result for a README. Also, if the docker squash and tar squash examples both show the same result, IMHO it's more intuitive.
Because the jboss/wildfly:latest image has changed:
(base) root@master:~# docker pull jboss/wildfly:latest
latest: Pulling from jboss/wildfly
f87ff222252e: Pull complete
8116b2f7ca5a: Pull complete
0b43aea4eeb1: Pull complete
13776e8da872: Pull complete
f26d32e28c29: Pull complete
Digest: sha256:35320abafdec6d360559b411aff466514d5741c3c527221445f48246350fdfe5
Status: Downloaded newer image for jboss/wildfly:latest
docker.io/jboss/wildfly:latest
(base) root@master:~# docker history jboss/wildfly:latest
IMAGE          CREATED       CREATED BY                                      SIZE     COMMENT
35320abafdec   3 years ago   /bin/sh -c #(nop) CMD ["/opt/jboss/wildfly/…    0B
               3 years ago   /bin/sh -c #(nop) EXPOSE 8080                   0B
               3 years ago   /bin/sh -c #(nop) USER jboss                    0B
               3 years ago   /bin/sh -c #(nop) ENV LAUNCH_JBOSS_IN_BACKG…    0B
               3 years ago   /bin/sh -c cd $HOME && curl -L -O https:…       270MB
               3 years ago   /bin/sh -c #(nop) USER root                     0B
               3 years ago   /bin/sh -c #(nop) ENV JBOSS_HOME=/opt/jboss…    0B
               3 years ago   /bin/sh -c #(nop) ENV WILDFLY_SHA1=238e67f4…    0B
               3 years ago   /bin/sh -c #(nop) ENV WILDFLY_VERSION=25.0.…    0B
               4 years ago   /bin/sh -c #(nop) ENV JAVA_HOME=/usr/lib/jv…    0B
               4 years ago   /bin/sh -c #(nop) USER jboss                    0B
               4 years ago   /bin/sh -c yum -y install java-11-openjdk-de…   239MB
               4 years ago   /bin/sh -c #(nop) USER root                     0B
               4 years ago   /bin/sh -c #(nop) MAINTAINER Marek Goldmann…    0B
               4 years ago   /bin/sh -c #(nop) USER jboss                    0B
               4 years ago   /bin/sh -c #(nop) WORKDIR /opt/jboss            0B
               4 years ago   /bin/sh -c groupadd -r jboss -g 1000 && user…   406kB
               4 years ago   /bin/sh -c yum update -y && yum -y install x…   33.5MB
               4 years ago   /bin/sh -c #(nop) MAINTAINER Marek Goldmann…    0B
               5 years ago   /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
               5 years ago   /bin/sh -c #(nop) LABEL org.label-schema.sc…    0B
               5 years ago   /bin/sh -c #(nop) ADD file:61908381d3142ffba…   222MB
I will fix the readme.rst
    parser.add_argument(
        "--input-tar",
        help="Path to tar file created by 'docker save'. Process tar file directly without requiring Docker daemon.",
    )
I think we should investigate using exclusive groups for argparse - as that has built in support for having either the --input-tar or image option and would avoid the manual checks below.
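A minimal sketch of the exclusive-group idea, using only the standard `argparse` API. The parser layout (a positional `image` next to `--input-tar`) and the `build_parser` helper are assumptions for illustration and may not match the project's actual CLI definition.

```python
import argparse


def build_parser():
    # Hypothetical parser: the argument names mirror the PR discussion,
    # but the real docker-squash CLI may be organised differently.
    parser = argparse.ArgumentParser(prog="docker-squash")
    source = parser.add_mutually_exclusive_group(required=True)
    # A positional argument can join an exclusive group only if it is optional,
    # hence nargs="?".
    source.add_argument("image", nargs="?", help="Image to be squashed")
    source.add_argument(
        "--input-tar", help="Path to tar file created by 'docker save'"
    )
    return parser


# Exactly one source must be given; argparse rejects both "none" and "both".
args = build_parser().parse_args(["--input-tar", "source.tar"])
print(args.image, args.input_tar)
```

With a group like this, argparse itself reports an error when both or neither source is supplied, so the manual checks would no longer be needed.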
Also, I think it's valid for output-path to be the same as input-tar (?). Should this be the default in tar mode?
Great! I have made the code changes.
    def __init__(
        self, log, tar_path, from_layer=None, tmp_dir=None, tag=None, comment=""
    ):
TarImage derives from Image (which is good) but isn't calling super. Further I think it duplicates some code from image.py (and potentially v2_image). Could there be more attempt at normalising the code to avoid duplication?
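One way the missing `super()` call could look, sketched with a stand-in base class. The `Image` class below is a hypothetical simplification and not the project's real `image.py`; only the `TarImage.__init__` signature is taken from the diff.

```python
class Image:
    """Stand-in for the real base class (hypothetical, simplified signature)."""

    def __init__(self, log, from_layer=None, tmp_dir=None, tag=None, comment=""):
        self.log = log
        self.from_layer = from_layer
        self.tmp_dir = tmp_dir
        self.tag = tag
        self.comment = comment


class TarImage(Image):
    def __init__(
        self, log, tar_path, from_layer=None, tmp_dir=None, tag=None, comment=""
    ):
        # Let the base class own the shared state instead of re-assigning it here.
        super().__init__(
            log, from_layer=from_layer, tmp_dir=tmp_dir, tag=tag, comment=comment
        )
        # Only tar-specific state stays in the subclass.
        self.tar_path = tar_path
```

Keeping shared attributes in the base constructor is also a natural first step toward removing the duplicated code mentioned in the review.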
Yes, sir. I will fix this.
- Works in CI/CD pipelines and restricted environments
- Supports both Docker format and OCI format images
- Maintains complete layer history compatibility
- Can process images on systems where Docker is not installed
I would imagine that it's helpful when working with podman as well.
Absolutely! That's a great point. The --input-tar feature is indeed very helpful for Podman users.
Since Podman uses podman save to export images in the same tar format as docker save, users can now:
    # Export image with Podman
    podman save myimage:latest -o image.tar

    # Squash with docker-squash (no Docker daemon required)
    docker-squash --input-tar image.tar --tag myimage:squashed --output-path squashed.tar

    # Import back to Podman
    podman load -i squashed.tar

This workflow is particularly valuable in environments where:
- Only Podman is available (no Docker daemon)
- Running in CI/CD pipelines with Podman
- Working in rootless containers or restricted environments
- Processing images offline without any container runtime
Should I add a Podman example to the documentation to highlight this use case?
        self.log.info("Detected Docker format image")
        self.oci_format = False
    else:
        raise SquashError("Unable to detect image format - missing manifest files")
Is this duplicating v2_image::_get_manifest ?
You're absolutely right! There is indeed duplication with v2_image::_get_manifest. Both methods:
- Check for index.json to detect OCI format
- Set self.oci_format = True/False
- Handle manifest file reading
I should refactor this to reuse the existing logic. A few options:
Option 1: Extract common logic to base class
    # In Image base class
    def detect_image_format(self):
        if os.path.exists(os.path.join(self.old_image_dir, "index.json")):
            self.oci_format = True
            return "oci"
        elif os.path.exists(os.path.join(self.old_image_dir, "manifest.json")):
            self.oci_format = False
            return "docker"
        else:
            raise SquashError("Unable to detect image format")
Option 2: Have TarImage reuse v2_image's get_manifest
    # In TarImage
    def detect_image_format(self):
        try:
            # This will set self.oci_format as a side effect
            self.manifest = self.get_manifest()  # Inherit from v2_image logic
        except SquashError:
            raise SquashError("Unable to detect image format")
I lean toward Option 1 as it's cleaner separation of concerns. What do you think?
    self.log.info(
        "💡 Tip: Consider using --tag to specify a name for your squashed image"
    )
    self.log.info(" Example: --tag myimage:squashed")
Does a tag make sense for an output tar? It is probably only relevant if --load-image has been specified?
I respectfully disagree with this assessment. The --tag parameter is meaningful for output tar files regardless of the --load-image setting. Here's why:
Tag is part of image metadata in tar format:
- Docker/Podman tar format stores tags in manifest.json under RepoTags field
- This metadata becomes part of the squashed tar file
Tag is useful in all scenarios:
- --load-image true: Image gets loaded with the specified tag
- --load-image false + --output-path: The output tar contains tag metadata, so when someone later runs docker load -i squashed.tar, the image will have the proper tag
- Distribution: Tagged tar files are more useful when shared with others
Without --tag, the consequences are significant:
    # Without tag - image loads but has no name
    $ docker load -i squashed.tar
    Loaded image ID: sha256:abc123...
    $ docker images
    REPOSITORY   TAG      IMAGE ID
    <none>       <none>   sha256:abc123...   # Hard to identify!

    # With tag - much more usable
    $ docker load -i squashed.tar
    Loaded image: myapp:squashed
    $ docker images
    REPOSITORY   TAG        IMAGE ID
    myapp        squashed   sha256:abc123...   # Clear identification

The tip message encourages good practices for tar-based workflows, not just --load-image scenarios. The tag becomes part of the portable tar artifact.
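Whether a tag is stored in an archive can be checked without a daemon. The sketch below reads `RepoTags` from the top-level `manifest.json` of a `docker save` style archive. The helper name `read_repo_tags` is made up for illustration and is not part of docker-squash, and the code assumes the archive contains a `manifest.json` at its root.

```python
import json
import tarfile


def read_repo_tags(tar_path):
    """Return every tag recorded in the archive's manifest.json."""
    with tarfile.open(tar_path) as archive:
        # extractfile raises KeyError if the archive has no manifest.json.
        manifest = json.load(archive.extractfile("manifest.json"))
    tags = []
    for entry in manifest:
        # RepoTags is null (None) for an image that was saved without a tag.
        tags.extend(entry.get("RepoTags") or [])
    return tags
```

An empty result would correspond to the untagged case shown above, where the loaded image appears as `<none>`.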
@lyon-v Did you wish to discuss any of the comments?
Sir, my apologies for the slow response. I've been swamped with work lately, but I'll reply to or fix these issues shortly.
Enable Docker Daemon-Free Image Squashing
This PR directly addresses and resolves Issue #24: "Make it possible to run squashing without accessing Docker daemon" by introducing the ability for docker-squash to directly process Docker images from tar files. This eliminates the need for a running Docker daemon, significantly enhancing flexibility for image optimization in CI/CD pipelines, air-gapped environments, and systems without Docker installed.
Key Benefits
How to Use
Export the Image:
Squash from Tar:
- --input-tar for your source image file.
- --tag is recommended for the new image name.
- --output-path specifies where to save the squashed tar file.
- --load-image false prevents the tool from attempting to load the image directly into a Docker daemon.
Load into Docker (Optional):
This enhancement significantly broadens docker-squash's utility, making image size optimization more accessible across diverse development and deployment scenarios.