Safe copying of files from local to remote #35

@sideeffect42

In cdist-ungleich an interesting discussion has come up in #331:
how can we safely and atomically transfer a file from the local machine to the remote machine?

The scope of this problem is not limited to the __file type. It also applies to other types in cdist-conf, e.g. __config_file, __dot_file, __staged_file (via __file), __download and possibly others.

I think the discussion is worth pursuing as a properly designed solution will benefit all cdist users (who doesn't use __file?).

Requirements for a solution:

  1. use non-predictable temporary file names (no race conditions),
  2. should not fail if temporary directory is size constrained,
  3. must not leave the system in a broken state if the destination has insufficient space for the file,
  4. must not leave temp files in $destination (think of applications using include * configuration),
  5. replacement of the $destination must be atomic (in the sense of being performed within a single SSH connection).

(Please extend if I forgot something.)

Proposals:

Common to all proposals is the division between code-local and code-remote: code-local copies the file to a remote temp location, code-remote moves the remote temp file to $destination and ensures the attributes are set correctly.

Copy to $TMPDIR using mktemp(1), then move to $destination.

This is how __file has worked in the past, albeit using mktemp -u, which leaves no file behind to reserve the generated name.

pros: uses non-predictable file names (if mktemp is used on the target), no temp files in $destination.
cons: fails if the file is large and $TMPDIR is size constrained (e.g. tmpfs), may leave the system in a broken state if insufficient space in $destination.
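A minimal sketch of this proposal, run against a local file system for illustration. The function name copy_via_tmpdir and the cdist.XXXXXX template are made up for this example, and mktemp accepting a template argument is an assumption about the target:

```shell
# Hypothetical helper: stage the file in $TMPDIR, then move it into place.
copy_via_tmpdir() {
    source=$1
    destination=$2

    # Assumption: the target's mktemp accepts a template argument.
    tmp=$(mktemp "${TMPDIR:-/tmp}/cdist.XXXXXX") || return 1

    # Fails here if $TMPDIR is too small for the file (e.g. a small tmpfs).
    cp "$source" "$tmp" || { rm -f "$tmp"; return 1; }

    # If $TMPDIR and $destination live on different file systems, mv has to
    # copy and delete instead of renaming, so it is not atomic and may stop
    # halfway when the destination runs out of space.
    mv -f "$tmp" "$destination"
}
```

The final mv is where both listed cons show up: it is only a cheap rename when the temp file and $destination share a file system.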

Copy to remote ${__object}/files/tempfile, then move to $destination.

This solution uses completely predictable file names, but I don't think this is a problem because cdist's run directory is root-owned and 0700.
Also the remote $__object directory is used for a single cdist run only, so there shouldn't be any collisions possible.

pros: no temp files in $destination.
cons: may fail if cdist's run directory is space-constrained, may leave the system in a broken state if insufficient space in $destination.

Copy to $destination.tmp.XXXXXX, then move to $destination.

Idea:
In code-local:

  1. Allocate a random temp file in the same directory as $destination on the target safely, using mktemp,
  2. copy file to allocated temp file,
  3. store temporary file name in remote's $__object.

Then in code-remote:

  1. Set file attributes,
  2. move from $destination.tmp.XXXXXX (as stored in $__object) to $destination.

pros: non-predictable temp names, does not use $TMPDIR, does not leave the system in a broken state if the file system of $destination has insufficient space (the copy fails before $destination is touched)
cons: may leave temp files in $destination if code-* fails for some reason.
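The two phases could be sketched roughly as follows. This is only an illustration under assumptions: remote_exec and remote_copy are hypothetical stand-ins that run locally so the sketch is self-contained (a real type would go through cdist's remote exec and copy commands), stage_file and commit_file are made-up names, the state file is kept locally here rather than in the remote's $__object, and paths containing single quotes are not handled.

```shell
# Hypothetical stand-ins so the sketch runs on one machine.
remote_exec() { sh -c "$*"; }
remote_copy() { cp "$1" "$2"; }

# code-local part: allocate a temp file next to $destination, upload the
# file into it and record the temp file name for the second phase.
stage_file() {
    source=$1
    destination=$2
    state_file=$3

    # Assumption: mktemp with a template argument exists on the target.
    tmp=$(remote_exec "mktemp '${destination}.tmp.XXXXXX'") || return 1

    if ! remote_copy "$source" "$tmp"
    then
        # Copy failed (e.g. disk full): clean up, $destination is untouched.
        remote_exec "rm -f '$tmp'"
        return 1
    fi
    printf '%s\n' "$tmp" >"$state_file"
}

# code-remote part: set attributes on the temp file first, then rename it
# over $destination. Both paths are in the same directory, so mv can use
# a plain rename.
commit_file() {
    destination=$1
    state_file=$2
    mode=$3

    tmp=$(cat "$state_file") || return 1
    chmod "$mode" "$tmp" && mv -f "$tmp" "$destination"
}
```

Setting the attributes before the rename means $destination never exists with the wrong mode. The cleanup in stage_file only covers a failed copy; a failure between the two phases would still leave the temp file behind, which is the con listed above.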

Decisions

  1. Can we rely on mktemp(1)?
    cdist is supposed to only rely on POSIX features and mktemp(1) isn't defined by POSIX.
    Furthermore mktemp(1) has no standardized interface. An invocation that works fine on Linux may fail or produce an unexpected result on e.g. OpenBSD.

    Or could we hack our own mktemp using /dev/random, tr and set -C?
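A rough sketch of what such a hand-rolled replacement might look like. The function name safe_mktemp is made up, /dev/urandom is used instead of /dev/random to avoid blocking, od is used alongside tr to turn random bytes into text, and the sketch assumes that the shell implements set -C (noclobber) with an exclusive create rather than a separate existence check:

```shell
# Hypothetical helper: create an empty, uniquely named file
# "$1.tmp.<16 hex chars>" and print its name.
safe_mktemp() {
    _dest=$1
    _tries=0

    while [ "$_tries" -lt 10 ]
    do
        # 8 random bytes rendered as 16 hex digits; tr strips od's
        # whitespace and anything else that is not a hex digit.
        _rand=$(od -An -tx1 -N8 /dev/urandom | LC_ALL=C tr -dc '0-9a-f')
        # Refuse to continue with a short (predictable) suffix.
        [ "${#_rand}" -eq 16 ] || return 1

        _tmp="${_dest}.tmp.${_rand}"

        # With set -C the redirection fails if the file already exists,
        # so a name collision is detected instead of overwriting a file.
        # The subshell keeps umask and noclobber from leaking out.
        if (umask 077; set -C; : >"$_tmp") 2>/dev/null
        then
            printf '%s\n' "$_tmp"
            return 0
        fi
        _tries=$((_tries + 1))
    done
    return 1
}
```

A caller would use it like tmp=$(safe_mktemp "$destination") and then write to "$tmp". Whether noclobber is race-free depends on the shell, so this would need checking on each supported target before relying on it.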
