
Is Puppet efficient in synchronizing large files?


How efficient is Puppet at handling large files? To give you a concrete example:

Let's assume we're dealing with configuration data (stored in files) in the order of gigabytes. Puppet needs to ensure that the files are up-to-date with every agent run.

Question: Does Puppet perform some kind of file digest operation beforehand, or does it blindly copy every config file on each agent run?


Solution

  • When using file { 'name': source => <URL> }, the file content is not sent over the network unless there is a checksum mismatch between master and agent. The default checksum type is md5 (a short sketch follows below).

    Beware of the content property of the file type. Its value becomes part of the catalog, which is sent to the agent on every run, so don't assign it the contents of large files via the file() or template() functions.

    So yes, you can technically manage files of arbitrary size through Puppet. In practice, I try to avoid it, because all of Puppet's files should be part of a Git repository or similar, and you don't want to push tarballs in there. Puppet can deploy them by other means (packages, HTTP, ...).
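
    A minimal sketch of the pattern described above; the module name (myapp), paths, and file names are hypothetical placeholders, not from the original answer:

        # Sync a large file from the module's files/ directory.
        # On each run the agent computes the local file's md5 digest and
        # compares it with the master's; the file body is only transferred
        # when the digests differ.
        file { '/etc/myapp/large-config.dat':
          ensure   => file,
          source   => 'puppet:///modules/myapp/large-config.dat',
          checksum => 'md5',   # the default; 'sha256', 'mtime', etc. are also accepted
          owner    => 'root',
          group    => 'root',
          mode     => '0644',
        }

        # Anti-pattern for large files: file() inlines the whole content into
        # the catalog, so it travels to the agent on every run regardless of
        # whether anything changed.
        # file { '/etc/myapp/large-config.dat':
        #   ensure  => file,
        #   content => file('myapp/large-config.dat'),
        # }

    With the source/checksum variant, an unchanged run costs only a digest comparison, which is why the answer recommends source over content for large files.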