Puppet computing hash of (massive) unmanaged files


I am managing users with Puppet, with managehome set to true. Each home directory is then populated with a few files (two dot files in my case).

user { 'guillaume':
  ensure     => present,
  managehome => true,
}

file {'/home/guillaume':
  ensure  => present,
  purge   => false,
  recurse => true,
  source  => "puppet:///modules/${module_name}/home/${title}",
}

It is all fine and dandy, but I ended up putting a 25GB file in my home dir, and Puppet computed a hash of it (at least that is my understanding: I could see from strace that Puppet read the whole file). A full Puppet run, which in theory should finish in less than a minute, took about 20 minutes. Removing the file made Puppet run fast again, confirming my guess.

Why would Puppet compute a hash of an unmanaged file, and how can I keep such a (legit) file in a managed directory from sabotaging Puppet runs?


Solution

  • Puppet computes the checksum of that file because you are managing the contents of the entire directory recursively, and the file is part of the directory's contents. There are a couple of ways to change your Puppet resources to avoid computing this checksum.

    The first is to just manage the two hidden files directly:

    user { 'guillaume':
      ensure     => present,
      managehome => true,
    }
    
    file { '/home/guillaume/.file_one':
      ensure  => file,
      source  => "puppet:///modules/${module_name}/home/.file_one",
      require => User['guillaume'],
    }

    file { '/home/guillaume/.file_two':
      ensure  => file,
      source  => "puppet:///modules/${module_name}/home/.file_two",
      require => User['guillaume'],
    }
    

    Note that above I also replaced the non-specific ensure value on the file resources (present) with an explicit one (file), and added the missing require metaparameter so that the file resources depend on the user resource.
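
    If more dot files are added later, the same pattern could be written as a loop instead of one resource per file. This is only a sketch, assuming Puppet 4 or later (for the each function) and assuming the user resource shown above is declared in the same catalog; the $dotfiles variable is a name introduced here for illustration:

    # Hypothetical list of the dot files shipped under the module's files/home directory.
    $dotfiles = ['.file_one', '.file_two']

    # Declare one file resource per dot file; nothing else in the home directory is managed.
    $dotfiles.each |String $dotfile| {
      file { "/home/guillaume/${dotfile}":
        ensure  => file,
        source  => "puppet:///modules/${module_name}/home/${dotfile}",
        require => User['guillaume'],
      }
    }

    Because each file is declared individually, Puppet only manages the files named in the list, so a large unrelated file in the home directory is never read.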

    The second solution is to keep the directory resource but recurse only into the source directory, not into the local one. Files in the home directory that do not come from the source attribute are then ignored. You achieve this by setting the recurse attribute to remote:

    user { 'guillaume':
      ensure     => present,
      managehome => true,
    }
    
    file {'/home/guillaume':
      ensure  => directory,
      recurse => remote,
      source  => "puppet:///modules/${module_name}/home/guillaume",
      require => User['guillaume'],
    }
    

    Note that this makes the same fixes as the above solution.
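
    The ${title} in the question's source path suggests the resources may live inside a defined type. A minimal sketch of how the recurse => remote variant might look there, assuming a hypothetical module named mymodule, a hypothetical define named mymodule::managed_home, and one directory per user under the module's files/home:

    # mymodule/manifests/managed_home.pp (hypothetical module and define names)
    define mymodule::managed_home () {
      user { $title:
        ensure     => present,
        managehome => true,
      }

      # Recurse only into the source directory; local files that are not
      # present in the source are left alone.
      file { "/home/${title}":
        ensure  => directory,
        recurse => remote,
        source  => "puppet:///modules/${module_name}/home/${title}",
        require => User[$title],
      }
    }

    It would then be declared once per user, for example:

    mymodule::managed_home { 'guillaume': }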

    Some helpful documentation:
    https://puppet.com/docs/puppet/5.3/types/file.html
    https://puppet.com/docs/puppet/5.3/metaparameter.html#require