
Why would one use "resource collectors"?


I was looking at the OpenStack modules on Puppet Forge. These modules make use of "resource collectors" so I was reading up on "resource collectors" here: https://docs.puppet.com/puppet/latest/reference/lang_collectors.html

I still cannot figure out why one would need to use a resource collector.

Here's an example where the OpenStack/puppet-keystone module uses a resource collector:

if !is_service_default($memcache_servers) or !is_service_default($cache_memcache_servers) {
    Service<| title == 'memcached' |> -> Anchor['keystone::service::begin']
} 

I guess this does resource ordering, causing the memcached service resource to be applied before the keystone::service::begin Anchor. I don't really know what an Anchor is. I am guessing it's used for resource ordering?


Solution

  • Resource collectors have several uses:

    • They realize virtual resources, or they collect exported ones, depending on the form of the collector. The realize() function can also realize virtual resources, but collectors are the only way to collect exported resources. See also below.
    • They can be used in chain expressions, such as your example demonstrates, to set ordering constraints.
    • They can be used to override resource parameters.
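
The three uses above can be sketched roughly as follows. This is a minimal illustration, not code from any OpenStack module: the resource titles, the 'ops', 'motd_fragment' and 'webstack' tags, and the file path are all made up for the example, and collecting exported resources assumes PuppetDB (storeconfigs) is set up.

```puppet
# A virtual resource: declared, but not added to the catalog until realized.
@user { 'deploy':
  ensure => present,
  tag    => 'ops',
}

# Use 1a: realize every virtual User resource carrying the (hypothetical) 'ops' tag.
User <| tag == 'ops' |>

# Use 1b: export a resource from this node, then collect exported resources
# from all nodes. Only the <<| |>> form can do the collecting.
@@file { "/etc/motd.d/${facts['networking']['fqdn']}":
  ensure  => file,
  content => "managed node\n",
  tag     => 'motd_fragment',
}
File <<| tag == 'motd_fragment' |>>

# Use 2: a collector as an operand of a chain expression (ordering).
Package <| tag == 'webstack' |> -> Service <| title == 'nginx' |>

# Use 3: override a parameter on every matching resource.
File <| tag == 'motd_fragment' |> {
  mode => '0644',
}
```

Note that a `<| |>` collector used in a chain or an override also realizes any matching virtual resources as a side effect.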

    In each of these uses, collectors have properties that at times make them particularly convenient, among them:

    • Collectors operate over all matching resources in the catalog, including any that have not yet been declared at the time the collector expression is evaluated, regardless of the locus or scope of the declarations.
    • Collectors support filter predicates (e.g. title == 'memcached') that help fine-tune which resources are collected. This can be particularly useful when combined with tags.
    • Collectors can collect zero resources, and that's ok.
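
As a small sketch of the first two properties, the chain below is written before the resources it matches, and it selects them by tag rather than by title. The 'monitoring' tag and the collectd resources are assumptions chosen for illustration.

```puppet
# The collectors are resolved late in compilation, so they still match
# the package and service declared further down in the manifest.
Package <| tag == 'monitoring' |> -> Service <| tag == 'monitoring' |>

package { 'collectd':
  ensure => installed,
  tag    => 'monitoring',
}

service { 'collectd':
  ensure => running,
  enable => true,
  tag    => 'monitoring',
}
```

Any other package or service that picks up the same tag elsewhere in the catalog would be included in the ordering as well, without this chain having to name it.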

    The last seems to be the point of the specific example you presented: since there can be at most one Service having title == 'memcached', the overall expression causes that service to be synced before Anchor['keystone::service::begin'] if it is included in the catalog, but it has no effect if no such service is declared, independent of manifest parse order. I don't think there is any other parse-order-independent way to accomplish this.
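
To make the contrast concrete, here is a hedged sketch using a hypothetical class and anchor title (not the real keystone code), and assuming the anchor resource type is available from puppetlabs-stdlib.

```puppet
# Hypothetical class, for illustration only.
class profile::app {
  # Assumes the anchor type from puppetlabs-stdlib is installed.
  anchor { 'app::service::begin': }

  # Compiles whether or not a memcached service is in the catalog;
  # with zero matches the chain simply adds no relationship.
  Service <| title == 'memcached' |> -> Anchor['app::service::begin']

  # By contrast, a plain resource reference makes catalog compilation
  # fail when Service['memcached'] is not declared anywhere:
  # Service['memcached'] -> Anchor['app::service::begin']
}
```

Guarding the reference with the defined() function would avoid the failure, but that check depends on evaluation order, which is what the collector form sidesteps.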