I will try a brief version first, then i can add more information as requested.
I have a client machine with the following configuration:
------------------------------------------------------------
Connected to puppet-client-10 as root
Debian 7.8 wheezy (amd64)
------------------------------------------------------------
FQDN : puppet-client-10.mydomain
IP : 161.148.1.10
PuppetMaster: puppet-master.mydomain
Puppet : 3.7.5
Facter : 2.2.0
------------------------------------------------------------
Connecting to the below puppetmaster:
------------------------------------------------------------
Connected to puppet-master as root
Debian 7.8 wheezy (amd64)
------------------------------------------------------------
FQDN : puppet-master.mydomain
IP : 161.148.1.1
Puppet : 3.7.5
Facter : 2.4.3
------------------------------------------------------------
Now, back to the client. I used to have the agent disabled, and checking updates via cron once a day.
6 22 * * * root /usr/bin/puppet agent --test --logdest syslog
Works flawlessly.
2 days ago i commented the cron job and enabled the agent to check for updates every hour.
Then, the logs started showing this line every 2 minutes
<27>1 2015-05-20T08:20:30.651767-03:00 puppet-client-10 puppet-agent 8072 - - Could not request certificate: getaddrinfo: Name or service not known
<27>1 2015-05-20T08:22:30.668988-03:00 puppet-client-10 puppet-agent 8072 - - Could not request certificate: getaddrinfo: Name or service not known
Also, is showing that the client is correctly checking the master for updates
<28>1 2015-05-20T08:23:44.927447-03:00 puppet-client-10 puppet-agent 31500 - - Loading class elasticsearch
<28>1 2015-05-20T08:23:45.406158-03:00 puppet-client-10 puppet-agent 31500 - - Loading class logstash
<28>1 2015-05-20T08:23:45.776948-03:00 puppet-client-10 puppet-agent 31500 - - Loading class logrotate
<28>1 2015-05-20T08:23:46.204161-03:00 puppet-client-10 puppet-agent 31500 - - Loading class puppet
And then, back to the getaddrinfo error every 2 minutes
<27>1 2015-05-20T08:24:30.676307-03:00 puppet-client-10 puppet-agent 8072 - - Could not request certificate: getaddrinfo: Name or service not known
<27>1 2015-05-20T08:26:30.683570-03:00 puppet-client-10 puppet-agent 8072 - - Could not request certificate: getaddrinfo: Name or service not known
It keeps alternating between the error (every 2 minutes) and success (every hour) messages.
Executing the command puppet agent --test
works, as expected.
The problem seems to be on the agent.
Any hints?
i would guess it is because your puppet master isn't named "puppet". Also I'd check what user the puppet agent you now have running is running as, probably not root I'd guess – Vorsprung
it is named puppet-master
, also puppet-master.mydomain
, and with the below alt names
# puppet cert list puppet-master.mydomain
+ "puppet-master.mydomain" (SHA256) F2:54:03:9C
(alt names: "DNS:puppet", "DNS:puppet.mydomain", "DNS:puppet-master.mydomain")
It is running as root
# ps aux | grep puppet
root 1763 0.0 0.2 133776 45236 ? Ssl Mai19 0:07 /usr/bin/ruby /usr/bin/puppet agent
root 8072 0.0 0.2 194580 40144 ? Ssl Mai19 0:02 /usr/bin/ruby /usr/bin/puppet agent
Right now, 8072
above is the process spamming the error line.
Should i really have 2 processes running?
The error indicates an issue resolving a hostname to an IP, but given it succeeds every hour and also succeeds manually I don't think you have any configuration issues with your name resolution.
You should only have a single puppet-agent process running, I would stop the puppet-agent service, ensure that all of the processes have been killed, restart the puppet-agent service and ensure that only one process is running.
My bet is on one of those processes doing something silly.