Problem disabling monitors or removing hosts #25
Description
I am running Foreman+Puppet+PuppetDB on the same host.
There are strange cases where services or hosts are left dangling in Nagios after being disabled or removed from the target node's classification:
Scenario 1:
A node has a monitored service such as OpenSSH, and Puppet creates the Nagios service monitor for that node. Suppose that, because of an infrastructure firewall, Nagios can no longer reach node:tcp_22, so I set openssh_monitor = false. The Puppet run on the Nagios server then produces the following output:
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_process.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_process.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_process.cfg]/mode: mode changed '644' to '755'
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_tcp_22.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_tcp_22.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest.example.edu-openssh_tcp_22.cfg]/mode: mode changed '644' to '755'
After this, the Puppet run completes without restarting or reloading the nagios service. Even if it had, the service files are still readable by the nagios user and would still be parsed on restart.
Scenario 2:
A node is removed from Foreman+Puppet+PuppetDB via Foreman Delete + puppet node deactivate. The Nagios host and service monitors are left hanging on the Nagios server. The Puppet run on the Nagios server looks similar to the above, but also changes the host config:
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-00-baseservices.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-00-baseservices.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-00-baseservices.cfg]/mode: mode changed '644' to '755'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-nrpe_process.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-nrpe_process.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-nrpe_process.cfg]/mode: mode changed '644' to '755'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_process.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_process.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_process.cfg]/mode: mode changed '644' to '755'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_tcp_22.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_tcp_22.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/services/buildtest2.example.edu-openssh_tcp_22.cfg]/mode: mode changed '644' to '755'
notice: /File[/etc/nagios/auto.d/hosts/buildtest2.example.edu.cfg]/owner: owner changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/hosts/buildtest2.example.edu.cfg]/group: group changed 'root' to 'nagios'
notice: /File[/etc/nagios/auto.d/hosts/buildtest2.example.edu.cfg]/mode: mode changed '644' to '755'
notice: Finished catalog run in 3.60 seconds
Again, the Puppet run does not restart or reload the nagios service.
The only resolution is to manually delete the modified files and reload Nagios.
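For reference, the manual cleanup can be scripted roughly as follows. This is only a sketch of the workaround, not a fix for the module: it assumes the layout seen in the logs above (an auto.d directory with hosts/<fqdn>.cfg and services/<fqdn>-<check>.cfg), and the function names and the service filter are my own. The reload command at the end is also an assumption about the init system.

```python
import glob
import subprocess
from pathlib import Path


def find_stale_configs(auto_dir, hostname, service=None):
    """List generated Nagios cfg files for one host.

    Assumes the layout from the Puppet logs:
      <auto_dir>/hosts/<hostname>.cfg
      <auto_dir>/services/<hostname>-<check>.cfg
    If `service` is given (e.g. "openssh"), only service files named
    <hostname>-<service>_*.cfg are returned and the host file is skipped.
    """
    base = Path(auto_dir)
    host = glob.escape(hostname)  # keep odd characters from acting as wildcards
    found = []

    if service is None:
        host_cfg = base / "hosts" / f"{hostname}.cfg"
        if host_cfg.is_file():
            found.append(host_cfg)
        pattern = f"{host}-*.cfg"
    else:
        pattern = f"{host}-{glob.escape(service)}_*.cfg"

    services_dir = base / "services"
    if services_dir.is_dir():
        found.extend(sorted(services_dir.glob(pattern)))
    return found


def remove_stale_configs(auto_dir, hostname, service=None, dry_run=True):
    """Delete the files found above; with dry_run=True nothing is removed."""
    stale = find_stale_configs(auto_dir, hostname, service)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


def reload_nagios():
    """Reload Nagios; assumes a SysV-style `service` command is available."""
    subprocess.run(["service", "nagios", "reload"], check=True)
```

For Scenario 2 this would be something like remove_stale_configs("/etc/nagios/auto.d", "buildtest2.example.edu", dry_run=False) followed by reload_nagios(). For Scenario 1, passing service="openssh" limits the deletion to that host's openssh_* service files. Running it with the default dry_run=True first shows which files would be deleted.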