munin at CAL

munin is a system resource graphing tool with a plugin architecture. We use it at CAL to monitor and track load levels on the CAL servers.

Systems currently monitored include sedna and uranus.

Viewing the load levels is restricted to a select group of CAL users at the moment. If you think you should be able to see them, please open a ticket to request these privileges.

Architecture

munin, the host statistics gatherer, runs as a cron job on uranus. It periodically polls the munin-node instances running on the monitored hosts, stores the data in a convenient historical format, and builds a static directory of HTML pages, including graphs. This directory is stored in the filesystem on uranus and served as static HTML pages via apache2, which is configured to require authenticated (Kerberos) access and to limit that access to a select group of users.
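To make the polling step concrete, here is a minimal sketch of what one poll of a munin-node looks like on the wire. It is not how munin itself is implemented; it only illustrates the plain-text protocol a node speaks (a banner line, then a "fetch <plugin>" command answered by "<field>.value <number>" lines and a lone "."). The function names parse_fetch and fetch_plugin are hypothetical, and the sketch assumes simple numeric values (or "U" for unknown).

```python
import socket


def parse_fetch(lines):
    """Parse the reply to a munin-node `fetch` command into {field: value}.

    Assumes each data line looks like "<field>.value <number>" and that a
    lone "." ends the reply. "U" (unknown) is mapped to None.
    """
    values = {}
    for line in lines:
        line = line.strip()
        if line == ".":
            break  # end of reply; ignore anything after it
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/banner lines
        key, _, value = line.partition(" ")
        if key.endswith(".value"):
            field = key[: -len(".value")]
            values[field] = None if value == "U" else float(value)
    return values


def fetch_plugin(host, plugin, port=4949, timeout=10.0):
    """Connect to a munin-node, fetch one plugin, and return parsed values."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="ascii", errors="replace")
        reader.readline()  # discard the "# munin node at ..." banner
        sock.sendall(f"fetch {plugin}\n".encode("ascii"))
        reply = []
        while True:
            line = reader.readline()
            if not line:
                break  # connection closed by the node
            reply.append(line)
            if line.strip() == ".":
                break
        sock.sendall(b"quit\n")
    return parse_fetch(reply)
```

Run from the monitoring server, a call such as fetch_plugin("sedna", "load") would return a dictionary of the load plugin's current values, assuming that plugin is enabled on the node. From any other machine the connection should be refused or dropped, for the reasons described in the next paragraph.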

The munin-node instances run on each of the monitored hosts and listen on port 4949, which must be opened in the shorewall config. They are configured to respond only to the IP address of the monitoring server. Finally, all traffic between the monitored machines and the monitoring server is wrapped in IPsec's ESP encryption/authentication layer, so that an attacker on a compromised network can't tamper with the data or abuse the munin-node servers.
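The IP restriction is normally expressed in munin-node.conf as "allow" lines, each holding a regular expression that the peer address must match. The sketch below mimics that check in Python to show why the pattern should be anchored and have its dots escaped. The helper name is_allowed is hypothetical, and 192.0.2.10 is a documentation-range placeholder, not the real address of the monitoring server. munin-node uses Perl regular expressions, but for patterns this simple Python's re module behaves the same way.

```python
import re


def is_allowed(peer_ip, allow_patterns):
    """Return True if peer_ip matches any of the allow patterns."""
    return any(re.search(pattern, peer_ip) for pattern in allow_patterns)


# Placeholder patterns: localhost plus a made-up monitoring server address.
ALLOW = [r"^127\.0\.0\.1$", r"^192\.0\.2\.10$"]

# A sloppy pattern: unescaped dots match any character, and without
# anchors the pattern can match in the middle of a longer address.
SLOPPY = [r"192.0.2.10"]

print(is_allowed("192.0.2.10", ALLOW))    # the monitoring server itself
print(is_allowed("192.0.2.100", ALLOW))   # a different host, rejected
print(is_allowed("192.0.2.100", SLOPPY))  # wrongly accepted by the sloppy pattern
```

The third call shows the failure mode: a loosely written pattern would let a neighbouring address talk to the node, which is why each allow line should be anchored with ^ and $.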