Bull GNU/Linux NFSv4 project

Administration of NFS and NFSv4 for Linux:

NFSv4 Monitoring

Version 0.2

2004/11/19

Author:
Frederic Jolly <frederic.jolly@ext.bull.net>

Reviewer:
Tony Reix <tony.reix@bull.net>

Change summary

Version Number    Date of Revision
0.2               2004/11/19
0.1               2004/11/15

Reference documents

Nagios Homepage: http://www.nagios.org/
Nagios plugin development page: http://sourceforge.net/projects/nagiosplug/
Nagios plugin development guidelines: http://nagiosplug.sourceforge.net/developer-guidelines.html
Network Monitoring with Nagios and MRTG: http://nagios.sourceforge.net/download/contrib/documentation/misc/monitoring_nagios.doc


1. Requirements

We have identified several useful pieces of information that could be monitored on a typical network using NFS.
This information can be separated into three sets:

We will show how to implement the monitoring of all this information in Nagios.

2. Nagios presentation

2.1 Nagios architecture

Nagios is a monitoring tool widely used on Linux networks.
Nagios is built on a server/agent architecture.
Typically, a Nagios server runs on one host, and plugins run on all the remote hosts that need to be monitored. These plugins send information to the server, which displays it in a GUI.

So Nagios is composed of three parts:

A soft alert is raised when a plugin returns a warning or an error.
In the GUI, a green button then turns red and a sound is played.
When the soft alert has been raised a configurable number of times, a hard alert is raised and the Nagios server sends notifications (email, SMS, ...).
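
The escalation from soft to hard alerts can be illustrated with a small sketch. This is a simplified model of the behaviour described above, not the Nagios implementation; the class name ServiceState and the parameter max_attempts are hypothetical names chosen for this example, and failures are assumed to be counted consecutively.

```python
from dataclasses import dataclass

OK = 0  # plugin exit code meaning "no problem"


@dataclass
class ServiceState:
    max_attempts: int  # hypothetical name for the configurable retry count
    failures: int = 0  # failed checks seen since the last OK result

    def record(self, exit_code):
        """Record one plugin result and return 'ok', 'soft' or 'hard'."""
        if exit_code == OK:
            self.failures = 0
            return "ok"
        self.failures += 1
        if self.failures >= self.max_attempts:
            return "hard"  # this is the point where notifications would go out
        return "soft"
```

With max_attempts set to 3, the first two failing checks yield a soft alert and the third yields a hard alert; any OK result resets the counter.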

2.2 Plugins

A plugin is a small program (in Perl, C, Python, ...) that checks a service (a daemon, the free space on a disk, ...). It must return an exit code and a short line of text (Nagios only reads the first line of output).

Output should be in the format: METRIC STATUS: information text|performance data
The STATUS is one of OK, WARNING, CRITICAL or UNKNOWN, and corresponds to the plugin's return value: 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

The warning and critical thresholds are parameters, set by the user, passed as arguments to the plugin.

A plugin can also return performance data in the format: "label1=value1 label2=value2 ..."
These data are stored by Nagios and can later be displayed with MRTG (http://people.ee.ethz.ch/~oetiker/webtools/mrtg/)
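
As an illustration of these conventions, here is a minimal plugin skeleton in Python. It is only a sketch: the metric name EXAMPLE, the script name check_example.py and the "higher value is worse" threshold rule are assumptions made for this example.

```python
# Exit codes expected by Nagios from a plugin.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warning, critical):
    """Map a measured value to an exit code (assumes higher is worse)."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK


def format_output(metric, code, text, perfdata):
    """Build the single output line: 'METRIC STATUS: text|label=value ...'."""
    perf = " ".join(f"{label}={val}" for label, val in perfdata.items())
    return f"{metric} {STATUS_NAMES[code]}: {text}|{perf}"


def main(argv):
    # Hypothetical usage: check_example.py VALUE WARNING CRITICAL
    try:
        value, warning, critical = (float(a) for a in argv[1:4])
    except ValueError:
        # Missing or non-numeric arguments: report UNKNOWN rather than guess.
        print("EXAMPLE UNKNOWN: usage: check_example.py VALUE WARNING CRITICAL")
        return UNKNOWN
    code = evaluate(value, warning, critical)
    print(format_output("EXAMPLE", code, f"value is {value}", {"value": value}))
    return code
```

A real plugin would end with sys.exit(main(sys.argv)) so that the exit code reaches Nagios, and would replace the VALUE argument with an actual measurement.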

The plugins can be run:

3. Implementation in Nagios

3.1 Monitoring

This covers both classic monitoring (daemons up or down) and per-host checks on the NFS mounts and exports.
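
A per-host mount check could look like the following sketch. It covers only the mount part of this section, and assumes that the plugin reads the Linux mount table from /proc/mounts and is given a list of expected mount points; the function names are hypothetical.

```python
def nfs_mounts(mounts_text):
    """Return (device, mount_point, fstype) for each NFS entry in the mount table."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            found.append((fields[0], fields[1], fields[2]))
    return found


def missing_mounts(expected, mounts_text):
    """Return the expected mount points that are not currently NFS-mounted."""
    mounted = {mount_point for _, mount_point, _ in nfs_mounts(mounts_text)}
    return sorted(set(expected) - mounted)
```

On a monitored host, mounts_text would come from open("/proc/mounts").read(); a non-empty result from missing_mounts would typically map to a CRITICAL status.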

3.2 Performance

The main complaint about NFS on a large network is: "It's lagging".
Performance must therefore be checked in order to determine whether an NFS host really is lagging, and which one.
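
One possible way to find out which host is lagging is to time a cheap operation on each NFS mount point. The sketch below uses os.stat() as the probe; this choice is an assumption of the example, and client-side attribute caching may hide part of the server latency, so the numbers should be read as indicative only.

```python
import os
import time


def stat_latency(path, samples=5):
    """Return the slowest os.stat() call on path, in seconds."""
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        os.stat(path)
        worst = max(worst, time.perf_counter() - start)
    return worst


def slowest_mount(latencies):
    """Given {mount_point: seconds}, return the (mount_point, seconds) pair with the highest latency."""
    return max(latencies.items(), key=lambda item: item[1])
```

The measured latency can then be compared with the warning and critical thresholds and reported as performance data, one label per mount point.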

3.3 Global view

We have seen that with NFS it is sometimes difficult to keep track of all the exports and mounts on a large network with many NFS servers. It would be useful to create a tool (a Nagios view? an HTML view?) that displays a multi-host tree view or a multi-host table view of the network:
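
As a rough sketch of such a tree view, the mounts collected on each client can be grouped by server and export. The input format (a dictionary mapping each client name to its list of "server:/export" devices) and the function names are assumptions of this example; server addresses that themselves contain a colon are not handled.

```python
from collections import defaultdict


def build_tree(mounts_by_client):
    """Group mounts as {server: {export: [clients, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for client, devices in mounts_by_client.items():
        for device in devices:
            # A device looks like 'server:/export'; split at the first colon.
            server, _, export = device.partition(":")
            tree[server][export].append(client)
    return tree


def render(tree):
    """Render the tree as indented text, one server per block."""
    lines = []
    for server in sorted(tree):
        lines.append(server)
        for export in sorted(tree[server]):
            clients = ", ".join(sorted(tree[server][export]))
            lines.append(f"  {export}  <- {clients}")
    return "\n".join(lines)
```

The same grouped structure could feed either an HTML table or a tree widget.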

Other information may then be added:

4. Other developments

Some checks cannot be done in Nagios or in any other monitoring tool, mainly because they must be performed just before or during mounting.
They could instead be implemented, for example, in the Mount module of Webmin (http://webmin.com):


Page maintained by: Frederic Jolly
Last update: 2005, February 11