How the Nagios monitoring tool tracks IT environment details

Nagios users can monitor diverse components of an IT infrastructure, across Linux and Windows OSes, networks and servers, so long as they roll up their sleeves and dig into agents and plug-ins.

IT monitoring depends on software tools that probe, analyze, predict and report on what is often an extensive, heterogeneous environment.

The Nagios monitoring tool operates within IT infrastructures to oversee servers, applications, services, network devices and other components. It offers detailed reports and alerting so that administrators can determine when problems occur and when operations are back to spec. Nagios provides a high level of extensibility via plug-ins.

Nagios Core is an open source distribution focused on the essential tasks of infrastructure monitoring and support for plug-ins. Administrators can also define parent/child device mappings to help troubleshoot. Plug-ins create customized service checks and visualizations.
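The value of parent/child mappings for troubleshooting can be illustrated with a short sketch. It is a simplified model of reachability, not Nagios Core's actual logic: a host that fails its check is reported as DOWN if at least one parent is still up, and as UNREACHABLE if every parent has failed too. The host names and the `classify` helper are hypothetical.

```python
def classify(host, parents, up):
    """Classify a host as UP, DOWN or UNREACHABLE.

    parents: dict mapping a host to the list of its parent hosts
    up:      dict mapping a host to True if it answered its last check
    """
    if up[host]:
        return "UP"
    host_parents = parents.get(host, [])
    # If every parent has failed as well, the problem is probably upstream.
    if host_parents and all(classify(p, parents, up) != "UP" for p in host_parents):
        return "UNREACHABLE"
    return "DOWN"


# Hypothetical topology: monitoring server -> router1 -> switch1 -> web01
parents = {"switch1": ["router1"], "web01": ["switch1"]}
up = {"router1": True, "switch1": False, "web01": False}

for name in ("router1", "switch1", "web01"):
    print(name, classify(name, parents, up))
```

In this example, only switch1 is flagged as DOWN, which points administrators at the device that actually failed rather than at everything behind it.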

Nagios XI builds on Nagios Core to provide a commercial distribution fit for production IT deployment. Nagios XI has configuration wizards and enables admins to customize their interface's dashboards and views. Reporting and visualization capabilities support granular and detailed infrastructure monitoring activities, as well as notification escalation. The enterprise edition of Nagios XI includes capacity planning, audit logging, service-level agreement reports, bulk functionality and automation capabilities.

Use Nagios agents for the heavy lifting

Nagios is capable of agentless monitoring, but most users rely on agents to collect and deliver detailed data about servers, services and applications. Agents help detect problems, which enhances the organization's troubleshooting efforts.

Major Nagios agents include Nagios Cross-Platform Agent, NSClient++, Nagios Remote Plugin Executor and Nagios Remote Data Processor.

Nagios Cross-Platform Agent (NCPA) is an open source API-based agent for Windows, Linux and macOS. NCPA primarily performs low-level checks, including processor, memory, disk, process, service and network utilization. NCPA supports active checks, wherein the Nagios server queries the agent on demand and waits for a response, as well as passive checks, wherein the agent sends results to the Nagios server on its own schedule. This Nagios agent has a web-based GUI and a graphing API to visualize statistical data.
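Because NCPA is API-based, an active check amounts to an HTTPS request against the agent. The sketch below only builds such a request URL and does not send it. It assumes NCPA's default port of 5693, an `/api/` path prefix and a `token` query parameter; the host name, token and threshold values are placeholders, and the `ncpa_url` helper is hypothetical.

```python
from urllib.parse import urlencode


def ncpa_url(host, metric, token, port=5693, **params):
    """Build the URL for one NCPA API query (assumed URL layout)."""
    query = urlencode({"token": token, **params})
    return f"https://{host}:{port}/api/{metric}?{query}"


# Placeholder host and token; thresholds are passed as extra query parameters.
print(ncpa_url("win01.example.com", "cpu/percent", "secret", warning=80, critical=90))
```

In practice, the check_ncpa plug-in on the Nagios server issues requests like this one, so administrators rarely build the URL by hand.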

NSClient++ is a full-featured open source monitoring agent that bears similarities to NCPA. It answers on-demand requests for low-level metrics, such as processor and memory utilization; delivers metrics from monitored systems as they are collected; and helps with troubleshooting and other issue identification and resolution. The agent works with Nagios, NetEye, Opsview and other monitoring tools.

Nagios Remote Plugin Executor (NRPE) is a basic agent that supports Nagios plug-in execution on remote host systems. NRPE runs in the background on the remote host and looks for requests from the Nagios server. Once the request arrives, NRPE runs the intended task and returns the results to the Nagios server.
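The request-and-response pattern described here can be modeled in a few lines. This is a toy illustration of the idea, not the NRPE wire protocol or its configuration format: the agent keeps a table of named commands, runs the one the server asks for and returns the exit code and output. The `COMMANDS` table and the `handle_request` function are hypothetical.

```python
import subprocess
import sys

# Hypothetical command table: a request name mapped to a local command line.
# A real agent would point these entries at installed plug-ins.
COMMANDS = {
    "check_demo": [sys.executable, "-c", "print('OK - demo check passed')"],
}


def handle_request(name, commands=COMMANDS, timeout=10):
    """Run the named command and return (exit_code, first output text)."""
    if name not in commands:
        # Exit code 3 is the conventional UNKNOWN state for plug-ins.
        return 3, f"UNKNOWN - command '{name}' not defined"
    try:
        done = subprocess.run(
            commands[name], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return 3, f"UNKNOWN - {exc}"
    return done.returncode, done.stdout.strip()


print(handle_request("check_demo"))
print(handle_request("check_missing"))
```

Restricting requests to a predefined table means the server can only trigger checks that the remote host's administrator has explicitly allowed.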

Nagios Remote Data Processor (NRDP) offers a versatile data transport mechanism and processor for Nagios. NRDP enables remote agents, applications and other Nagios servers to submit commands and host or service check results to the Nagios server over HTTP. Administrators often use NRDP as an alternative to Nagios Service Check Acceptor (NSCA) to implement distributed monitoring, passive checks and remote control of a monitoring environment.
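As a rough illustration of a passive check submission, the sketch below assembles the form fields a client might post to an NRDP endpoint. It assumes the XML check-result layout and the `token`, `cmd` and `XMLDATA` field names used by NRDP's bundled client scripts; the host, service and token values are placeholders, the helper names are hypothetical, and nothing is sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_nrdp_form(token, hostname, servicename, state, output):
    """Return the form fields for one passive service check result."""
    root = ET.Element("checkresults")
    # checktype 1 marks the result as passive (assumed NRDP convention).
    result = ET.SubElement(root, "checkresult", {"type": "service", "checktype": "1"})
    ET.SubElement(result, "hostname").text = hostname
    ET.SubElement(result, "servicename").text = servicename
    ET.SubElement(result, "state").text = str(state)
    ET.SubElement(result, "output").text = output
    xml = "<?xml version='1.0'?>" + ET.tostring(root, encoding="unicode")
    return {"token": token, "cmd": "submitcheck", "XMLDATA": xml}


def encode_body(fields):
    """URL-encode the fields as an HTTP POST body."""
    return urlencode(fields).encode("ascii")


fields = build_nrdp_form("token123", "win01", "CPU Usage", 1, "WARNING - cpu at 85%")
print(fields["XMLDATA"])
```

A client would then post the encoded body to the NRDP URL on the Nagios server, which validates the token and hands the result to Nagios as a passive check.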

Use Nagios plug-ins

The plug-in architecture in the Nagios monitoring tool enables it to work across varied systems, services, OSes and other heterogeneous infrastructure components.
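What makes this architecture portable is a small contract: a plug-in prints one line of status text, optionally followed by a pipe character and performance data, and signals its state through the exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The sketch below shows that contract with a hypothetical disk usage value supplied on the command line; the function names, label and thresholds are illustrative rather than taken from any shipped plug-in.

```python
# Conventional Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a state using simple upper thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(label, value, warn, crit, unit="%"):
    """Return (state, output line) with performance data after the pipe."""
    state = evaluate(value, warn, crit)
    text = f"{STATE_NAMES[state]} - {label} is {value}{unit}"
    perfdata = f"{label}={value}{unit};{warn};{crit}"
    return state, f"{text} | {perfdata}"


def main(argv):
    """Print the status line and return the exit code for the plug-in."""
    try:
        measured = float(argv[0])
    except (IndexError, ValueError):
        print("UNKNOWN - usage: check_example.py <percent>")
        return UNKNOWN
    state, line = format_output("disk_used", measured, 80, 90)
    print(line)
    return state
```

A real plug-in would finish with `sys.exit(main(sys.argv[1:]))` so that Nagios, or an agent such as NRPE, can read the state from the process exit code.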

Plug-ins often accompany agents, and typically, agents rely on corresponding plug-ins installed on the Nagios server. For example, the NRPE agent runs under control of the check_nrpe plug-in and the NSClient++ agent might rely on the check_nt plug-in to collect information from Windows systems.

Plug-ins also serve myriad other purposes to enable or enhance the Nagios monitoring tool's capabilities in areas such as cloud and hardware support, backup and recovery, network protocols, OSes, remote access, clustering, high availability and web servers. For example, Windows administrators can set up a Nagios plug-in to check on their Active Directory domain controller services, look for expired certificates or monitor other areas. Similarly, network administrators can rely on a Nagios network plug-in to check port status, spanning tree protocol, traffic and error data from network interfaces.

Nagios users can obtain plug-ins and other enhancements from Nagios Exchange.

Basic monitoring setups

The Nagios monitoring tool is designed with a centralized approach for data collection and processing. Because every check result flows to the Nagios server, the server's capacity and the network bandwidth available to it determine how far a deployment can scale.

Configuring Nagios can also demand considerable effort. For example, using Nagios to monitor a Windows system starts with installing and configuring an agent, such as NCPA, on the target Windows system. If administrators prefer active checks, they must install the check_ncpa plug-in on the Nagios server, create the command and service definitions that monitor the Windows system, and restart Nagios so the new configuration takes effect. If they prefer passive checks, they set up NRDP on the Nagios server to receive results, add the corresponding checks to the NCPA configuration on the target system and configure the agent's NRDP settings. They then repeat this process for each similar Windows system that Nagios will monitor.
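Because the same definitions repeat for every similar Windows system, some teams generate them from a list of hosts instead of writing them by hand. The sketch below renders Nagios object definitions for active NCPA checks. It assumes the `windows-server` and `generic-service` templates from the sample configuration and a `check_ncpa` command that passes its argument string through to check_ncpa.py; the host names, addresses, token and thresholds are placeholders, and the helper functions are hypothetical.

```python
def render_object(kind, fields):
    """Render one Nagios object definition block."""
    width = max(len(key) for key in fields) + 2
    lines = [f"define {kind} {{"]
    lines += [f"    {key.ljust(width)}{value}" for key, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


def render_windows_host(name, address, token):
    """Render a host plus one CPU service that uses check_ncpa."""
    host = render_object(
        "host",
        {"use": "windows-server", "host_name": name, "address": address},
    )
    cpu = render_object(
        "service",
        {
            "use": "generic-service",
            "host_name": name,
            "service_description": "CPU Usage",
            # Assumed argument string for check_ncpa.py: token, metric, thresholds.
            "check_command": f"check_ncpa!-t '{token}' -M cpu/percent -w 80 -c 90",
        },
    )
    return host + "\n\n" + cpu


def render_all(hosts, token):
    """Render definitions for every host in a name-to-address mapping."""
    return "\n\n".join(render_windows_host(n, a, token) for n, a in hosts.items())


print(render_all({"win01": "10.0.0.11", "win02": "10.0.0.12"}, "secret"))
```

The output would be saved to a configuration file that Nagios loads, followed by a configuration check and a restart of the Nagios service.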

Monitoring a Linux system with Nagios can be simpler, depending on the user's goals and preferences. For example, administrators can use Secure Shell (SSH) keys and the check_by_ssh plug-in to run plug-ins on remote Linux servers. However, establishing an SSH connection for each check adds considerable overhead and could reduce monitoring performance. It is usually easier to use the NRPE agent to monitor remote Linux systems, particularly for low-level metrics, such as processor, memory and disk utilization.
