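To monitor the internal state of a remote host, the NRPE plugin on the client collects the data, and the monitoring server retrieves it through check_nrpe.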
I. Configure the client
[root@server nrpe-2.12]# groupadd nagios
[root@server nrpe-2.12]# useradd -g nagios -s /sbin/nologin nagios
[root@server libexec]# chown nagios:nagios /usr/local/nagios
[root@server libexec]# chown -R nagios:nagios /usr/local/nagios/libexec
If /usr/local/nagios does not exist yet on a fresh host, run the two chown commands after the install steps below.
1. Install the NRPE plugin
[root@server ~]# tar zxvf nrpe-2.12.tar.gz
[root@server ~]# cd nrpe-2.12
[root@server nrpe-2.12]# ./configure
[root@server nrpe-2.12]# make all
[root@server nrpe-2.12]# make install-plugin
[root@server nrpe-2.12]# make install-daemon
[root@server nrpe-2.12]# make install-daemon-config
2. Install the Nagios plugins
[root@server ~]# tar zxvf nagios-plugins-1.4.14.tar.gz
[root@server ~]# cd nagios-plugins-1.4.14
[root@server nagios-plugins-1.4.14]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@server nagios-plugins-1.4.14]# make
[root@server nagios-plugins-1.4.14]# make install
3. Configure nrpe.cfg
[root@server libexec]# vim /usr/local/nagios/etc/nrpe.cfg
Add the IP address of the monitoring host to allowed_hosts:
allowed_hosts=127.0.0.1,192.168.30.100
4. Start the NRPE daemon
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Check that nrpe has started:
[root@server ~]# netstat -nultp | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 18929/nrpe
5. Test NRPE
[root@server ~]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.12
The expected response is the version of NRPE installed on the monitored host. If you see it, NRPE is working.
6. Define what to monitor
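To monitor a remote server, first define the checks on that server. For example, to monitor the number of logged-in users, CPU load, disk usage, and swap usage on a remote server, define the following in nrpe.cfg: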
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda5]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda5
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20 -c 10
The text in square brackets after command is the name that check_nrpe will refer to; you can choose any name. Note that check_swap reads thresholds without a % sign as bytes of free swap, so write -w 20% -c 10% if percentages are intended.
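The nrpe.cfg settings covered so far (allowed_hosts and the command[...] entries) can be inspected with a short shell sketch. The helper names list_nrpe_commands and host_allowed are hypothetical, not part of NRPE, and the sketch assumes a single allowed_hosts line with plain IP addresses and no inline comments.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of NRPE.

# Print the names defined as command[NAME]=... in an nrpe.cfg file.
# Commented-out lines (starting with '#') do not match and are ignored.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Succeed if IP address $2 is listed in allowed_hosts in file $1.
# Assumes one allowed_hosts line holding plain, comma-separated addresses.
host_allowed() {
    hosts=$(sed -n 's/^allowed_hosts=//p' "$1" | tr -d '[:space:]')
    case ",$hosts," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg prints one command name per line, and host_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.30.100 returns success when the monitoring host is allowed to connect.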
II. Configure the server
1. Install NRPE
[root@server ~]# tar zxvf nrpe-2.12.tar.gz
[root@server ~]# cd nrpe-2.12
[root@server nrpe-2.12]# ./configure
[root@server nrpe-2.12]# make all
[root@server nrpe-2.12]# make install-plugin
[root@server nrpe-2.12]# make install-daemon
[root@server nrpe-2.12]# make install-daemon-config
Check that check_nrpe can communicate with the client:
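[root@server ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.30.110
CHECK_NRPE: Error - Could not complete SSL handshake.
Note: the command reports an error here. To resolve it:
1). Check that the openssl and openssl-devel packages are installed:
[root@server ~]# rpm -qa | grep ssl
openssl-devel-1.0.0-20.el6_2.5.x86_64
openssl-1.0.0-20.el6_2.5.x86_64
2). Check that /usr/local/nagios/etc/nrpe.cfg on the client is configured correctly, in particular that allowed_hosts includes the monitoring host: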
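allowed_hosts=127.0.0.1,192.168.30.100
Restart nrpe on the client after changing this file. If check_nrpe then returns the NRPE version number, communication is working: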
[root@server nrpe-2.12]# /usr/local/nagios/libexec/check_nrpe -H 192.168.30.110
NRPE v2.12
2. Define a check_nrpe command
[root@server objects]# vim commands.cfg
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
3. Test the commands
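The default commands defined in nrpe.cfg can be used as they are. Before relying on them, test them to see whether their arguments need adjusting for your environment.
Run on the monitoring host:
[root@server objects]# /usr/local/nagios/libexec/check_nrpe -H 192.168.30.110 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0
4. Edit services.cfg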
[root@server ~]# vim /usr/local/nagios/etc/services.cfg
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_users
        check_command           check_nrpe!check_users
        max_check_attempts      5
        normal_check_interval   1
        }
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_load
        check_command           check_nrpe!check_load
        max_check_attempts      5
        normal_check_interval   1
        }
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_sda
        check_command           check_nrpe!check_sda5
        max_check_attempts      5
        normal_check_interval   1
        }
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_zombie_procs
        check_command           check_nrpe!check_zombie_procs
        max_check_attempts      5
        normal_check_interval   1
        }
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_total_procs
        check_command           check_nrpe!check_total_procs
        max_check_attempts      5
        normal_check_interval   1
        }
define service{
        use                     generic-service
        host_name               node-1
        service_description     check_swap
        check_command           check_nrpe!check_swap
        max_check_attempts      5
        normal_check_interval   1
        }
Note: the command names used here must first be defined in nrpe.cfg on the client, and they must match exactly!
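Because a mismatched name only shows up later as a failed check, it can help to compare the two files before restarting anything. The following is a minimal sketch, assuming each check_command directive sits on its own line and that a copy of the client's nrpe.cfg is available locally; the function name missing_nrpe_commands is hypothetical.

```shell
#!/bin/sh
# Sketch only: missing_nrpe_commands is a hypothetical helper.
# $1 = services.cfg on the monitoring host
# $2 = a copy of the client's nrpe.cfg
# Prints every NAME used as "check_nrpe!NAME" that has no
# matching "command[NAME]=" definition in nrpe.cfg.
missing_nrpe_commands() {
    services_cfg=$1
    nrpe_cfg=$2
    sed -n 's/^[[:space:]]*check_command[[:space:]]\{1,\}check_nrpe!\([^![:space:]]*\).*/\1/p' "$services_cfg" |
    sort -u |
    while read -r name; do
        # Definitions start at the beginning of a line; commented lines do not match.
        if ! grep -q "^command\[$name]=" "$nrpe_cfg"; then
            echo "$name"
        fi
    done
}
```

An empty result means every service refers to a command the client defines. A name such as check_sda in the output, when the client only defines check_sda5, points to the kind of mismatch the note above warns about.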
5. Restart nrpe and nagios
Client:
[root@client ~]# killall nrpe
[root@client ~]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Server:
Run nagios with -v to check the configuration for errors. If there are none, restart nagios:
[root@server ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
[root@server ~]# service nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
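The verify-then-restart sequence can be wrapped so that a failed configuration check never leads to a restart. This is a rough sketch that relies on the exit status of nagios -v; the function name safe_restart and the NAGIOS_BIN and NAGIOS_CFG variables are assumptions introduced for illustration, with defaults taken from the paths used above.

```shell
#!/bin/sh
# Sketch only: safe_restart and the two variables are hypothetical names.
: "${NAGIOS_BIN:=/usr/local/nagios/bin/nagios}"
: "${NAGIOS_CFG:=/usr/local/nagios/etc/nagios.cfg}"

# Restart nagios only when the configuration check exits successfully.
safe_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        service nagios restart
    else
        echo "configuration check failed; nagios not restarted" >&2
        return 1
    fi
}
```

Calling safe_restart as root on the monitoring host then replaces the two manual commands. When the check fails, rerun nagios -v by hand to read the reported errors.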
Reposted from: http://lyaxx.baihongyu.com/