Icinga interview questions
Top frequently asked Icinga interview questions
I need help understanding why I'm seeing an error.
The feature api is already enabled with the correct ApiListener object, and API logs are being updated in /var/lib/icinga2/api/log/current.
But I'm getting this error when I restart icinga2:
Error: Error while evaluating expression: The type 'ApiUser' is unknown: in /etc/icinga2/conf.d/api-users.conf: 1:0-1:20
I'm running version r2.3.10-1 of Icinga2 on Ubuntu.
Can someone explain what the problem is?
Source: (StackOverflow)
I am getting this error during Icinga startup:
/pkgs/icinga/1.13.3.rhas5/bin/icinga /my-config/dit-icinga-app-master/config/icinga.cfg
Icinga 1.13.3
Copyright (c) 2009-2015 Icinga Development Team (http://www.icinga.org)
Copyright (c) 2009-2013 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 07-15-2015
License: GPL
Icinga 1.13.3 starting... (PID=30807)
Local time is Fri Feb 05 16:45:17 EST 2016
idomod: IDOMOD 1.13.3 (07-15-2015) Copyright(c) 2005-2008 Ethan Galstad, Copyright(c) 2009-2015 Icinga Development Team (https://www.icinga.org)
idomod: Successfully connected to data sink. 0 queued items to flush.
Memory fault(coredump)
The log file does not reveal much:
icinga.log
[1454708717] Icinga 1.13.3 starting... (PID=30807)
[1454708717] Local time is Fri Feb 05 16:45:17 EST 2016
[1454708717] LOG VERSION: 2.0
[1454708717] idomod: IDOMOD 1.13.3 (07-15-2015) Copyright(c) 2005-2008 Ethan Galstad, Copyright(c) 2009-2015 Icinga Development Team (https://www.icinga.org)
[1454708717] idomod: Successfully connected to data sink. 0 queued items to flush.
[1454708717] Event broker module 'IDOMOD' version '1.13.3' from '/pkgs/icinga_idoutils_libdbi_mysql/1.13.3/lib64/icinga/idomod.so' initialized successfully.
[1454708718] Event loop started...
icinga.log (END)
The Icinga pre-flight check looks OK, so there is no issue with any of the configuration files.
Also, the MySQL database is running on the node, and I can see data being inserted into it, which the ido2db.debug logs also show.
Where can I get more logs? Does anybody have any leads? I appreciate your help.
Source: (StackOverflow)
I have a Nagios/Icinga object file with contents similar to the sample below, but with many more entries. How do I extract objects of one kind, for example only the "host objects" or the "service objects", using bash or Python?
define host{ ## extract including the "define host {....}"
use generic-switch
host_name bras ;
alias bras-gw.example.com;
address 20.94.66.88
hostgroups bgp;
}
define host{ ## extract including the "define host {....} define host {....} "
use generic-switch
host_name ar1 ;
alias ar1.example.com;
address 22.98.66.244
hostgroups bgp;
}
define servicegroup {
servicegroup_name Premium
alias Premium-BGP
}
define service {
host_name ar0
service_description Get-Speed- BGP-INTL dsdf34
check_command check_bgp!secreat!10.10.40.44
check_interval 1
use generic-service
notification_interval 0 ; set > 0 if you want to be re-notified
}
define service {
host_name ar10
service_description Get-Speed- BGP-INTL rrdf34
check_command check_bgp!secreat!10.10.40.77
check_interval 1
use generic-service
notification_interval 0 ; set > 0 if you want to be re-notified
check_period 24x7
notification_period 24x7
contact_groups p2p,l2,system2,admins
use generic-service
max_check_attempts 3
notification_options c,r
}
The goal is to extract specific host or service objects from the file, e.g.:
define host{
use generic-switch
host_name ar0 ;
alias id6.example.net;
address 20.24.6.22
hostgroups bgp;
}
define host{
use generic-switch
host_name bras ;
alias bras-gw.abc.com.dp;
address 202.33.66.254
hostgroups bgp;
}
define host{
use generic-switch
host_name ar1 ;
alias ar1.abc.com;
address 20.94.66.44
hostgroups bgp;
}
Ans: sed -nr '/.*(\bhost\b|\bservice\b).*\{/,/\}/ p' datafile
as provided by @ritesht93
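Since the question also asks about Python, the same idea can be sketched as a small line-based filter. This is a minimal sketch, not a full Nagios parser: it assumes every object starts on a line beginning with "define <type> {" and ends on a line that begins with "}". The function name extract_objects and its kinds parameter are made up for illustration.

```python
import re

# A block starts on a line such as "define host{" or "define service {".
START = re.compile(r"^\s*define\s+(\w+)\s*\{")

def extract_objects(lines, kinds=("host", "service")):
    """Return the text of every 'define <kind> {...}' block whose kind is wanted."""
    blocks, current = [], None
    for line in lines:
        line = line.rstrip("\n")
        if current is None:
            match = START.match(line)
            # "servicegroup" is captured as a whole word, so it is not
            # mistaken for "service".
            if match and match.group(1) in kinds:
                current = [line]
        else:
            current.append(line)
            # The block ends on a line that starts with the closing brace;
            # braces inside trailing comments on other lines are ignored.
            if line.strip().startswith("}"):
                blocks.append("\n".join(current))
                current = None
    return blocks

sample = """define host{
use generic-switch
host_name bras ;
}
define servicegroup {
servicegroup_name Premium
}
define service {
host_name ar0
check_interval 1
}
"""

for block in extract_objects(sample.splitlines(), kinds=("host",)):
    print(block)
```

To run it against a real file, pass the open file object (for example open("datafile")) instead of sample.splitlines().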
Source: (StackOverflow)
I am trying to use a plugin to check whether a service is running, whether any warning or critical action is required, and to collect performance data at the same time.
We have used the plugin below to check whether a server is alive and to read its performance data from JSON:
https://github.com/drewkerrigan/nagios-http-json
I am trying to read the JSON file shown below, which is hosted at http://localhost:8080/sample.json
The plugin works perfectly on the command line; it shows me all the metrics available.
$:/usr/lib/nagios/plugins$ ./check_http_json.py -H localhost:8080 -p sample.json -m metrics.etp_count metrics.atc_count
OK: Status OK.|'metrics.etp_count'=101 'metrics.atc_count'=0
But when I try the same in the Icinga2 configuration, it doesn't show these performance metrics. It doesn't give any error, but it doesn't show any value either.
The JSON, commands.conf and services.conf are as follows.
{
  "metrics": {
    "etp_count": "0",
    "atc_count": "101",
    "mean_time": -1.0
  }
}
Below are my commands.conf and services.conf
commands.conf
/* Json Read Command */
object CheckCommand "json_check" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_http_json.py" ]
  arguments = {
    "-H" = "$server_port$"
    "-p" = "$json_path$"
    "-w" = "$warning_value$"
    "-c" = "$critical_value$"
    "-m" = "$Metrics1$,$Metrics2$"
  }
}
services.conf
apply Service "json" {
  import "generic-service"
  check_command = "json_check"
  vars.server_port = "localhost:8080"
  vars.json_path = "sample.json"
  vars.warning_value = "metrics.etp_count,1:100"
  vars.critical_value = "metrics.etp_count,101:1000"
  vars.Metrics1 = "metrics.etp_count"
  vars.Metrics2 = "metrics.atc_count"
  assign where host.name == NodeName
}
Does anyone have any idea how we can pass multiple values in commands.conf and services.conf?
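One difference between the two invocations is worth checking. On the command line the two metrics are passed as separate arguments, while "-m" = "$Metrics1$,$Metrics2$" hands the plugin a single comma-joined string. The sketch below shows how much that can matter. It assumes the plugin reads -m through Python's argparse with nargs="*"; that is an assumption about the plugin, and the parser here is only a hypothetical stand-in.

```python
import argparse

# Hypothetical stand-in for the plugin's option parser. How the real
# check_http_json.py declares -m is an assumption, not taken from its source.
parser = argparse.ArgumentParser()
parser.add_argument("-m", dest="metrics", nargs="*", default=[])

# What the shell invocation passes: two separate values after -m.
separate = parser.parse_args(["-m", "metrics.etp_count", "metrics.atc_count"])

# What "-m" = "$Metrics1$,$Metrics2$" passes: one comma-joined value.
joined = parser.parse_args(["-m", "metrics.etp_count,metrics.atc_count"])

print(separate.metrics)  # ['metrics.etp_count', 'metrics.atc_count']
print(joined.metrics)    # ['metrics.etp_count,metrics.atc_count']
```

If this turns out to be the cause, one option to explore is an array-valued custom variable together with the repeat_key attribute described in the Icinga 2 documentation on command arguments, so that each metric reaches the plugin as its own argument.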
Source: (StackOverflow)
I'm trying to use icinga2's puppet module which defines a custom function and a template where it is used. I'm using the following (stripped) hiera configuration:
icinga2::object::host:
  host.com:
    target_file_name: host.conf
    display_name: host.com
    ipv4_address: XXX
    vars:
      os: Linux
The template without vars renders completely fine, but when it is included, Puppet fails to evaluate the function call in the template:
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Evaluation Error: Error while evaluating a Function Call, Failed to parse template icinga2/object_host.conf.erb:
Filepath: org/jruby/RubyKernel.java
Line: 1072
Detail: Could not autoload puppet/parser/functions/icinga2_config_value: no such file to load -- puppet/icinga2/utils
at /etc/puppetlabs/code/environments/production/modules/icinga2/manifests/object/host.pp:71:18 on node XXX
Also puppet finds and executes the command just fine when called directly in an inline template:
root@puppetmaster:~# /opt/puppetlabs/bin/puppet apply -e "notice(inline_template(\"<%= scope.function_icinga2_config_value([[1,2]]) %>\"))"
Notice: Scope(Class[main]): [
"1",
"2",
]
I've also found some bugs (1, 2) that point in a similar direction, but they were fixed years ago and the suggested workarounds do not work either. I'm using the very recent version 4.2.1.
Any idea how to further debug this issue or to fix it in the icinga2 module?
Source: (StackOverflow)
I have installed Icinga and NRPE on the same machine. I am using NRPE to monitor many Linux machines, so I installed NRPE locally as well.
When I start NRPE locally with service nrpe start, it shows an error like this in /var/log/messages:
nrpe : Network server bind failure (98: Address already in use)
I googled the issue and looked at what is using port 5666:
[root@cosrh6-74 conf.d]# netstat -apn | grep :5666
tcp 0 0 127.0.0.1:50539 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:50608 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41987 10.104.16.210:5666 TIME_WAIT -
tcp 0 1 127.0.0.1:42001 10.104.16.210:5666 SYN_SENT -
tcp 0 0 127.0.0.1:50576 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41927 10.104.16.210:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52598 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52624 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41962 10.104.16.210:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41979 10.104.16.210:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52566 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41928 10.104.16.210:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52569 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:41955 10.104.16.210:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52587 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:50586 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:50547 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52588 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:50609 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:50567 10.104.16.212:5666 TIME_WAIT -
tcp 0 0 127.0.0.1:52592 10.3.81.172:5666 TIME_WAIT -
tcp 0 0 :::5666 :::* LISTEN 757/xinetd
I have changed the port in /etc/nagios/nrpe.cfg from 5666 to 56666.
How can I configure a different port per host in the Icinga2 server's host configuration, so that it can monitor machines whose NRPE runs on different ports?
Is changing the port the right thing to do, or is there another way? Please correct me if I did anything wrong.
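The last line of the netstat output above shows xinetd (PID 757) already listening on port 5666, which would explain the bind failure and may mean NRPE is already being served through xinetd. Before changing ports, it can help to confirm whether a given port can be bound at all. The helper below is a minimal sketch; port_is_free is a made-up name, not part of NRPE or Icinga.

```python
import errno
import socket

def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # errno 98 on Linux: the same "Address already in use" that
            # NRPE reports in /var/log/messages.
            if exc.errno == errno.EADDRINUSE:
                return False
            raise
        return True

print("5666 free:", port_is_free(5666))
```

TIME_WAIT entries for outgoing connections to remote port 5666, like most of the netstat lines above, do not occupy the local listening port; only the LISTEN entry does.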
Source: (StackOverflow)
I am a beginner in using Icinga and Nagios for server management. I set up Icinga on a machine and configured all the basics. Next, I tried to check whether certain services were running on ports 8080, 8081 and 8082. I wrote a quick Python script for that and placed it at /usr/local/lib/myscript.py. Then I created a command in /etc/nagios-plugins/config/testone.cfg. My command looks like this:
define command{
    command_name check_restarts
    command_line python /usr/local/lib/myscript.py -w 3 -c 5 -p 8080
    command_line python /usr/local/lib/myscript.py -w 3 -c 5 -p 8081
    command_line python /usr/local/lib/myscript.py -w 3 -c 5 -p 8082
}
I then added a service to /etc/icinga2/conf.d/services.conf. But this leads to an error when I restart Icinga: the UI shows the message "Backend icinga not running", and the errors point to services.conf when I run sudo service icinga2 status.
Can anyone please guide me through these steps?
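A Nagios-style command definition normally carries a single command_line, so three of them in one block are unlikely to all run. One way around that is to let the script itself take several ports in one invocation. The sketch below illustrates the idea; it is not the asker's myscript.py. The function names are hypothetical, and it only tests whether each port accepts a TCP connection, ignoring the -w/-c thresholds.

```python
import socket

OK, CRITICAL = 0, 2  # conventional Nagios/Icinga plugin exit codes

def port_open(port, host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_ports(ports, host="127.0.0.1"):
    """Check every port and return (exit_code, status_line)."""
    closed = [p for p in ports if not port_open(p, host)]
    if closed:
        return CRITICAL, "CRITICAL: no listener on port(s) " + ", ".join(map(str, closed))
    return OK, "OK: all ports reachable: " + ", ".join(map(str, ports))

status, message = check_ports([8080, 8081, 8082])
print(message)
# A real plugin would finish with sys.exit(status) so the monitoring
# core can read the result from the exit code.
```

With a script like this, one command_line covers all three ports. Also note that the files under /etc/icinga2/conf.d use the Icinga 2 configuration syntax rather than the define command{...} format shown above, which may be related to the error pointing at services.conf.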
Source: (StackOverflow)
I'm using Icinga2 with NSClient++. I have a PowerShell check for certain cluster roles, which is installed on every cluster node.
Should a cluster role fail, all cluster nodes would send out identical notifications, which will result in a lot of spam for just one actual service problem.
Installing the check on only one cluster node is not an option, as it would create a single point of failure for role monitoring: a failing cluster node should not affect the cluster roles (aside from a short timeout), but I would not be able to check any cluster role as soon as that node is down.
Is it possible to assign a service to a hostgroup in a way that only one notification will be sent if this service fails?
Source: (StackOverflow)
Icinga is already set up, with active checks running within AWS EC2.
Now there is a need to monitor a dozen nodes (non-EC2) that sit behind a proxy and expose a single public IP. All nodes run Ubuntu.
I learned that passive checks are the solution, and tried to set them up using NSCA-ng instead of NSCA, as suggested by the Icinga docs.
I found a write-up online and followed it, but got stuck because the test check is not responding.
As a novice, what I'm missing is a step-by-step guide to setting up NSCA-ng, with an example.
Any pointers in this regard are appreciated.
Source: (StackOverflow)
I'm using check_vmware_esx.pl to check vCenter: https://github.com/BaldMansMojo/check_vmware_esx
This plugin can check every VM on the vCenter if a list of VMs is given in the host declaration:
object Host "scdvh2" {
address = ".."
import ".."
vars.vmware_vmname = [ "vm1","vm2","vm3"]
}
The service is:
apply Service "soap-vm-io" for (vmname in host.vars.vmware_vmname) {
import "esx-service"
display_name = vmname
check_command = "vmware-esx-soap-vm-io"
vars.vmware_vmname = vmname
assign where "esx" in host.groups
}
It would be useful if the list vars.vmware_vmname were generated before each "soap-vm-io".
I don't know how to begin doing that, or how to store the result in a macro or a Python shelve.
Thanks for your help.
Source: (StackOverflow)
I am trying to execute an NRPE plugin from my Icinga server like this:
/usr/local/nagios/libexec/check_nrpe -H <host> -c \
'nrpe_check_traffic_status' -a '2' '3' -p <port>
I added some print statements to the plugin; this is the result:
>>opt>> -w >> arg 2
>>opt>> -c >> arg -p ### THIS LINE IS ERROR ###
Threshold values should be numerical
It is not executed properly: it sends -p as the second argument, instead of 3, to the remote NRPE.
But the same command works when I run it like this:
/usr/local/nagios/libexec/check_nrpe -H <host> -c \
'nrpe_check_traffic_status' -p <port> -a '2' '3'
Result
>>opt>> -w >> arg 2
>>opt>> -c >> arg 3
TRAFFIC STATUS OK;
Has anyone faced this issue? Is there a solution for it?
Or is there a way to change the argument position in the Icinga2 configuration?
Note: I have tried moving the argument parameters up and down in the commands.conf file, to no avail.
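The behaviour matches what the check_nrpe usage text says about -a: everything after it is passed to the remote command, so it has to be the last option on the command line. The helper below is a minimal sketch of building the invocation with that constraint in mind. build_check_nrpe is a hypothetical name, the binary path is the one from the question, and "remote-host" is a placeholder.

```python
def build_check_nrpe(host, command, port=None, args=(),
                     binary="/usr/local/nagios/libexec/check_nrpe"):
    """Build a check_nrpe argument vector with -a placed last."""
    argv = [binary, "-H", host]
    if port is not None:
        argv += ["-p", str(port)]
    argv += ["-c", command]
    if args:
        # Everything after -a is handed to the remote command, so nothing
        # else may follow it.
        argv += ["-a"] + [str(a) for a in args]
    return argv

print(" ".join(build_check_nrpe("remote-host", "nrpe_check_traffic_status",
                                port=5666, args=[2, 3])))
```

On the Icinga 2 side, the position of an entry inside the arguments dictionary is not what decides the final order. The order attribute documented for command arguments is the mechanism to look at for keeping -a at the end.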
Source: (StackOverflow)
I just installed Icinga2 2.3.2 and am trying to find a way to set a different process threshold for each host. Any help?
Source: (StackOverflow)
I need to monitor a CoreOS cluster that is used to host a Kubernetes cluster on top of it. I use Heapster to monitor the Kubernetes cluster.
Now I need to monitor the CoreOS minions using Icinga/Nagios. Is there any way to do so?
Thanks
Source: (StackOverflow)
Hi, I am trying to authenticate Icinga (on CentOS) against AD in the same VPC, but every attempt has failed. Is it not possible to use AD authentication for Icinga? Please help.
Source: (StackOverflow)
Friends!
I have Icinga 1.9 as the frontend and Centreon 2.1.13 as the backend. They jointly use icinga.cfg and the other Icinga .cfg files. The system is managed via Centreon through its Nagios configuration files export. So, I have a question: is it possible to migrate to the Icinga2 platform and export all of the data?
What should my first steps be?
Source: (StackOverflow)