Top Icinga frequently asked interview questions

Icinga2 object ApiUser is unknown

I need help understanding why I'm seeing an error. The api feature is already enabled with the correct ApiListener object, and the API logs are being updated in /var/lib/icinga2/api/log/current.

But I'm getting this error when I restart icinga2:

Error: Error while evaluating expression: The type 'ApiUser' is unknown: in /etc/icinga2/conf.d/api-users.conf: 1:0-1:20

I'm running version r2.3.10-1 of Icinga2 on Ubuntu.

Can someone explain what the problem is?
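
One thing worth checking is the installed version. The sketch below is a minimal illustration, assuming (this is not stated in the question) that the ApiUser object type only arrived together with the REST API in Icinga 2.4. The names parse_icinga_version and API_USER_MIN_VERSION are hypothetical and exist only for this example.

```python
import re

def parse_icinga_version(text):
    """Extract (major, minor, patch) from strings like 'r2.3.10-1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())

# Assumption: ApiUser became available with the REST API in Icinga 2.4.
API_USER_MIN_VERSION = (2, 4, 0)

installed = parse_icinga_version("r2.3.10-1")
print(installed >= API_USER_MIN_VERSION)
```

For the r2.3.10-1 release mentioned above this prints False, which would be consistent with the type being unknown to that release.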

Source: (StackOverflow)

Memory fault(coredump) during Icinga (1.x) startup

I am getting this error during Icinga startup:

/pkgs/icinga/1.13.3.rhas5/bin/icinga  /my-config/dit-icinga-app-master/config/icinga.cfg

Icinga 1.13.3
Copyright (c) 2009-2015 Icinga Development Team (http://www.icinga.org)
Copyright (c) 2009-2013 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 07-15-2015
License: GPL

Icinga 1.13.3 starting... (PID=30807)
Local time is Fri Feb 05 16:45:17 EST 2016
idomod: IDOMOD 1.13.3 (07-15-2015) Copyright(c) 2005-2008 Ethan Galstad, Copyright(c) 2009-2015 Icinga Development Team (https://www.icinga.org)
idomod: Successfully connected to data sink.  0 queued items to flush.
Memory fault(coredump)

The log file, icinga.log, does not reveal much:

[1454708717] Icinga 1.13.3 starting... (PID=30807)
[1454708717] Local time is Fri Feb 05 16:45:17 EST 2016
[1454708717] LOG VERSION: 2.0
[1454708717] idomod: IDOMOD 1.13.3 (07-15-2015) Copyright(c) 2005-2008 Ethan Galstad, Copyright(c) 2009-2015 Icinga Development Team (https://www.icinga.org)
[1454708717] idomod: Successfully connected to data sink.  0 queued items to flush.
[1454708717] Event broker module 'IDOMOD' version '1.13.3' from '/pkgs/icinga_idoutils_libdbi_mysql/1.13.3/lib64/icinga/idomod.so' initialized successfully.
[1454708718] Event loop started...

Icinga pre-flight check looks OK, so there is no issue with any of the configuration files.

Also, the MySQL database is running on the node, and I can see some data being inserted into it, which also shows up in the ido2db.debug logs.

Where can I get more logs? Does anybody have any leads? I appreciate your help.

Source: (StackOverflow)

How to extract strings between curly braces from a file

I have a Nagios/Icinga object file with contents similar to the example below, but with many more entries. How do I extract objects of one kind, for example only the host objects or only the service objects, using bash or Python?

define host{          ## extract including the "define host {....}"
    use             generic-switch
    host_name       bras ;
    alias           bras-gw.example.com;
    address         20.94.66.88
    hostgroups      bgp;
}

define host{    ## extract including the "define host {....} define host {....} "
    use             generic-switch
    host_name       ar1 ;
    alias           ar1.example.com;
    address         22.98.66.244
    hostgroups      bgp;
}

define servicegroup {
   servicegroup_name Premium
   alias Premium-BGP
}
define service {
   host_name                ar0
   service_description      Get-Speed- BGP-INTL dsdf34
   check_command            check_bgp!secreat!10.10.40.44
   check_interval           1
   use                      generic-service
   notification_interval    0 ; set > 0 if you want to be re-notified
}

define service {
   host_name                ar10
   service_description      Get-Speed- BGP-INTL rrdf34
   check_command            check_bgp!secreat!10.10.40.77
   check_interval           1
   use                      generic-service
   notification_interval    0 ; set > 0 if you want to be re-notified
   check_period             24x7
   notification_period      24x7
   contact_groups           p2p,l2,system2,admins
   use                      generic-service
   max_check_attempts       3
   notification_options     c,r
}

The goal is to extract specific host or service objects from the file, e.g.:

define host{
    use             generic-switch
    host_name       ar0 ;
    alias           id6.example.net;
    address         20.24.6.22
    hostgroups      bgp;
}
define host{
    use             generic-switch
    host_name       bras ;
    alias           bras-gw.abc.com.dp;
    address         202.33.66.254
    hostgroups      bgp;
}
define host{
    use             generic-switch
    host_name       ar1 ;
    alias           ar1.abc.com;
    address         20.94.66.44
    hostgroups      bgp;
}

Answer (provided by @ritesht93): sed -nr '/.*(\bhost\b|\bservice\b).*\{/,/\}/ p' datafile
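
Since the question also asks about Python, here is a minimal sketch of the same idea. It assumes every object ends with a closing brace on a line of its own, as in the sample file, so braces inside trailing comments do not end a block early. The function name extract_objects and the sample text are made up for illustration.

```python
import re

# A block starts at "define <kind> {" and ends at the first line that
# contains nothing but a closing brace.
BLOCK = re.compile(
    r"^[ \t]*define[ \t]+(\w+)[ \t]*\{.*?^[ \t]*\}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def extract_objects(text, kinds=("host", "service")):
    """Return the text of every 'define <kind> { ... }' block in kinds."""
    return [m.group(0).strip() for m in BLOCK.finditer(text) if m.group(1) in kinds]

sample = """\
define host{          ## comment containing {....}
    use             generic-switch
    host_name       bras ;
}
define servicegroup {
   servicegroup_name Premium
}
 define service {
   host_name        ar10
   check_interval   1
}
"""

for block in extract_objects(sample, kinds=("host",)):
    print(block)
```

Passing kinds=("service",) returns only the service objects, and the default returns both hosts and services while skipping servicegroup.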

Source: (StackOverflow)

How can we provide multiple values for a single argument in either services.conf or commands.conf

I am trying to use a plugin to check whether a service is running, whether any warning or critical action is required, and to collect its performance data at the same time.

We have used the plugin below to check whether a server is alive and to read its performance data from JSON: https://github.com/drewkerrigan/nagios-http-json

I am trying to read the JSON file shown below, which is hosted at http://localhost:8080/sample.json

The plugin works perfectly on the command line and shows me all the available metrics.

$:/usr/lib/nagios/plugins$ ./check_http_json.py -H localhost:8080 -p sample.json -m metrics.etp_count metrics.atc_count

OK: Status OK.|'metrics.etp_count'=101 'metrics.atc_count'=0

But when I try the same in the Icinga2 configuration, it doesn't show these performance metrics. It doesn't give any error, but it doesn't show any values either.

The JSON, commands.conf and services.conf are as follows.

{
  "metrics": {
    "etp_count": "0",
    "atc_count": "101",
    "mean_time": -1.0
  }
}

Below are my commands.conf and services.conf

commands.conf

/* Json Read Command */
object CheckCommand "json_check" {
    import "plugin-check-command"
    command = [PluginDir + "/check_http_json.py"]
    arguments = {
        "-H" = "$server_port$"
        "-p" = "$json_path$"
        "-w" = "$warning_value$"
        "-c" = "$critical_value$"
        "-m" = "$Metrics1$,$Metrics2$"
    }
}

services.conf

apply Service "json"{
        import "generic-service"

        check_command = "json_check"
        vars.server_port="localhost:8080"
        vars.json_path="sample.json"
        vars.warning_value="metrics.etp_count,1:100"
        vars.critical_value="metrics.etp_count,101:1000"
        vars.Metrics1="metrics.etp_count"
        vars.Metrics2="metrics.atc_count"

        assign where host.name == NodeName
}

Does anyone have any idea how we can pass multiple values in commands.conf and services.conf?
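
One difference between the two runs is visible in the question itself. The working command line passes the metric names as two separate tokens, while "$Metrics1$,$Metrics2$" expands to a single comma-joined token. The sketch below shows why that can matter. It assumes the plugin parses -m roughly like the hypothetical argparse definition here, which is not taken from the plugin's source.

```python
import argparse

parser = argparse.ArgumentParser()
# Assumption: -m accepts several separate values, as the working
# command line in the question suggests.
parser.add_argument("-m", dest="metrics", nargs="*", default=[])

# What the shell invocation passes: two separate tokens.
separate = parser.parse_args(["-m", "metrics.etp_count", "metrics.atc_count"])

# What a comma-joined macro string passes: one token.
joined = parser.parse_args(["-m", "metrics.etp_count,metrics.atc_count"])

print(separate.metrics)  # two entries
print(joined.metrics)    # one entry that still contains the comma
```

If this is the cause, the values have to reach the plugin as separate arguments. Icinga2 command arguments can take array-valued custom variables for that purpose, so the documentation for the installed version is worth checking for how arrays are expanded.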

Source: (StackOverflow)

Puppet template does not find module function

I'm trying to use icinga2's puppet module which defines a custom function and a template where it is used. I'm using the following (stripped) hiera configuration:

icinga2::object::host:
  host.com:
    target_file_name: host.conf
    display_name: host.com
    ipv4_address: XXX
    vars:
      os: Linux

The template renders completely fine without vars, but when vars is included, Puppet fails to evaluate the function call in the template:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Evaluation Error: Error while evaluating a Function Call, Failed to parse template icinga2/object_host.conf.erb:
  Filepath: org/jruby/RubyKernel.java
  Line: 1072
  Detail: Could not autoload puppet/parser/functions/icinga2_config_value: no such file to load -- puppet/icinga2/utils
 at /etc/puppetlabs/code/environments/production/modules/icinga2/manifests/object/host.pp:71:18 on node XXX

Puppet also finds and executes the function just fine when it is called directly in an inline template:

root@puppetmaster:~# /opt/puppetlabs/bin/puppet apply -e "notice(inline_template(\"<%= scope.function_icinga2_config_value([[1,2]]) %>\"))"
Notice: Scope(Class[main]): [
    "1",
    "2",
  ]

I've also found some bugs (1, 2) that point in a similar direction, but they were fixed years ago and the suggested workarounds do not work either. I'm using the very recent version 4.2.1.

Any idea how to further debug this issue or to fix it in the icinga2 module?

Source: (StackOverflow)

nrpe : Network server bind failure (98: Address already in use)

I have installed Icinga and NRPE on the same machine. I am using NRPE to monitor many Linux machines, so I installed NRPE locally as well.

When I start NRPE locally with service nrpe start, it shows an error like this in /var/log/messages:

nrpe : Network server bind failure (98: Address already in use)

I googled the issue and checked what is using port 5666:

[root@cosrh6-74 conf.d]# netstat -apn | grep :5666
tcp        0      0 127.0.0.1:50539           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:50608           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:41987           10.104.16.210:5666          TIME_WAIT   -
tcp        0      1 127.0.0.1:42001           10.104.16.210:5666          SYN_SENT    -
tcp        0      0 127.0.0.1:50576           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:41927           10.104.16.210:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52598           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:52624           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:41962           10.104.16.210:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:41979           10.104.16.210:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52566           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:41928           10.104.16.210:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52569           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:41955           10.104.16.210:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52587           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:50586           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:50547           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52588           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 127.0.0.1:50609           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:50567           10.104.16.212:5666          TIME_WAIT   -
tcp        0      0 127.0.0.1:52592           10.3.81.172:5666            TIME_WAIT   -
tcp        0      0 :::5666                     :::*                        LISTEN      757/xinetd

I have changed the port in /etc/nagios/nrpe.cfg from 5666 to 56666.

How can I configure a different port per host in the host configuration on the Icinga2 server, so that it can monitor machines with NRPE running on different ports?

Is changing the port the right approach, or is there another way to do this? Please correct me if I did anything wrong.
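
In the netstat output above, the only socket in LISTEN state on port 5666 belongs to xinetd (PID 757). That suggests NRPE may already be served through xinetd on this machine, though the question does not confirm it. A minimal sketch for picking such lines out of a long listing follows; listeners_on_port is a hypothetical helper, and the column layout is assumed to match the output shown in the question.

```python
def listeners_on_port(netstat_output, port):
    """Return the PID/program column of LISTEN sockets bound to port."""
    owners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local foreign state pid/program
        if (
            len(fields) >= 7
            and fields[5] == "LISTEN"
            and fields[3].endswith(f":{port}")
        ):
            owners.append(fields[6])
    return owners

sample = """\
tcp        0      0 127.0.0.1:50539           10.104.16.212:5666          TIME_WAIT   -
tcp        0      1 127.0.0.1:42001           10.104.16.210:5666          SYN_SENT    -
tcp        0      0 :::5666                     :::*                        LISTEN      757/xinetd
"""

print(listeners_on_port(sample, 5666))  # ['757/xinetd']
```

The TIME_WAIT and SYN_SENT lines are outgoing connections to remote NRPE daemons, so they are ignored here.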

Source: (StackOverflow)

Custom plugin check icinga2

I am a beginner in using Icinga and Nagios for server management. I set up Icinga on a machine and configured all the basics. The next thing I tried was to check whether certain services were running on ports 8080, 8081 and 8082. I wrote a quick Python script for that and placed it at /usr/local/lib/myscript.py. Then I created a command in /etc/nagios-plugins/config/testone.cfg. My command looks like this:

define command{
        command_name    check_restarts
        command_line    python /usr/local/lib/myscript.py -w 3 -c 5 -p 8080
        command_line    python /usr/local/lib/myscript.py -w 3 -c 5 -p 8081
        command_line    python /usr/local/lib/myscript.py -w 3 -c 5 -p 8082
        }

I then added a service to /etc/icinga2/conf.d/services.conf. But this leads to an error when I restart Icinga: the UI shows the message "Backend icinga not running", and the errors point to services.conf when I run sudo service icinga2 status.

Can anyone please guide me through these steps?
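
Two things may be worth checking here. A Nagios-style command definition is meant to carry one command_line, and Icinga2 reads its own object syntax rather than define command blocks. On the plugin side, the sketch below is a minimal port check that reports its result through the usual plugin exit codes. It is not the asker's myscript.py, and the names check_port and run are hypothetical.

```python
import socket

# Standard plugin exit codes: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2

def check_port(host, port, timeout=3.0):
    """Try a TCP connect and return (exit_code, status_line)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - port {port} is accepting connections"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - port {port} is not reachable ({exc})"

def run(ports, host="127.0.0.1"):
    """Check each port once, print one status line each, worst code wins."""
    results = [check_port(host, port) for port in ports]
    for _, line in results:
        print(line)
    return max((code for code, _ in results), default=OK)

# A real plugin would finish with: sys.exit(run([8080]))
```

Calling the script once per port, each from its own service, keeps the status of 8080, 8081 and 8082 separate in the UI.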

Source: (StackOverflow)

Monitor failover cluster roles with Icinga2

I'm using Icinga2 with NSClient++

I have a PowerShell check for certain cluster roles which is installed on every cluster node. Should a cluster role fail, all cluster nodes would send out identical notifications which will result in a lot of spam for just one actual service problem.

Installing the check on only one cluster node is not an option, as it would create a single point of failure for role monitoring: a failing cluster node should not affect the cluster roles (aside from a short timeout), but I would not be able to check any cluster role as soon as that node is down.

Is it possible to assign a service to a hostgroup in a way that only one notification will be sent if this service fails?

Source: (StackOverflow)

Icinga passive check setup

Icinga is already set up, with active checks running within AWS EC2.

But now there is a need to monitor a dozen nodes (non-EC2) that are behind a proxy and exposed through a single public IP. All nodes are running Ubuntu.

I learnt that passive checks are the solution. I tried to set them up using NSCA-ng instead of NSCA, as suggested by the Icinga docs.

I found a write-up online and followed it, but got stuck because the test check is not responding.

Being a novice, what I'm missing is a step-by-step guide to setting up NSCA-ng, with an example.

Any pointers in this regard are appreciated.
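
As a starting point for testing, it can help to know what a submitted result looks like. The sketch below builds one service check result in the tab-separated form that the send_nsca client reads from standard input: host, service, return code, plugin output. The host and service names are made up, and format_service_result is a hypothetical helper.

```python
VALID_CODES = (0, 1, 2, 3)  # OK, WARNING, CRITICAL, UNKNOWN

def format_service_result(host, service, return_code, output, delimiter="\t"):
    """Build one passive service check result line."""
    if return_code not in VALID_CODES:
        raise ValueError("return code must be 0, 1, 2 or 3")
    return delimiter.join([host, service, str(return_code), output]) + "\n"

# Hypothetical host and service names, for illustration only.
line = format_service_result("node01", "disk", 0, "DISK OK - 42% used")
print(repr(line))
```

The host and service names have to match objects that already exist on the Icinga side, otherwise the result has nothing to attach to.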

Source: (StackOverflow)

Dynamically list VMs while checking VMware vCenter with Icinga2

I'm using check_vmware_esx.pl to check vCenter: https://github.com/BaldMansMojo/check_vmware_esx

This plugin can check every VM on the vCenter if a list of VMs is given in the host declaration:

object Host "scdvh2" {
    address = ".."
    import ".."
    vars.vmware_vmname = [ "vm1", "vm2", "vm3" ]
}

The service is:

apply Service "soap-vm-io" for (vmname in host.vars.vmware_vmname) {
    import "esx-service"
    display_name = vmname
    check_command = "vmware-esx-soap-vm-io"
    vars.vmware_vmname = vmname
    assign where "esx" in host.groups
}

It would be useful if the list vars.vmware_vmname were generated before each "soap-vm-io" check. I don't know how to begin doing that and store the result in a macro or a Python shelve.

Thanks for your help.
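
One possible direction is to generate the list outside Icinga2 and write it into the host definition before a reload. The sketch below covers only the rendering step and assumes plain ASCII VM names. How the names are fetched from vCenter is left open, and render_vm_list is a hypothetical helper.

```python
import json

def render_vm_list(vm_names):
    """Render the vars.vmware_vmname line for an Icinga2 host object."""
    # json.dumps gives double-quoted, escaped strings, which for plain
    # ASCII names look the same as Icinga2 string literals.
    quoted = ", ".join(json.dumps(name) for name in vm_names)
    return f"vars.vmware_vmname = [ {quoted} ]"

print(render_vm_list(["vm1", "vm2", "vm3"]))
# vars.vmware_vmname = [ "vm1", "vm2", "vm3" ]
```

Because apply rules are evaluated when the configuration is loaded, a freshly generated list would only take effect after Icinga2 reloads its configuration.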

Source: (StackOverflow)

Executing check_nrpe from Icinga2

I am trying to execute the NRPE plugin from my Icinga server like this:

/usr/local/nagios/libexec/check_nrpe -H <host> -c \
'nrpe_check_traffic_status' -a '2' '3'  -p <port>

I added some print statements to the plugin; this is the result:

>>opt>> -w  >> arg 2
>>opt>> -c  >> arg -p                   ### THIS LINE IS ERROR ###
Threshold values should be numerical

It is not executed properly: it sends -p as the second argument to the remote NRPE instead of 3.

But the same command works when I give the arguments in this order:

/usr/local/nagios/libexec/check_nrpe -H <host> -c \
'nrpe_check_traffic_status' -p <port> -a '2' '3'

Result

>>opt>> -w  >> arg 2
>>opt>> -c  >> arg 3
TRAFFIC STATUS OK;

Has anyone faced this issue? Is there a solution for it? Or is there a way to change the argument position in the Icinga2 configuration?

Note: I have tried moving the argument parameter up and down in the commands.conf file, to no avail.
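
The two runs above suggest that everything following -a is forwarded to the remote command, so -a and its values need to come last. A minimal sketch of building the command line in that order follows. The function name build_check_nrpe_argv and the example address are hypothetical, and the binary path is the one from the question.

```python
def build_check_nrpe_argv(host, port, command, args,
                          binary="/usr/local/nagios/libexec/check_nrpe"):
    """Return an argv list with -a and its values placed last."""
    argv = [binary, "-H", host, "-p", str(port), "-c", command]
    if args:
        # Everything after -a is treated as an argument for the remote command.
        argv += ["-a"] + [str(a) for a in args]
    return argv

argv = build_check_nrpe_argv("192.0.2.10", 5666, "nrpe_check_traffic_status", [2, 3])
print(" ".join(argv))
```

In an Icinga2 CheckCommand, the order attribute of an argument may serve the same purpose; whether it is available depends on the installed version, so the documentation is worth checking.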

Source: (StackOverflow)

Custom threshold for each host on Icinga2

I just installed Icinga2 2.3.2 and am trying to find a way to set different process thresholds for each host. Any help?

Source: (StackOverflow)

Monitor CoreOS with icinga

I need to monitor a CoreOS cluster which is used to host a Kubernetes cluster on top of it. I use Heapster to monitor the Kubernetes cluster.

Now I need to monitor the CoreOS minions using Icinga/Nagios. Is there any way to do so?

Thanks

Source: (StackOverflow)

AD authentication for Icinga

Hi, I am trying to authenticate Icinga (on CentOS) against AD in the same VPC, but every attempt has failed. Is it not possible to use AD authentication for Icinga? Please help.

Source: (StackOverflow)

Migration from Icinga 1.x (frontend) with Centreon (backend) to Icinga2

Friends! I have Icinga 1.9 as the frontend and Centreon 2.1.13 as the backend. The setup jointly uses icinga.cfg and the other Icinga .cfg files, and it is managed via Centreon through its Nagios configuration files export. So I have a question: is it possible to migrate to the Icinga2 platform and export all the data? What should my first steps be?

Source: (StackOverflow)