My problem with Zabbix Docker container monitoring

Max Großmann
May 28, 2023


Problem

The default Zabbix template for container monitoring has two problems:

  • Offline containers are no longer discovered and do not raise an error (e.g. containers that have no restart policy configured in docker compose) ⇒ Errors do not appear where you would expect them
  • While new containers are still being set up, Zabbix already reports errors, which can confuse other team members ⇒ Errors appear where you don’t want them

Example

On the host there are currently 2 containers, mysql-prod and mysql-dev.

We assume that all containers are deployed via docker compose and that mysql-prod has no restart policy configured.

  • The mysql-prod container is mandatory, because it is used by other containers in production. After a restart of the host server the container doesn’t start automatically (due to the missing restart policy in docker-compose.yml). We would expect Zabbix to send an error notification so that we can fix the problem immediately. But because the container is not discovered anymore, in some cases no error notification is triggered.
  • The mysql-dev container is unimportant for monitoring purposes. Imagine we want to test a new version of the mysql image that causes errors on container start. This leads to an error notification from Zabbix, which might be confusing for admins, because it’s only dev.

Next, assume that we want to test a new service (e.g. grafana). We start the container, misconfigure something and immediately trigger an error notification, which is not the desired behaviour during initial setup. The notification might become appropriate later, once we move from setup to production and depend on the availability of the container.

The conceptual solution

We make use of the macros of the default docker monitoring template:

  • {$DOCKER.LLD.FILTER.CONTAINER.MATCHES} ⇒ Add the names of all containers that are important for production and should trigger an error when they are down
  • {$DOCKER.LLD.FILTER.CONTAINER.NOT_MATCHES} ⇒ Add the names of all dev containers that you don’t want to monitor

Create an item that makes sure all production containers are running:

  1. Read all container names from the macro {$DOCKER.LLD.FILTER.CONTAINER.MATCHES}
  2. Check whether each of these containers appears in the output of docker ps
  3. Create a trigger that raises an error if at least one required container is not running
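
A minimal sketch of these three steps in Python, not a finished Zabbix implementation. It assumes the macro value is a plain list of names separated by "|" (e.g. mysql-prod|grafana); the function names are hypothetical, and grafana is only an example name.

```python
import subprocess


def parse_names(macro_value):
    """Split a macro value like 'mysql-prod|grafana' into container names.

    Assumption: the macro holds plain names separated by '|',
    not an arbitrary regular expression.
    """
    return [part.strip() for part in macro_value.split("|") if part.strip()]


def running_container_names():
    """Return the names of all running containers as reported by docker ps."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def missing_required(required, running):
    """Return the required containers that are not running, sorted by name."""
    return sorted(set(required) - set(running))
```

Calling missing_required(parse_names(macro_value), running_container_names()) on the host returns the production containers that are down. One possible way to hand this to Zabbix is to let the item report the number of missing containers and have the trigger fire whenever that number is greater than 0.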

Create a discovery rule for undecided containers, i.e. running containers that are listed in neither macro:

  1. Scan for the names of running containers with docker ps
  2. Subtract the names in {$DOCKER.LLD.FILTER.CONTAINER.MATCHES} and {$DOCKER.LLD.FILTER.CONTAINER.NOT_MATCHES}
  3. Create a warning trigger for every undecided container
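
The discovery steps could look roughly like the following Python sketch. It assumes both macro values are regular expressions and that the list of running names has already been collected from docker ps. The function names and the LLD macro {#UNDECIDED.NAME} are hypothetical and only serve as an illustration.

```python
import json
import re


def undecided_containers(running, matches, not_matches):
    """Return running containers that match neither macro, sorted by name.

    matches / not_matches are the values of
    {$DOCKER.LLD.FILTER.CONTAINER.MATCHES} and
    {$DOCKER.LLD.FILTER.CONTAINER.NOT_MATCHES},
    assumed here to be regular expressions.
    """
    decided = [re.compile(matches), re.compile(not_matches)]
    return sorted(
        name for name in running
        if not any(pattern.search(name) for pattern in decided)
    )


def to_lld_json(names):
    """Build low-level discovery JSON with one entry per undecided container."""
    # {#UNDECIDED.NAME} is a made-up LLD macro name for this sketch.
    return json.dumps([{"{#UNDECIDED.NAME}": name} for name in names])
```

With the containers from the example and a freshly started grafana container, undecided_containers(["mysql-prod", "mysql-dev", "grafana"], "^mysql-prod$", "^mysql-dev$") leaves only grafana, so the discovery would create exactly one warning trigger.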

Why a trigger for every undecided container and not just a single trigger?

Each trigger can be suppressed separately until we decide whether the container will become a production container.

Implementation in Zabbix

I haven’t implemented my solution yet. I might add the implementation later on. Feel free to send screenshots to me (e.g. medium@maax.gr) if you have implemented my solution and I should add the screenshots here.

Have a nice day!

Max
