Monitoring and alerting

Monitoring is one of the most important requirements for a secure and reliable service or system. Combined with load balancing (scaling), backups, firewalling and encryption, monitoring helps ensure a stable setup with near-constant uptime, and can help predict and prevent most failures and outages ahead of time.

Monitoring and alerting can be as simple as observing server resources (system load, RAM, disk and CPU usage), which some cloud providers offer out of the box. Sometimes, however, more granular metrics are required, such as:

I/O (disk) operations
network traffic and load
per process resource usage and uptime
per protocol or application custom/dedicated metrics
last activity or value for a metric
expiration or outage alerting
and so on
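
As an illustration of the basic resource checks above, the following is a minimal sketch of threshold-based alerting in Python using only the standard library. The threshold values and the function names (check_disk, check_load, check_staleness, collect_alerts) are assumptions made for this example, not part of any particular monitoring product, and os.getloadavg() is only available on Unix-like systems.

```python
import os
import shutil

# Hypothetical thresholds, chosen for illustration only.
DISK_USED_PERCENT_MAX = 90.0
LOAD_PER_CPU_MAX = 1.5
STALE_AFTER_SECONDS = 300


def check_disk(used_percent, limit=DISK_USED_PERCENT_MAX):
    """Return an alert message if disk usage exceeds the limit, else None."""
    if used_percent > limit:
        return f"disk usage {used_percent:.1f}% exceeds {limit:.1f}%"
    return None


def check_load(load_1min, cpu_count, limit=LOAD_PER_CPU_MAX):
    """Return an alert message if the 1-minute load per CPU exceeds the limit."""
    per_cpu = load_1min / max(cpu_count, 1)
    if per_cpu > limit:
        return f"load per CPU {per_cpu:.2f} exceeds {limit:.2f}"
    return None


def check_staleness(last_seen, now, limit=STALE_AFTER_SECONDS):
    """Return an alert message if a metric has not been updated recently."""
    age = now - last_seen
    if age > limit:
        return f"metric last updated {age:.0f}s ago (limit {limit}s)"
    return None


def collect_alerts(path="/"):
    """Read current disk usage and system load and return triggered alerts."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    load_1min = os.getloadavg()[0]  # Unix-like systems only
    alerts = [
        check_disk(used_percent),
        check_load(load_1min, os.cpu_count() or 1),
    ]
    return [alert for alert in alerts if alert is not None]


if __name__ == "__main__":
    for alert in collect_alerts():
        print("ALERT:", alert)
```

In practice a dedicated monitoring system would evaluate rules like these on a schedule and route the resulting alerts to e-mail or chat; the sketch only shows the comparison logic.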

Properly set up monitoring and alerting policies and rules can help plan new solutions and solve issues ahead of time, with minimal or no downtime. Monitoring is essential for keeping systems and services operational and end clients happy.

Our expertise

The monitoring solution we recommend depends on the technologies used and the business logic implemented. We have experience with both simple and complex systems, and with standard and custom cases and solutions such as:

resource monitoring (CPU, RAM, I/O, network and so on)
system processes and services monitoring and alerting
backup success/failure monitoring
different monitoring solutions (Nagios, Icinga2, Prometheus, Alertmanager, Grafana and so on)
monitoring of specific services or cases (e.g. EMQx, HAProxy, Let's Encrypt/TLS certificate expiration)
scheduled tasks execution success/failure monitoring
pending security and non-critical updates alerting
single host and multiple copies (replicated) setups
and more
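
To make one of the cases above concrete, here is a minimal sketch of a TLS certificate expiration check in Python, using only the standard library. The warning window WARN_DAYS and the function names (days_until_expiry, fetch_not_after, expiry_alert) are assumptions for this example; it is not a description of how any of the tools listed above implement the check.

```python
import socket
import ssl
import time

WARN_DAYS = 14  # hypothetical warning window, chosen for illustration


def days_until_expiry(not_after, now=None):
    """Days left until a certificate's 'notAfter' time.

    not_after uses the format returned by getpeercert(),
    for example 'Jan  5 09:34:43 2030 GMT'.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400.0


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port and return the peer certificate's 'notAfter'."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def expiry_alert(host, not_after, now=None, warn_days=WARN_DAYS):
    """Return an alert message if the certificate expired or expires soon."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return f"certificate for {host} expired {-remaining:.1f} days ago"
    if remaining < warn_days:
        return f"certificate for {host} expires in {remaining:.1f} days"
    return None
```

A scheduled job could call fetch_not_after() for each monitored host and pass the result to expiry_alert(); keeping the network call separate from the date arithmetic makes the alerting logic easy to test offline.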

Do you need monitoring and alerting set up?

Read more about how we proceed with consultations.