Service · Monitoring

See it before users open a ticket. Audit it after.

Prometheus + Grafana, Loki, Wazuh, Zabbix — the open observability stack we run for ourselves and for clients. On-call rotation you can audit, status page customers can subscribe to, post-incident reports inside 5 working days.

Talk to a monitoring engineer

Who this is for

What's included

Metrics (Prometheus + Grafana)

Hosted in your cloud or ours. Dashboards per service, alerts per SLO, 13-month retention default.

Logs (Loki / Wazuh / Splunk)

Centralised, indexed, with retention by class. Audit-grade for the Regulated tier.

Synthetic + RUM

Black-box checks of every public endpoint; real-user monitoring where it's justified.
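For illustration, a black-box check can be as small as the sketch below. It is not our production probe: the function names, the 2-second latency threshold and the rule that any status below 400 counts as healthy are all assumptions made for the example.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    url: str
    status: Optional[int]   # None when no HTTP response was received
    elapsed_s: float
    healthy: bool


def is_healthy(status: Optional[int], elapsed_s: float, max_latency_s: float = 2.0) -> bool:
    """Assumed rule: a 2xx/3xx answer that arrives within the latency threshold."""
    return status is not None and 200 <= status < 400 and elapsed_s <= max_latency_s


def probe(url: str, timeout_s: float = 5.0, max_latency_s: float = 2.0) -> CheckResult:
    """Fetch the URL from the outside, the way a user would, and time the response."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        status = err.code
    except OSError:
        # DNS failure, refused connection, timeout: no HTTP response at all.
        status = None
    elapsed = time.monotonic() - start
    return CheckResult(url, status, elapsed, is_healthy(status, elapsed, max_latency_s))


# Hypothetical usage (placeholder URL):
#   result = probe("https://example.com/health")
#   print(result.healthy, result.status, round(result.elapsed_s, 3))
```

Keeping the health rule in its own function means the threshold can be tuned per endpoint without touching the network code.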

On-call rotation

Pager rota integrated with PagerDuty / Opsgenie / Grafana OnCall. We can be the on-call team or share the rota with yours.

Status page

Public statuspage.io / Cachet / our own — whatever your customers expect. Subscribe via RSS / email / SMS.

Postmortems

Every P1 gets a postmortem within 5 working days. Blameless, structured, action-tracked.
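As a rough sketch of how the 5-working-day deadline can be computed, assuming working days are Monday to Friday, public holidays are ignored, and counting starts the day after the incident (all assumptions for this example; the function name is hypothetical):

```python
from datetime import date, timedelta


def postmortem_due(incident_day: date, working_days: int = 5) -> date:
    """Return the date that falls `working_days` working days after the incident."""
    due = incident_day
    remaining = working_days
    while remaining > 0:
        due += timedelta(days=1)
        if due.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return due


# A P1 on Wednesday 3 January 2024 is due the following Wednesday.
print(postmortem_due(date(2024, 1, 3)))  # 2024-01-10
```

A real rota would also need a holiday calendar for the relevant country, which this sketch leaves out.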

How we deliver

  1. Discovery — What do you currently watch? What was the last problem you only heard about from a customer?
  2. Design — SLO catalogue with targets, alerting policy, on-call rotation.
  3. Run — Stack live within 2 weeks. First on-call rotation in week 3.
  4. Optimise — Monthly SLO review. Postmortem actions tracked to closure.

Outcomes you can measure

99.95% · Service availability target
≤5 working days · Postmortem turnaround
13 months · Default metric retention

Tier-specific SLAs at /service-levels/.
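To put the 99.95% availability target in concrete terms, the sketch below converts a target into allowed downtime. The function name and the 30-day and 365-day windows are assumptions for illustration; the window actually used for a given tier is defined in its SLA.

```python
def allowed_downtime_minutes(target_percent: float, window_days: int) -> float:
    """Error budget: the share of the window that may be unavailable, in minutes."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - target_percent) / 100


for days in (30, 365):
    budget = allowed_downtime_minutes(99.95, days)
    print(f"{days}-day window: {budget:.1f} minutes of downtime allowed")
```

Under these assumptions, a 30-day window leaves roughly 21.6 minutes of downtime, and a full year roughly 262.8 minutes (a little under four and a half hours).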

Tech stack we run

Standard pieces; we'll work with what you have if you prefer.

Prometheus · Grafana · Loki · Wazuh · Zabbix · Sentry · PagerDuty · Grafana OnCall

Ready to talk monitoring?

30-minute discovery call. No slide deck.

Book a consultation