Observability & Monitoring

This document covers logging, log rotation, and the real-time monitoring stack for the Kloyst staging environment.


Logging Architecture

Kloyst uses Winston with nest-winston integration. Every running service (API_GATEWAY, WEBHOOK, WORKER, OUTBOX) emits structured JSON logs.

Log Destinations

Each service writes to two destinations simultaneously:

  1. Container stdout: captured by Docker and available in Dozzle and in Grafana/Loki (via Promtail)
  2. Rotating log files: written to ./logs/ (mounted as a host volume at /srv/apps/kloyst-core/logs/)

Log Files

| File Pattern | Content | Level |
| --- | --- | --- |
| system-YYYY-MM-DD.log | NestJS lifecycle events, auth events, service events | info + warn |
| exception-YYYY-MM-DD.log | All unhandled errors + stack traces | error |
| network-YYYY-MM-DD.log | Every HTTP request/response (method, path, status, duration) | all |
| debug-YYYY-MM-DD.log | Verbose debug output; only written when NODE_ENV=development | debug |

All files use gzip compression on rotation.
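
The file table above can be read as a routing rule from log level to file. The sketch below encodes that rule as two pure functions. It illustrates the table only and is not the actual Winston transport configuration; the names `LogLevel`, `LogFileType`, `logFileName`, and `fileTypeForLevel` are hypothetical, and stamping file names with the UTC date is an assumption. network-*.log is keyed on HTTP traffic rather than level, so the level routing leaves it out.

```typescript
// Hypothetical helpers mirroring the log file table; not taken from the Kloyst codebase.
type LogLevel = "error" | "warn" | "info" | "debug";
type LogFileType = "system" | "exception" | "network" | "debug";

// Build a file name such as "system-2026-05-22.log".
// Assumption: the date stamp uses UTC; the real rotation may use local time.
function logFileName(type: LogFileType, date: Date): string {
  const stamp = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `${type}-${stamp}.log`;
}

// Map a log level to the level-based file that receives it, or null if it is dropped.
function fileTypeForLevel(level: LogLevel, nodeEnv: string): LogFileType | null {
  if (level === "error") {
    return "exception";
  }
  if (level === "warn" || level === "info") {
    return "system";
  }
  // Debug output is only written in development.
  return nodeEnv === "development" ? "debug" : null;
}
```

For example, `fileTypeForLevel("debug", "production")` returns `null`, matching the rule that debug files are only written in development.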


Log Rotation Configuration

Rotation is controlled entirely by environment variables, so no code change is needed. The staging values are:

LOG_MAX_FILES=14d # Delete logs older than 14 days
LOG_MAX_SIZE=20m # Rotate when a single file reaches 20 MB

| Environment | LOG_MAX_FILES | LOG_MAX_SIZE |
| --- | --- | --- |
| Local dev | 3d | 5m |
| Staging | 14d | 20m |
| Production | 30d | 50m |
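
To make the value format concrete, here is a minimal sketch of how strings such as `14d` and `20m` could be interpreted. It is not the parser Kloyst or Winston actually uses; the function names and the `Retention` type are hypothetical. It assumes that a `d` suffix means days, that a bare number for `LOG_MAX_FILES` means a file count, and that size suffixes are binary multiples (1m = 1024 × 1024 bytes).

```typescript
// Hypothetical parsers for the rotation variables; shown for illustration only.
type Retention = { kind: "days" | "count"; value: number };

// "14d" -> keep 14 days of logs; "10" -> keep 10 files (assumed meaning of a bare number).
function parseMaxFiles(value: string): Retention {
  const match = /^(\d+)(d)?$/i.exec(value.trim());
  if (!match) {
    throw new Error(`Invalid LOG_MAX_FILES value: ${value}`);
  }
  return { kind: match[2] ? "days" : "count", value: Number(match[1]) };
}

// "20m" -> size in bytes. Assumption: k/m/g are multiples of 1024.
function parseMaxSize(value: string): number {
  const match = /^(\d+)([kmg])?$/i.exec(value.trim());
  if (!match) {
    throw new Error(`Invalid LOG_MAX_SIZE value: ${value}`);
  }
  const amount = Number(match[1]);
  const unit = (match[2] || "").toLowerCase();
  if (unit === "k") return amount * 1024;
  if (unit === "m") return amount * 1024 * 1024;
  if (unit === "g") return amount * 1024 * 1024 * 1024;
  return amount; // no suffix: plain bytes
}
```

Validating these values at startup would surface a typo such as `LOG_MAX_SIZE=20mb` as an error rather than as silently misconfigured rotation.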

Host-level logrotate (additional safety net)

A logrotate config at /etc/logrotate.d/kloyst-staging runs nightly via cron to hard-delete any logs older than 30 days and compress uncompressed files.


Log Format (Structured JSON)

Every log line is a JSON object:

{
  "timestamp": "2026-05-22T10:00:00.000Z",
  "level": "info",
  "context": "AuthService",
  "message": "User logged in successfully",
  "userId": "uuid-here",
  "vendorId": "uuid-here"
}
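
Because every line is a self-contained JSON object, downstream tooling can parse logs line by line. The following is a minimal sketch of such a parser. The `KloystLogEntry` interface and `parseLogLine` function are hypothetical names, and treating `timestamp`, `level`, and `message` as required while everything else is optional is an assumption drawn from the sample above, not a documented schema.

```typescript
// Hypothetical shape inferred from the sample log line; extra fields such as
// userId or vendorId are kept but left untyped.
interface KloystLogEntry {
  timestamp: string;
  level: string;
  message: string;
  context?: string;
  [extra: string]: unknown;
}

// Parse one log line. Returns null for anything that is not a JSON object
// carrying string timestamp, level, and message fields.
function parseLogLine(line: string): KloystLogEntry | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // not valid JSON (for example, a partial line)
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }
  const record = parsed as Record<string, unknown>;
  if (
    typeof record["timestamp"] !== "string" ||
    typeof record["level"] !== "string" ||
    typeof record["message"] !== "string"
  ) {
    return null;
  }
  return record as KloystLogEntry;
}
```

Returning `null` instead of throwing lets a caller skip malformed lines while scanning a large file.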

Monitoring Stack

The monitoring stack runs on the same server, managed by a separate Docker Compose file: monitoring/docker-compose.monitoring.yml.

Services

| Service | Tool | URL | Purpose |
| --- | --- | --- | --- |
| Log Aggregation | Grafana Loki 2.9.8 | internal :3100 | Stores and indexes all logs |
| Log Collection | Promtail 2.9.8 | internal | Tails Docker + file logs → ships to Loki |
| Dashboards | Grafana 10.4.3 | https://logs.staging.kloyst.com | Query, stream, and alert on logs |
| Live Tail | Dozzle | https://dozzle.staging.kloyst.com | Real-time Docker container log viewer |

Log Sources Promtail Collects

| Source | Labels | Notes |
| --- | --- | --- |
| Docker containers (all) | container, service, project | Auto-discovered via Docker socket |
| kloyst-core/logs/system-*.log | log_type=system | Winston system events |
| kloyst-core/logs/exception-*.log | log_type=exception, level=error | Errors + stack traces |
| kloyst-core/logs/network-*.log | log_type=network | HTTP request logs |
| /var/log/nginx/*access*.log | job=nginx, log_type=access | Nginx request logs |
| /var/log/nginx/*error*.log | job=nginx, log_type=error | Nginx error logs |

Log Retention

Loki is configured to auto-delete logs older than 30 days (retention_period: 720h in loki-config.yml).


Grafana Usage

First Login

  1. Go to https://logs.staging.kloyst.com
  2. Login: admin / <GRAFANA_PASSWORD from .env>
  3. Loki is pre-wired as the default datasource (auto-provisioned at startup)
  4. Navigate to Explore → Select Loki → start querying

Key LogQL Queries

# All kloyst staging logs (real-time)
{project="kloyst-staging"}

# API Gateway logs only
{container="kloyst-staging-api-gateway"}

# All error-level logs
{project="kloyst-staging", level="error"}

# Exception file logs only
{log_type="exception"}

# HTTP network logs
{log_type="network"}

# Logs from a specific NestJS context (e.g., CampaignService)
{project="kloyst-staging"} |= "CampaignService"

# Nginx 5xx errors
{job="nginx", log_type="access"} |~ "\" 5[0-9]{2} "

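# Error rate (log lines per second, averaged over a 1-minute window;
# use count_over_time instead of rate for a per-minute count)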
rate({project="kloyst-staging", level="error"}[1m])
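
The same queries can be run outside Grafana through Loki's HTTP API, which exposes range queries at `/loki/api/v1/query_range`. The sketch below only builds the request URL; it performs no network call. `LokiRangeQuery` and `buildQueryRangeUrl` are hypothetical names, the base URL `http://localhost:3100` is an assumption taken from the health check in this document, and timestamps are sent as nanosecond Unix epochs.

```typescript
// Hypothetical helper for building a Loki range-query URL.
interface LokiRangeQuery {
  query: string; // a LogQL expression, e.g. {log_type="exception"}
  start: Date;
  end: Date;
  limit?: number; // maximum number of entries to return
}

// Convert a Date to a nanosecond Unix epoch string (millisecond precision padded with zeros).
function toNanoEpoch(date: Date): string {
  return String(date.getTime()) + "000000";
}

function buildQueryRangeUrl(baseUrl: string, q: LokiRangeQuery): string {
  const params: Array<[string, string]> = [
    ["query", q.query],
    ["start", toNanoEpoch(q.start)],
    ["end", toNanoEpoch(q.end)],
    ["limit", String(q.limit === undefined ? 100 : q.limit)],
  ];
  // encodeURIComponent escapes the braces, quotes, and equals signs in LogQL.
  const queryString = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl.replace(/\/+$/, "")}/loki/api/v1/query_range?${queryString}`;
}
```

The resulting URL can be passed to `curl` on the server, since Loki is only reachable internally on port 3100.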

Dozzle Usage

Best for: Live debugging during a deployment, or when you want to see logs from a specific container without writing LogQL.

  1. Go to https://dozzle.staging.kloyst.com
  2. Login: admin / <DOZZLE_PASSWORD from .env>
  3. Click any container → live tail starts immediately
  4. Use the Group view to watch all kloyst-staging-* containers simultaneously

Deploying the Monitoring Stack

# First time
cd /srv/apps/kloyst-core/monitoring
cp .env.example .env
nano .env # Set GRAFANA_PASSWORD and DOZZLE_PASSWORD

# Get SSL certs for monitoring domains
sudo certbot certonly --nginx \
  -d logs.staging.kloyst.com \
  -d dozzle.staging.kloyst.com

# Symlink Nginx config
sudo ln -sf /srv/apps/kloyst-core/monitoring/nginx/monitoring.conf \
  /etc/nginx/auto-sites/kloyst-monitoring.conf
sudo nginx -t && sudo systemctl reload nginx

# Start monitoring
docker compose -f docker-compose.monitoring.yml up -d

# Updates
docker compose -f docker-compose.monitoring.yml pull
docker compose -f docker-compose.monitoring.yml up -d
docker image prune -f

Checking Monitoring Health

# Loki ready?
curl http://localhost:3100/ready

# Promtail sending logs?
curl -s http://localhost:9080/metrics | grep promtail_sent_entries_total

# All containers running?
docker compose -f docker-compose.monitoring.yml ps
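
The Promtail check above relies on reading counter values by eye. The following sketch shows how the `/metrics` output, which uses the Prometheus text exposition format, could be summed programmatically. `sumMetric` is a hypothetical helper, and it assumes that label values contain no `}` followed by further text other than the sample value.

```typescript
// Hypothetical helper: sum all samples of one metric in Prometheus text format.
// Sample lines look like:  name{label="value"} 123   or   name 123
function sumMetric(metricsText: string, metricName: string): number {
  let total = 0;
  for (const rawLine of metricsText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and "# HELP" / "# TYPE" comment lines.
    if (line === "" || line.charAt(0) === "#") continue;

    // The metric name ends at the first "{" or whitespace character.
    const nameEnd = line.search(/[{\s]/);
    if (nameEnd === -1) continue; // no value on this line
    if (line.slice(0, nameEnd) !== metricName) continue;

    // The value follows the closing "}" if labels are present, else the name.
    const closeBrace = line.lastIndexOf("}");
    const rest = closeBrace === -1 ? line.slice(nameEnd) : line.slice(closeBrace + 1);
    const value = Number(rest.trim().split(/\s+/)[0]);
    if (!isNaN(value)) total += value;
  }
  return total;
}
```

Calling `sumMetric(text, "promtail_sent_entries_total")` twice a few seconds apart and comparing the results gives a simple signal: if the total is not increasing while services are logging, Promtail is probably not shipping entries to Loki.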