Observability & Monitoring
This document covers all logging, log rotation, and the real-time monitoring stack for Kloyst staging.
Logging Architecture
Kloyst uses Winston with nest-winston integration. Every running service (API_GATEWAY, WEBHOOK, WORKER, OUTBOX) emits structured JSON logs.
Log Destinations
Each service writes to two destinations simultaneously:
- Container stdout: captured by Docker, available in Dozzle and in Grafana/Loki (via Promtail)
- Rotating log files: written to ./logs/ (mounted as a host volume at /srv/apps/kloyst-core/logs/)
Log Files
| File Pattern | Content | Level |
|---|---|---|
| system-YYYY-MM-DD.log | NestJS lifecycle events, auth events, service events | info + warn |
| exception-YYYY-MM-DD.log | All unhandled errors + stack traces | error |
| network-YYYY-MM-DD.log | Every HTTP request/response (method, path, status, duration) | all |
| debug-YYYY-MM-DD.log | Verbose debug output, written only when NODE_ENV=development | debug |
All files use gzip compression on rotation.
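When reading these files directly on the host, a small helper along the following lines may be convenient. This is a sketch only: the function names are hypothetical, and it assumes the host path given above, that rotated archives keep the original file name plus a suffix (for example .gz), and that file dates follow the server's local date.

```bash
# Sketch: locate and read Kloyst log files on the host.
# Assumption: LOG_DIR defaults to the host volume path documented above.
LOG_DIR="${LOG_DIR:-/srv/apps/kloyst-core/logs}"

# Build the path for a log type (system|exception|network|debug) and a date.
# The date defaults to today in the server's local time zone (an assumption).
log_file_for() {
  local type="$1" day="${2:-$(date +%F)}"
  printf '%s/%s-%s.log\n' "$LOG_DIR" "$type" "$day"
}

# Print every line for one type and day, including rotated parts.
# gzip -cdf decompresses .gz files and passes plain files through unchanged.
read_logs() {
  local base f
  base="$(log_file_for "$1" "${2:-}")"
  for f in "$base"*; do
    [ -e "$f" ] && gzip -cdf -- "$f"
  done
  return 0
}
```

For example, `read_logs exception 2026-05-22 | grep CampaignService` would search one day's exception log, compressed or not.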
Log Rotation Configuration
Rotation is controlled entirely by environment variables, so no code change is needed:
LOG_MAX_FILES=14d # Delete logs older than 14 days (staging value)
LOG_MAX_SIZE=20m # Rotate when a single file reaches 20 MB
Recommended values by environment:
| Environment | LOG_MAX_FILES | LOG_MAX_SIZE |
|---|---|---|
| Local dev | 3d | 5m |
| Staging | 14d | 20m |
| Production | 30d | 50m |
Host-level logrotate (additional safety net)
A logrotate config at /etc/logrotate.d/kloyst-staging runs nightly via cron to hard-delete any logs older than 30 days and compress uncompressed files.
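The logrotate file itself is not reproduced here. As a rough illustration of what the safety net achieves, the sketch below performs the equivalent two steps with find. It is not the actual config: the function name, the *.log* file pattern, and the one-day threshold for compressing are all assumptions.

```bash
# Sketch of the safety net's effect, not the real logrotate config.
# Assumptions: log files match *.log*, and "uncompressed" means a file
# that does not end in .gz and has not been written to for a day.
prune_and_compress() {
  local dir="$1" max_days="${2:-30}"
  # Hard-delete anything last modified more than max_days days ago.
  find "$dir" -type f -name '*.log*' -mtime +"$max_days" -delete
  # Compress files at least one day old that are not yet gzipped,
  # leaving today's active files alone.
  find "$dir" -type f -name '*.log*' ! -name '*.gz' -mtime +0 -exec gzip -f {} +
}
```

A call such as `prune_and_compress /srv/apps/kloyst-core/logs 30` would mirror the 30-day limit described above.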
Log Format (Structured JSON)
Every log line is a JSON object:
{
"timestamp": "2026-05-22T10:00:00.000Z",
"level": "info",
"context": "AuthService",
"message": "User logged in successfully",
"userId": "uuid-here",
"vendorId": "uuid-here"
}
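Because each entry is a JSON object, log files can be filtered by field from the shell. The sketch below is one minimal way to do that with grep; the function name is hypothetical, and it assumes each entry is written on a single line (the example above is pretty-printed for readability). A JSON-aware tool would be more robust for anything beyond quick checks.

```bash
# Sketch: pull entries of one level out of a JSON-lines log stream.
# Assumption: each entry sits on one line and has a string "level" field.
filter_level() {
  local level="$1"
  # Tolerate optional whitespace around the colon.
  grep -E "\"level\"[[:space:]]*:[[:space:]]*\"${level}\""
}
```

For example, `filter_level warn < system-2026-05-22.log` would print only the warn entries from that file.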
Monitoring Stack
The monitoring stack runs on the same server, managed by a separate Docker Compose file: monitoring/docker-compose.monitoring.yml.
Services
| Service | Tool | URL | Purpose |
|---|---|---|---|
| Log Aggregation | Grafana Loki 2.9.8 | internal :3100 | Stores and indexes all logs |
| Log Collection | Promtail 2.9.8 | internal | Tails Docker + file logs → ships to Loki |
| Dashboards | Grafana 10.4.3 | https://logs.staging.kloyst.com | Query, stream, and alert on logs |
| Live Tail | Dozzle | https://dozzle.staging.kloyst.com | Real-time Docker container log viewer |
Log Sources Promtail Collects
| Source | Labels | Notes |
|---|---|---|
| Docker containers (all) | container, service, project | Auto-discovered via Docker socket |
| kloyst-core/logs/system-*.log | log_type=system | Winston system events |
| kloyst-core/logs/exception-*.log | log_type=exception, level=error | Errors + stack traces |
| kloyst-core/logs/network-*.log | log_type=network | HTTP request logs |
| /var/log/nginx/*access*.log | job=nginx, log_type=access | Nginx request logs |
| /var/log/nginx/*error*.log | job=nginx, log_type=error | Nginx error logs |
Log Retention
Loki is configured to auto-delete logs older than 30 days (retention_period: 720h in loki-config.yml).
Grafana Usage
First Login
- Go to https://logs.staging.kloyst.com
- Login: admin / <GRAFANA_PASSWORD from .env>
- Loki is pre-wired as the default datasource (auto-provisioned at startup)
- Navigate to Explore → Select Loki → start querying
Key LogQL Queries
# All kloyst staging logs (real-time)
{project="kloyst-staging"}
# API Gateway logs only
{container="kloyst-staging-api-gateway"}
# All error-level logs
{project="kloyst-staging", level="error"}
# Exception file logs only
{log_type="exception"}
# HTTP network logs
{log_type="network"}
# Logs from a specific NestJS context (e.g., CampaignService)
{project="kloyst-staging"} |= "CampaignService"
# Nginx 5xx errors
{job="nginx", log_type="access"} |~ "\" 5[0-9]{2} "
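# Error rate (entries per second, averaged over a 1-minute window)
rate({project="kloyst-staging", level="error"}[1m])
# Error count per minute
count_over_time({project="kloyst-staging", level="error"}[1m])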
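The same queries can also be run from the server's shell through Loki's HTTP API, which may be handy when Grafana is unavailable. The sketch below wraps the query_range endpoint; the function name and the LOKI_URL variable are hypothetical, and it assumes Loki answers on http://localhost:3100 as in the health check further down.

```bash
# Sketch: run a LogQL query against Loki's HTTP API from the server.
# Assumption: Loki is reachable at http://localhost:3100.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

loki_query() {
  local query="$1" limit="${2:-20}"
  # -G sends the url-encoded parameters as a GET query string,
  # which avoids hand-escaping the braces and quotes in LogQL.
  curl -sS -G "$LOKI_URL/loki/api/v1/query_range" \
    --data-urlencode "query=$query" \
    --data-urlencode "limit=$limit"
}
```

For example, `loki_query '{log_type="exception"}' 5` would return up to five recent exception entries as JSON. With no start or end parameter, Loki should default to the last hour.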
Dozzle Usage
Best for: Live debugging during a deployment, or when you want to see logs from a specific container without writing LogQL.
- Go to https://dozzle.staging.kloyst.com
- Login: admin / <DOZZLE_PASSWORD from .env>
- Click any container → live tail starts immediately
- Use the Group view to watch all kloyst-staging-* containers simultaneously
Deploying the Monitoring Stack
# First time
cd /srv/apps/kloyst-core/monitoring
cp .env.example .env
nano .env # Set GRAFANA_PASSWORD and DOZZLE_PASSWORD
# Get SSL certs for monitoring domains
sudo certbot certonly --nginx \
  -d logs.staging.kloyst.com \
  -d dozzle.staging.kloyst.com
# Symlink Nginx config
sudo ln -sf /srv/apps/kloyst-core/monitoring/nginx/monitoring.conf \
  /etc/nginx/auto-sites/kloyst-monitoring.conf
sudo nginx -t && sudo systemctl reload nginx
# Start monitoring
docker compose -f docker-compose.monitoring.yml up -d
# Updates
docker compose -f docker-compose.monitoring.yml pull
docker compose -f docker-compose.monitoring.yml up -d
docker image prune -f
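Before the first start, it can help to confirm that both passwords were actually set in .env. The following is a minimal sketch of such a pre-flight check; the function name is hypothetical, and it assumes .env uses plain KEY=value lines.

```bash
# Sketch: confirm the monitoring .env sets both passwords before starting.
# Assumption: .env uses plain KEY=value lines without "export" prefixes.
check_monitoring_env() {
  local env_file="${1:-.env}" key missing=0
  for key in GRAFANA_PASSWORD DOZZLE_PASSWORD; do
    # Require the key at line start followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be chained in front of the start command, for example `check_monitoring_env .env && docker compose -f docker-compose.monitoring.yml up -d`.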
Checking Monitoring Health
# Loki ready?
curl http://localhost:3100/ready
# Promtail sending logs?
curl -s http://localhost:9080/metrics | grep promtail_sent_entries_total
# All containers running?
docker compose -f docker-compose.monitoring.yml ps
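Right after a start or update, Loki may take a little while to report ready, so a single curl can fail spuriously. The sketch below polls the /ready endpoint with retries; the function name, retry count, and delay are arbitrary assumptions, and it presumes port 3100 is reachable from the host as in the command above.

```bash
# Sketch: poll Loki's /ready endpoint until it responds or retries run out.
# Assumptions: port 3100 is reachable from the host; defaults are arbitrary.
wait_for_loki() {
  local url="${1:-http://localhost:3100/ready}" tries="${2:-30}" delay="${3:-2}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503 during startup.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "Loki ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Loki not ready after $tries attempts" >&2
  return 1
}
```

Running `wait_for_loki` with no arguments would try for roughly a minute and return a non-zero status on failure, which makes it usable in deploy scripts.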