- Hardened cron endpoints with coordination and auth improvements
- Added API guards and input validation layer
- Security observability and secrets health checks
- Monitoring types and service improvements
- PDF URL validation and newsletter unsubscribe security
- Unit tests for security-critical paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SSE stream: detect client disconnect via request.signal to stop
polling loop (prevents wasted DB queries after tab close)
- AlertEvaluator: split shouldFire/recordFired so cooldown is only
recorded after successful dispatch (prevents alert suppression
on dispatch failure)
- SnapshotCollector: cache payload instance (avoid re-importing on
every 60s tick)
- Alert acknowledge: validate alertId type (string|number)
- Logs search: add 300ms debounce to prevent query-per-keystroke
- Replace remaining `any` cast with Record<string, unknown>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The fire-and-forget dynamic import chain (3 awaits) was racing with
test flush timeouts. Caching the resolved payload instance fixes both
the flakiness and eliminates per-call import overhead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Periodic metric collection running in the queue-worker PM2 process.
Collects system metrics every 60s (configurable), stores them in
MonitoringSnapshots, and evaluates alert rules against each snapshot.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fire-and-forget logger that writes to the monitoring-logs collection
with log level filtering via MONITORING_LOG_LEVEL env var. Falls back
to console output when Payload is not yet initialized.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements checkSystemHealth (CPU, memory, disk, load), service checks
(Redis, PostgreSQL, PgBouncer, SMTP, queues, OAuth, cron), and the
collectMetrics aggregator that gathers all metrics in parallel.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>