What's new on stats.iduck.xyz.
This page itself. The changelog is now sourced from a single markdown file
and rendered live so customers can see what's shipping without poking us
on Discord. Admin-only editor at /admin/changelog/edit saves a timestamped
backup on every write.
Probe agents now back off when the push endpoint returns 5xx or times out (1s → 2s → 4s … capped at 60s) instead of hammering the API every interval. Cuts gateway log noise during Duck Stats restart windows by ~95%.
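The schedule above can be sketched as a small generator. This is a minimal illustration, not the agent's actual code: the name backoff_delays and its parameters are hypothetical, and only the 1s start, the doubling, and the 60s cap come from the entry.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(base: float = 1.0, cap: float = 60.0) -> Iterator[float]:
    """Yield retry delays in seconds: base, 2*base, 4*base, ... never above cap."""
    delay = base
    while True:
        yield min(delay, cap)
        # Stop doubling once the cap is reached so the value cannot grow unbounded.
        if delay < cap:
            delay *= 2


# Delays an agent would wait after eight consecutive failed pushes.
first_eight = list(islice(backoff_delays(), 8))
```

An agent built this way would presumably create a fresh generator after the first successful push, so the next outage starts again at 1s; that reset behaviour is an assumption.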
Nginx in front of the API was caching upstream DNS at startup, so when
watchtower restarted the api container and Docker handed it a new IP,
the gateway kept routing to the dead address until manually reloaded.
Switched to resolver 127.0.0.11 valid=10s with per-location set $upstream
variables. No more manual gateway restarts after upgrades.
Each probe location now gets its own config_token for the public
/api/probe-config/<token> endpoint. Rotating one location's token no longer
disrupts any other location. Old shared-token configs were migrated automatically;
no action needed.
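A rough sketch of why per-location tokens make rotation safe. Everything here is illustrative: issue_token, rotate, the token length, and the in-memory dict are assumptions, not the real storage or token format.

```python
import secrets
from typing import Dict


def issue_token() -> str:
    """Return a fresh URL-safe token (hypothetical format and length)."""
    return secrets.token_urlsafe(24)


def rotate(tokens: Dict[str, str], location: str) -> Dict[str, str]:
    """Return a copy of tokens in which only `location` has a new token."""
    updated = dict(tokens)
    updated[location] = issue_token()
    return updated


# One token per location instead of a single shared one.
tokens = {name: issue_token() for name in ("frankfurt", "oracle-eu", "home-lab")}
rotated = rotate(tokens, "frankfurt")
```

After the rotation, only the frankfurt agent needs its new /api/probe-config/<token> URL; the other two keep working with the tokens they already have.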
You can now register multiple probe locations (e.g. frankfurt, oracle-eu,
home-lab) and assign HTTP/HTTPS/TCP/Ping targets to a subset of them. Each
location pulls its config from a public endpoint and pushes results back as
Duck Stats push monitors. Status page shows one row per (target × location)
so you can see exactly where a check is failing from.
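The (target × location) expansion can be pictured like this. The assignment mapping, the example targets, and status_rows are hypothetical; the real data model may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical assignment: each target lists the locations that probe it.
assignments: Dict[str, List[str]] = {
    "https://example.com": ["frankfurt", "oracle-eu"],
    "tcp://example.com:25565": ["home-lab"],
}


def status_rows(assigned: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return one (target, location) pair per status-page row."""
    return [(target, loc) for target, locs in assigned.items() for loc in locs]


rows = status_rows(assignments)
```

A target probed from two locations produces two rows, so a failure seen only from oracle-eu stays visible next to a healthy frankfurt row.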
The Servers page was doing one DB query per card to fetch the 60-minute CPU
sparkline (N+1 — 18 monitors = 18 queries). Replaced with a single
SELECT ... WHERE monitor_id IN (...) and bucketed in Python. Page load
on the fleet view dropped from ~1.4s to ~110ms.
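The shape of the fix, sketched against an in-memory SQLite table so it runs anywhere. The samples table, its columns, and cpu_sparklines are assumptions made for illustration; only the single IN (...) query followed by bucketing in Python comes from the entry.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List, Sequence


def cpu_sparklines(
    conn: sqlite3.Connection, monitor_ids: Sequence[int], since: int
) -> Dict[int, List[float]]:
    """Fetch CPU samples for all monitors in one query, then group per monitor."""
    if not monitor_ids:
        return {}
    placeholders = ",".join("?" for _ in monitor_ids)
    rows = conn.execute(
        f"SELECT monitor_id, cpu FROM samples "
        f"WHERE monitor_id IN ({placeholders}) AND ts >= ? ORDER BY ts",
        [*monitor_ids, since],
    )
    buckets: Dict[int, List[float]] = defaultdict(list)
    for monitor_id, cpu in rows:
        buckets[monitor_id].append(cpu)
    # Monitors with no samples in the window still get an (empty) sparkline.
    return {mid: buckets.get(mid, []) for mid in monitor_ids}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (monitor_id INTEGER, ts INTEGER, cpu REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1, 10, 5.0), (2, 10, 40.0), (1, 20, 7.5), (2, 1, 99.0)],
)
sparklines = cpu_sparklines(conn, [1, 2, 3], since=5)
```

The query count stays at one no matter how many cards the page renders, which is where the saving comes from.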
duck-discord-notifier was racing against Duck Stats's own notification
channel when a monitor flapped DOWN→UP→DOWN within the 90s Discord update
window. Unlinked individual monitors from the Discord notification channel
in Duck Stats; the status board embed is now the single source of truth for
public-facing alerts. Email + Pushover still fire as before.
Switched the 7-day heartbeat prune from DELETE WHERE time < ... to a
table swap (build a new table, swap names with the old one, drop the old
one). The old DELETE was holding
row locks long enough to delay live writes from the ingester during the
3am window. Prune now completes in ~4s instead of ~90s.
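One way a swap-style prune can work, shown with SQLite as a stand-in so the sketch is runnable. The table layout and prune_by_swap are assumptions, and the production job may order the steps differently. On MySQL the two renames could be a single RENAME TABLE statement, which swaps several names atomically.

```python
import sqlite3


def prune_by_swap(conn: sqlite3.Connection, cutoff: int) -> None:
    """Keep rows with time >= cutoff by rebuilding the table and swapping names."""
    with conn:  # commits on success, rolls back pending changes on error
        conn.execute(
            "CREATE TABLE heartbeat_new "
            "(monitor_id INTEGER, time INTEGER, status INTEGER)"
        )
        # Copy only the rows we want to keep; no row-by-row DELETE is needed.
        conn.execute(
            "INSERT INTO heartbeat_new SELECT * FROM heartbeat WHERE time >= ?",
            (cutoff,),
        )
        conn.execute("ALTER TABLE heartbeat RENAME TO heartbeat_old")
        conn.execute("ALTER TABLE heartbeat_new RENAME TO heartbeat")
        conn.execute("DROP TABLE heartbeat_old")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heartbeat (monitor_id INTEGER, time INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO heartbeat VALUES (?, ?, ?)",
    [(1, 100, 1), (1, 200, 1), (2, 300, 0)],
)
conn.commit()
prune_by_swap(conn, cutoff=200)
remaining = [row[0] for row in conn.execute("SELECT time FROM heartbeat ORDER BY time")]
```

This sketch ignores rows written between the copy and the swap; a real job has to account for those.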
The period column on the stats table was being written as 0/1/2
instead of the documented 1/2/3 (minutely/hourly/daily). Backfilled
historical rows; reads now correctly join across all three resolutions.
Daily rollups for May 2026 are now visible on the status page.
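The documented mapping and the off-by-one correction, as a small sketch. Period and fix_legacy_period are hypothetical names. The entry does not say how affected rows were identified, so the helper must only be applied to rows already known to use the old 0/1/2 scheme.

```python
from enum import IntEnum


class Period(IntEnum):
    """Documented values for the period column."""

    MINUTELY = 1
    HOURLY = 2
    DAILY = 3


def fix_legacy_period(raw: int) -> Period:
    """Map a value written under the buggy 0/1/2 scheme to the documented one.

    Raises ValueError for anything outside 0..2.
    """
    return Period(raw + 1)


corrected = [fix_legacy_period(value) for value in (0, 1, 2)]
```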
Stats and heartbeats have been migrated off MongoDB onto MySQL HeatWave running on Oracle Cloud (10.0.90.123). Queries against the 7-day heartbeat window are 5-12× faster. The MongoDB stats repository is kept as a fallback shim for one release and will be removed in v2.1.
FindUptimeSumByMonitorIDs now folds the per-monitor uptime sum into one
batched query. The public status page used to issue one query per monitor
(N+1 again); for the ducktv page that was 38 queries on every load.
Cold-load TTFB dropped from 1.8s to 240ms.
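A Python analogue of the batched lookup, this time letting the database do the summing with GROUP BY. The stats table, its uptime column, and the function below are illustrative assumptions; the real FindUptimeSumByMonitorIDs may be shaped differently.

```python
import sqlite3
from typing import Dict, Sequence


def find_uptime_sum_by_monitor_ids(
    conn: sqlite3.Connection, monitor_ids: Sequence[int]
) -> Dict[int, float]:
    """Return the uptime sum per monitor using a single grouped query."""
    if not monitor_ids:
        return {}
    placeholders = ",".join("?" * len(monitor_ids))
    rows = conn.execute(
        f"SELECT monitor_id, SUM(uptime) FROM stats "
        f"WHERE monitor_id IN ({placeholders}) GROUP BY monitor_id",
        list(monitor_ids),
    )
    sums = {mid: 0.0 for mid in monitor_ids}  # monitors without rows sum to 0
    sums.update(dict(rows))
    return sums


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (monitor_id INTEGER, uptime REAL)")
conn.executemany("INSERT INTO stats VALUES (?, ?)", [(1, 0.5), (1, 0.25), (2, 1.0)])
uptime_sums = find_uptime_sum_by_monitor_ids(conn, [1, 2, 3])
```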
If SECRET_KEY is not set in the env, mon-admin used to fall back to a
hardcoded dev key. Now it generates a random secret on startup
(secrets.token_hex(32)), so a forgotten env var no longer leaves sessions
signed with the hardcoded key. The random secret changes on every start, so
set SECRET_KEY explicitly if you want sessions to survive container restarts.
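The fallback logic amounts to something like the following. resolve_secret_key is a hypothetical helper; only the SECRET_KEY variable and the secrets.token_hex(32) call come from the entry.

```python
import os
import secrets
from typing import Mapping


def resolve_secret_key(env: Mapping[str, str] = os.environ) -> str:
    """Return SECRET_KEY from the environment, or a fresh random secret."""
    key = env.get("SECRET_KEY")
    if key:
        return key
    # 32 random bytes, hex-encoded (64 characters); differs on every start.
    return secrets.token_hex(32)


explicit = resolve_secret_key({"SECRET_KEY": "set-by-operator"})
generated = resolve_secret_key({})
```

Because the generated value is never persisted, every restart invalidates existing sessions unless SECRET_KEY is set.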
Outgoing checks to 95.216.0.0/16 (Hetzner node fleet) were getting
rate-limited because they were exiting from Frost's secondary IP, which
Hetzner had on a fail2ban list. Added iptables -t nat -A POSTROUTING
-d 95.216.0.0/16 -j SNAT --to-source 129.146.92.139 and saved via
netfilter-persistent. No more 30-second TCP check timeouts.
mon-admin has been restyled with Tabler. Cards, badges, sidebar nav, breadcrumbs — all matching the rest of the Duck dashboard. Dark mode support is queued for v2.3.
The host health agent now appends a Disks=root=42%,storage=71% chunk to
each push message. mon-admin parses it and shows the worst-disk bar on
the server card. Existing agents will be auto-upgraded over the next 48h
via the deploy script; nothing to do.
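Parsing that chunk could look roughly like this. parse_disks and worst_disk are hypothetical helpers, and the text around the Disks= chunk in the sample message is made up; only the Disks=root=42%,storage=71% format comes from the entry.

```python
import re
from typing import Dict, Optional, Tuple

# Matches the Disks=... chunk up to the next whitespace.
_DISKS_RE = re.compile(r"Disks=(\S+)")


def parse_disks(message: str) -> Dict[str, int]:
    """Extract {disk name: percent used} from a push message."""
    match = _DISKS_RE.search(message)
    if not match:
        return {}
    usage: Dict[str, int] = {}
    for item in match.group(1).split(","):
        name, _, pct = item.partition("=")
        pct = pct.rstrip("%")
        if name and pct.isdigit():  # skip malformed entries silently
            usage[name] = int(pct)
    return usage


def worst_disk(usage: Dict[str, int]) -> Optional[Tuple[str, int]]:
    """Return the (name, percent) of the fullest disk, or None if there is none."""
    if not usage:
        return None
    return max(usage.items(), key=lambda kv: kv[1])


usage = parse_disks("OK Disks=root=42%,storage=71%")
worst = worst_disk(usage)
```

Messages from agents that have not been upgraded simply yield an empty dict, so the server card can hide the bar instead of failing.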