feat: add Logto healthcheck to /health endpoint #2
Reference: maximus/vps-health-api#2
Fixes #1.
Changes
- `logto: {status, responseTimeMs, error?}` field in `/health` response
- `LOGTO_HEALTH_URL` env (default: `https://auth.lacompagniemaximus.com/oidc/.well-known/openid-configuration`)
- Timeout enforced via `AbortController`; `/health` stays HTTP 200 even if Logto is down
- `getCpuPercent` converted to async (`setTimeout`-based delay) so the 500ms CPU sample and the Logto fetch run concurrently via `Promise.all`; total latency stays `max(500ms, <=3000ms)` instead of the sum
- `CLAUDE.md` (previously untracked) with the new field documented
Smoke tests (local)
Last row confirms the `AbortController` timeout bound and that `/health` still returns 200 on Logto down.
Acceptance criteria
- `/health` response includes `logto: {status, responseTimeMs, error?}`
- `/health` stays HTTP 200 when Logto is down
- CPU sample and Logto fetch run concurrently (`Promise.all`)
Verdict: APPROVE
Summary
Clean implementation of the Logto healthcheck with correct timeout bounding, proper parallelization, and fail-safe error handling. The smoke test matrix covers the four key scenarios (happy path, DNS fail, HTTP error, timeout) and confirms `/health` stays HTTP 200 on Logto down.
Checklist walkthrough
- Security: OK. No secrets in the diff, `.env.example` updated (not `.env`), URL goes through `fetch` (no shell/SQL). Auth gate still runs before `getHealth()`.
- Correctness: OK.
  - `getCpuPercent` rewrite from `execSync("sleep 0.5")` to `await delay(500)` removes blocking and enables parallelism.
  - `Promise.all([getCpuPercent(), getLogtoHealth()])` gives `max(500ms, <=3000ms)` as claimed.
  - `AbortController` + `clearTimeout` in `finally`: no leaked timer.
  - `getLogtoHealth` always resolves (never throws), so `/health` stays 200 when Logto is down. Confirmed by row 4 of the smoke matrix (httpbin delay/10).
  - `AbortError` → `"timeout"`.
- Tests: No automated test suite exists for this ~127-line service, consistent with project conventions. The PR documents a 4-scenario manual smoke matrix including the timeout boundary. Acceptable.
- Quality: Minimal, idiomatic, with a good inline comment explaining the async CPU sampler. `CLAUDE.md` now tracked and updated.
- Data: N/A (no DB, no migrations).
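For readers without the diff open, the pattern described under Correctness can be sketched roughly as follows. This is a reconstruction from the review notes, not the code in `index.js`: the `"up"` status value, the `HTTP <code>` error string, the injectable `fetchImpl`/`timeoutMs` options, and the placeholder `sampleCpuPercent` are assumptions made for illustration. It assumes Node 18+ (global `fetch` and `AbortController`).

```javascript
// Illustrative sketch only; names follow the review, bodies are assumed.
const LOGTO_TIMEOUT_MS = 3000;

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Probes the Logto URL. Always resolves (never rejects), so a Logto outage
// cannot turn the /health response into an error.
async function getLogtoHealth(url, { fetchImpl = fetch, timeoutMs = LOGTO_TIMEOUT_MS } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const startedAt = Date.now();
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    const responseTimeMs = Date.now() - startedAt;
    if (!res.ok) {
      return { status: "down", responseTimeMs, error: `HTTP ${res.status}` };
    }
    return { status: "up", responseTimeMs };
  } catch (err) {
    return {
      status: "down",
      responseTimeMs: Date.now() - startedAt,
      // An aborted fetch rejects with an error named "AbortError".
      error: err.name === "AbortError" ? "timeout" : err.message,
    };
  } finally {
    clearTimeout(timer); // runs on every path, so no timer is leaked
  }
}

// Placeholder for the async CPU sampler: waits without blocking the event
// loop. The real sampler would read CPU counters before and after the wait.
async function sampleCpuPercent(sampleMs = 500) {
  await delay(sampleMs);
  return 0; // hypothetical value
}

// Both operations start together, so latency is the slower of the two
// rather than their sum.
async function getHealth(url, options = {}) {
  const [cpuPercent, logto] = await Promise.all([
    sampleCpuPercent(),
    getLogtoHealth(url, options),
  ]);
  return { cpuPercent, logto };
}
```

Because `getLogtoHealth` converts every failure into a resolved `{status: "down", ...}` object, `Promise.all` never rejects on Logto's account and the handler can keep returning HTTP 200.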
Suggestions (non-blocking)
- `index.js:12`: `LOGTO_TIMEOUT_MS` is hardcoded. If you ever want to tune it in prod without a rebuild, expose it via env. Not needed right now.
- `index.js:155`: the 500 branch returns `err.message` (pre-existing pattern). Fine while the endpoint is auth-gated, but worth keeping in mind if the service is ever fronted by a public probe: a `readFileSync` error could leak a filesystem path.
- `index.js:47-63` (optional): log a server-side line on Logto `down` so post-mortems don't depend on catching the JSON response at the moment it happened.
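If any of these suggestions is picked up later, one possible shape is sketched below. Everything here is hypothetical: the helper names (`readTimeoutMs`, `toErrorBody`, `logIfDown`), the `LOGTO_TIMEOUT_MS` environment variable, the log format, and the generic error body are illustrations, not part of this PR.

```javascript
// Hypothetical helpers illustrating the three suggestions above.

// Parse a positive integer from an env-style string; fall back otherwise.
function readTimeoutMs(raw, fallback = 3000) {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Assumed env var name; unset or invalid values keep the 3000ms default.
const timeoutMs = readTimeoutMs(process.env.LOGTO_TIMEOUT_MS);

// Keep the detailed message in server logs and return a generic body, so a
// filesystem path inside err.message never reaches the client.
function toErrorBody(err, log = console.error) {
  log(`[health] internal error: ${err.message}`);
  return { error: "internal error" };
}

// Emit one server-side line whenever the Logto probe reports "down".
function logIfDown(logto, log = console.warn) {
  if (logto.status === "down") {
    log(`[health] logto down: ${logto.error ?? "unknown"} (${logto.responseTimeMs}ms)`);
  }
  return logto;
}
```

`logIfDown` returns its argument unchanged, so it could wrap the existing probe result without altering the `/health` payload.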