Add generic OIDC provider support (Authentik, Keycloak, Zitadel, Google as OIDC, …) #85

Closed

opened 2026-06-18 00:40:24 +00:00 by james · 0 comments

james commented

2026-06-18 00:40:24 +00:00

Owner

Follow-up to #12 / ADR-0015. The current OAuth implementation only handles plain OAuth2 with hand-written fetchProfile per provider (GitHub today). Many self-hosters consolidate their auth behind an OIDC IdP — Authentik is the most common in Carol's target audience, but Keycloak, Zitadel, and "Google as OIDC instead of OAuth2 userinfo" all benefit from the same code path.

ADR-0015 explicitly deferred OIDC:

OpenID Connect deferred. GitHub uses plain OAuth2. … v1 stays with userinfo-only. OIDC arrives in a follow-up that brings the JWKS handling.

This is that follow-up.

Scope

A single generic-oidc provider that's configured dynamically from env, not a per-IdP entry in the static PROVIDERS registry. The self-hoster declares one or more OIDC instances:

# Required per instance.
OIDC_AUTHENTIK_ISSUER=https://auth.example.com/application/o/carol/
OIDC_AUTHENTIK_CLIENT_ID=…
OIDC_AUTHENTIK_CLIENT_SECRET=…
OIDC_AUTHENTIK_LABEL=Authentik

# Optional per-endpoint overrides. Take precedence over the
# corresponding fields in the discovery doc when set. Use these
# only when the IdP mis-publishes its discovery doc — see
# "Per-endpoint overrides" below.
OIDC_AUTHENTIK_AUTH_ENDPOINT=…       # overrides authorization_endpoint
OIDC_AUTHENTIK_TOKEN_ENDPOINT=…      # overrides token_endpoint
OIDC_AUTHENTIK_USERINFO_ENDPOINT=…   # overrides userinfo_endpoint
OIDC_AUTHENTIK_JWKS_URI=…            # overrides jwks_uri

The naming pattern (OIDC_<NAME>_*) lets multiple OIDC IdPs coexist (e.g. a homelab Authentik for the admin + a Google-via-OIDC for invitees). LABEL controls button copy; <NAME> is the URL slug used in /api/auth/oauth/callback/<name>.
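To make the naming pattern concrete, here is a minimal sketch of how the OIDC_<NAME>_* variables might be collected into per-instance config. The names parseOidcInstances, OidcInstanceConfig, and EndpointKey are hypothetical and not taken from Carol's codebase. The sketch also assumes <NAME> contains only uppercase letters and digits and that an instance is declared by the presence of its ISSUER variable.

```typescript
// Hypothetical types; field names are illustrative only.
type EndpointKey = "authorization_endpoint" | "token_endpoint" | "userinfo_endpoint" | "jwks_uri";

interface OidcInstanceConfig {
  slug: string; // lower-cased <NAME>, used in /api/auth/oauth/callback/<name>
  label: string; // button copy
  issuer: string;
  clientId: string;
  clientSecret: string;
  overrides: Partial<Record<EndpointKey, string>>;
}

type Env = Record<string, string | undefined>;

// Env suffix -> discovery-doc field it overrides (as listed in the example above).
const OVERRIDE_SUFFIXES: Record<string, EndpointKey> = {
  AUTH_ENDPOINT: "authorization_endpoint",
  TOKEN_ENDPOINT: "token_endpoint",
  USERINFO_ENDPOINT: "userinfo_endpoint",
  JWKS_URI: "jwks_uri",
};

const REQUIRED_SUFFIXES = ["ISSUER", "CLIENT_ID", "CLIENT_SECRET", "LABEL"];

function parseOidcInstances(env: Env): OidcInstanceConfig[] {
  // Assumption: an instance exists wherever OIDC_<NAME>_ISSUER is present.
  const names = Object.keys(env)
    .map((key) => /^OIDC_([A-Z0-9]+)_ISSUER$/.exec(key)?.[1])
    .filter((name): name is string => name !== undefined)
    .sort();

  return names.map((name) => {
    const read = (suffix: string): string | undefined =>
      env[`OIDC_${name}_${suffix}`]?.trim() || undefined;

    // Fail loudly when a required variable is absent or blank.
    const missing = REQUIRED_SUFFIXES.filter((suffix) => !read(suffix));
    if (missing.length > 0) {
      const vars = missing.map((suffix) => `OIDC_${name}_${suffix}`).join(", ");
      throw new Error(`OIDC instance ${name}: missing ${vars}`);
    }

    const overrides: Partial<Record<EndpointKey, string>> = {};
    for (const [suffix, field] of Object.entries(OVERRIDE_SUFFIXES)) {
      const value = read(suffix);
      if (value) overrides[field] = value;
    }

    return {
      slug: name.toLowerCase(),
      label: read("LABEL") as string,
      issuer: read("ISSUER") as string,
      clientId: read("CLIENT_ID") as string,
      clientSecret: read("CLIENT_SECRET") as string,
      overrides,
    };
  });
}
```

Passing the environment in as an argument (rather than reading a global) keeps the parser easy to test with a plain object.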

OIDC discovery via <issuer>/.well-known/openid-configuration, fetched once at startup (or on first use) and cached. The doc supplies authorization_endpoint, token_endpoint, userinfo_endpoint, and jwks_uri. Discovery is the default path; these four endpoints come from the doc unless an override env var says otherwise.
Per-endpoint overrides. Each of the four endpoints can be overridden individually via OIDC_<NAME>_<ENDPOINT> env vars (see example above). Resolution order per endpoint: OIDC_<NAME>_<ENDPOINT> env var → discovery-doc value → startup error if neither is present.

Why we ship this on day one:
- Older Authentik versions published a token_endpoint with a trailing slash in the well-known doc while the actual endpoint rejected it (and vice versa). The override unblocks the self-hoster without waiting for an IdP upgrade.
- Keycloak realms reachable through a reverse-proxy sometimes have a discovery doc that returns the internal hostname for individual endpoints; pointing those at the public hostname needs a per-endpoint override, not an issuer change (the issuer claim must still match what the IdP signs into id_tokens).
- Forward-compat: future IdP quirks land here as data, not as a code change.
Overrides are validated the same way discovered values are (must be https://, must be same-origin as the issuer for the auth + token + userinfo endpoints; JWKS may be a different host since some IdPs publish JWKS from a CDN).
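The resolution order and validation rules above could be sketched roughly as follows. The function names (discoveryUrl, validateEndpoint, resolveEndpoints) and the EndpointMap type are hypothetical. The sketch assumes the discovery doc has already been fetched and parsed into a plain object.

```typescript
type EndpointKey = "authorization_endpoint" | "token_endpoint" | "userinfo_endpoint" | "jwks_uri";
type EndpointMap = Partial<Record<EndpointKey, string>>;

const ENDPOINT_KEYS: EndpointKey[] = [
  "authorization_endpoint",
  "token_endpoint",
  "userinfo_endpoint",
  "jwks_uri",
];

// Avoid a doubled slash when the configured issuer ends in "/".
function discoveryUrl(issuer: string): string {
  return issuer.replace(/\/+$/, "") + "/.well-known/openid-configuration";
}

// The same checks apply whether the value came from an override or from discovery.
function validateEndpoint(key: EndpointKey, value: string, issuer: string): string {
  let url: URL;
  try {
    url = new URL(value);
  } catch {
    throw new Error(`${key}: not a valid URL: ${value}`);
  }
  if (url.protocol !== "https:") {
    throw new Error(`${key}: must be https://`);
  }
  // JWKS may live on another host (e.g. a CDN); the other three may not.
  if (key !== "jwks_uri" && url.origin !== new URL(issuer).origin) {
    throw new Error(`${key}: must be same-origin as the issuer`);
  }
  // Return the configured string untouched so trailing slashes survive as written.
  return value;
}

function resolveEndpoints(
  issuer: string,
  overrides: EndpointMap,
  discovered: EndpointMap,
): Record<EndpointKey, string> {
  const resolved = {} as Record<EndpointKey, string>;
  for (const key of ENDPOINT_KEYS) {
    // Env override first, then the discovery doc, otherwise a startup error.
    const value = overrides[key] ?? discovered[key];
    if (!value) {
      throw new Error(`${key}: no override and no discovery value for ${issuer}`);
    }
    resolved[key] = validateEndpoint(key, value, issuer);
  }
  return resolved;
}
```

Only the winning value is validated, so a discovery doc that publishes an internal hostname does not block startup once the matching override is set.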
id_token validation (the load-bearing security piece, and why it was deferred):
- JWKS fetch from the resolved jwks_uri (discovery or override), cached with TTL, refreshed on kid miss.
- Verify the JWT signature against the JWKS entry matching the token header's kid.
- Verify iss matches the configured issuer (no override — iss is the IdP's self-identification and must match the trust anchor we configured).
- Verify aud includes the configured client_id.
- Verify exp (not expired) and iat/nbf within tolerance.
- Verify nonce matches a value Carol set at /start (new cookie in addition to state + PKCE).
- Extract sub as providerUserId, email + email_verified for the email policy.
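As a rough illustration of the claim checks listed above, the sketch below covers everything after signature verification, which is assumed to have already succeeded against the JWKS. The names checkIdTokenClaims, ExpectedContext, and ClaimsResult are hypothetical, as are the reason strings other than no_verified_email. Applying the tolerance only to iat/nbf, and not to exp, is an assumption about how "within tolerance" is meant.

```typescript
interface IdTokenClaims {
  iss?: string;
  aud?: string | string[];
  exp?: number;
  iat?: number;
  nbf?: number;
  nonce?: string;
  sub?: string;
  email?: string;
  email_verified?: boolean;
}

interface ExpectedContext {
  issuer: string; // OIDC_<NAME>_ISSUER, never overridden
  clientId: string;
  nonce: string; // value stored in the nonce cookie at /start
  nowSeconds: number;
  toleranceSeconds: number; // clock-skew allowance for iat/nbf
}

type ClaimsResult =
  | { ok: true; providerUserId: string; email: string }
  | { ok: false; reason: string };

function checkIdTokenClaims(claims: IdTokenClaims, expected: ExpectedContext): ClaimsResult {
  const fail = (reason: string): ClaimsResult => ({ ok: false, reason });
  const { nowSeconds, toleranceSeconds } = expected;

  if (claims.iss !== expected.issuer) return fail("iss_mismatch");

  // aud may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : claims.aud ? [claims.aud] : [];
  if (!audiences.includes(expected.clientId)) return fail("aud_mismatch");

  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) return fail("expired");
  if (typeof claims.iat !== "number" || claims.iat > nowSeconds + toleranceSeconds) {
    return fail("iat_out_of_range");
  }
  if (claims.nbf !== undefined && claims.nbf > nowSeconds + toleranceSeconds) {
    return fail("not_yet_valid");
  }

  if (claims.nonce !== expected.nonce) return fail("nonce_mismatch");
  if (typeof claims.sub !== "string" || claims.sub === "") return fail("missing_sub");

  // email_verified must be exactly true; a missing flag is treated as unverified.
  if (typeof claims.email !== "string" || claims.email_verified !== true) {
    return fail("no_verified_email");
  }

  return { ok: true, providerUserId: claims.sub, email: claims.email };
}
```

Returning a reason string for each rejection, rather than a bare boolean, lets each rejection path be asserted separately in tests.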
All existing security invariants from ADR-0015 hold.
- "Do not auto-merge by email" — same policy applies (refuse signup when email_in_use; #72 unblocks the recovery flow).
- email_verified === true required. The id_token carries the flag; refuse if false.
- state, PKCE (S256), redirect-URI lock-in, mix-up defence, in-same-response cookie clear — all unchanged; just add the nonce cookie alongside state + PKCE.
Routes work without code changes. The existing /api/auth/oauth/start and /api/auth/oauth/callback/[provider] accept the dynamic provider id; the registry / resolution layer surfaces all enabled providers (static + each configured OIDC instance) via getEnabledProviders().
UI works without code changes. <OAuthButtons> already renders one button per enabled provider; OIDC instances appear automatically. The /account page lists per-instance linked identities the same way.
ADR-0016 records the load-bearing pieces:
- JWKS handling: cache strategy + rotation policy + the kid-miss refresh.
- Discovery caching policy.
- Endpoint resolution order (env override → discovery → error) and the rationale for shipping the override path on day one rather than as a follow-up.
- Why a single generic-oidc provider over named per-IdP entries (one code path, env-driven configuration, fits self-hoster workflow).
- Why we still do id_token-only and not "userinfo as fallback if id_token lacks email" — keeps the path simple; if a provider doesn't put email/email_verified in the id_token's standard claims, it's not v1-supported.
- Where Authentik / Keycloak / Zitadel quirks have to live — overrides for endpoint mismatches, a single ADR appendix list for known-bad discovery docs.
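For the JWKS piece ADR-0016 is meant to record, one possible shape of "cached with TTL, refreshed on kid miss" is sketched below. JwksCache and its constructor parameters are hypothetical. The fetcher and clock are injected so the sketch makes no assumption about how Carol performs HTTP requests.

```typescript
interface Jwk {
  kid?: string;
  kty: string;
  [param: string]: unknown;
}

class JwksCache {
  private keys: Jwk[] = [];
  private fetchedAt = -Infinity; // forces a fetch on first use
  private readonly fetchKeys: () => Promise<Jwk[]>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    fetchKeys: () => Promise<Jwk[]>,
    ttlMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.fetchKeys = fetchKeys;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  private async refresh(): Promise<void> {
    this.keys = await this.fetchKeys();
    this.fetchedAt = this.now();
  }

  async getKey(kid: string): Promise<Jwk | undefined> {
    const expired = this.now() - this.fetchedAt >= this.ttlMs;
    if (expired) await this.refresh();

    let key = this.keys.find((candidate) => candidate.kid === kid);
    if (!key && !expired) {
      // kid miss on a still-fresh cache: the IdP may have rotated keys,
      // so refetch once before giving up.
      await this.refresh();
      key = this.keys.find((candidate) => candidate.kid === kid);
    }
    return key;
  }
}
```

This sketch refetches on every unknown kid. A real rotation policy would probably also want a minimum interval between refreshes, so that tokens carrying bogus kid values cannot be used to hammer the IdP.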

Acceptance criteria

A self-hoster with an Authentik OIDC application can set the four OIDC_AUTHENTIK_* env vars + register the callback URL <APP_URL>/api/auth/oauth/callback/authentik in Authentik, and "Sign in with Authentik" appears on /login, /register, and /account "Connect another".
Sign-in, sign-up, link, unlink flows all work against Authentik with the same five branches the existing decision tree models (login_existing / signup_new / linked_to_current / refused_email_in_use / refused_belongs_to_other).
Multiple OIDC instances configured simultaneously (e.g. OIDC_AUTHENTIK_* + OIDC_KEYCLOAK_*) all surface as separate buttons and work independently. A user can link both to one Carol account.
Endpoint overrides take precedence over the discovery doc. Tests demonstrate the resolution order with a deliberately mis-published discovery doc plus an override env var: the override wins, the discovery doc isn't re-fetched, and the sign-in flow completes.
Missing endpoint (neither override nor discovery) is a startup error, not a runtime failure on first sign-in attempt. Configuration validity is provable from npm run build / container start.
id_token validation refuses tokens with: bad signature, mismatched iss, missing aud, expired exp, mismatched nonce. Each refusal is a test case.
email_verified !== true in the id_token results in the existing no_verified_email flow.
Tests cover: discovery doc parsing, JWKS fetch + cache hit / cache miss / kid rotation, every id_token rejection path, the full sign-in flow against a mocked OIDC IdP, the override-precedence path, and the override + cross-origin JWKS host case (JWKS is allowed to live on a different host; auth/token/userinfo are not).
End-to-end against a local Authentik (manual probe documented in docs/oidc-self-hoster-guide.md).
ADR-0016 written and linked from docs/adr/README.md.
docs/oidc-self-hoster-guide.md (or a section in docs/ci.md) documents the Authentik + Keycloak setup recipes (one paragraph each — what to register where, which env vars to set, when to reach for an endpoint override).

Out of scope

Dynamic client registration (RFC 7591). Self-hosters register the OIDC client manually in their IdP; that's the standard self-hoster workflow.
Refresh-token support. Carol still discards the access token after profile read (ADR-0015 stance unchanged). If a future feature needs IdP-side group lookups or scope upgrades, that's its own ticket.
Auth0 / Okta / similar paid SSO with their own SDKs. Generic OIDC covers them at the protocol level; first-class branded providers are deferred.
Replacing the GitHub provider with OIDC. GitHub doesn't expose an OIDC discovery doc for the user-login flow; it stays as the plain-OAuth2 path. The OIDC code lives alongside, not above.
SCIM / user provisioning. Pure sign-in is in scope here.
Group/role claim mapping. If we later add roles beyond is_admin, mapping OIDC group claims onto Carol roles is its own ticket.
Overriding the issuer claim. iss stays anchored to OIDC_<NAME>_ISSUER and is checked against what the IdP signs into the id_token. A mismatch is fatal — adding an "expected_iss" override would let an attacker who controls a domain claim to be any other IdP.

Composes with

#12 / PR #73 — the OAuth backbone this PR rides on. The route handlers + the linking decision tree + the unlink action all stay; only the provider registry grows.
#72 — verified-email recovery for the email_in_use lockout. OIDC providers all expose email_verified, so the recovery flow becomes more reliably usable once both ship.
ADR-0015 — explicitly named this follow-up. Reading the "Threat model and responses" + "OpenID Connect deferred" sections is the starting point for ADR-0016.

Part of epic #1.


james added the area:auth label 2026-06-18 00:41:10 +00:00

james referenced this issue from a commit 2026-06-18 02:14:16 +00:00: feat(auth): generic OIDC provider with discovery + per-endpoint overrides

james referenced this issue from a pull request that will close it 2026-06-18 02:14:56 +00:00: feat(auth): generic OIDC provider with discovery + per-endpoint overrides (#85) #96

james referenced this issue from a commit 2026-06-18 02:29:00 +00:00