Add generic OIDC provider support (Authentik, Keycloak, Zitadel, Google as OIDC, …) #85

Closed
opened 2026-06-18 00:40:24 +00:00 by james · 0 comments
Owner

Follow-up to #12 / ADR-0015. The current OAuth implementation only handles plain OAuth2 with hand-written fetchProfile per provider (GitHub today). Many self-hosters consolidate their auth behind an OIDC IdP — Authentik is the most common in Carol's target audience, but Keycloak, Zitadel, and "Google as OIDC instead of OAuth2 userinfo" all benefit from the same code path.

ADR-0015 explicitly deferred OIDC:

OpenID Connect deferred. GitHub uses plain OAuth2. … v1 stays with userinfo-only. OIDC arrives in a follow-up that brings the JWKS handling.

This is that follow-up.

Scope

  • A single generic-oidc provider that's configured dynamically from env, not a per-IdP entry in the static PROVIDERS registry. The self-hoster declares one or more OIDC instances:

    # Required per instance.
    OIDC_AUTHENTIK_ISSUER=https://auth.example.com/application/o/carol/
    OIDC_AUTHENTIK_CLIENT_ID=…
    OIDC_AUTHENTIK_CLIENT_SECRET=…
    OIDC_AUTHENTIK_LABEL=Authentik
    
    # Optional per-endpoint overrides. Take precedence over the
    # corresponding fields in the discovery doc when set. Use these
    # only when the IdP mis-publishes its discovery doc — see
    # "Per-endpoint overrides" below.
    OIDC_AUTHENTIK_AUTH_ENDPOINT=… # overrides authorization_endpoint
    OIDC_AUTHENTIK_TOKEN_ENDPOINT=… # overrides token_endpoint
    OIDC_AUTHENTIK_USERINFO_ENDPOINT=… # overrides userinfo_endpoint
    OIDC_AUTHENTIK_JWKS_URI=… # overrides jwks_uri
    

    The naming pattern (OIDC_<NAME>_*) lets multiple OIDC IdPs coexist (e.g. a homelab Authentik for the admin + a Google-via-OIDC for invitees). LABEL controls button copy; <NAME> is the URL slug used in /api/auth/oauth/callback/<name>.

  • OIDC discovery via <issuer>/.well-known/openid-configuration, fetched once at startup (or on first use) and cached. The doc supplies authorization_endpoint, token_endpoint, userinfo_endpoint, and jwks_uri. Discovery is the default path; these four endpoints come from the doc unless an override env var says otherwise.

  • Per-endpoint overrides. Each of the four endpoints can be overridden individually via OIDC_<NAME>_<ENDPOINT> env vars (see example above). Resolution order per endpoint: OIDC_<NAME>_<ENDPOINT> env var → discovery-doc value → startup error if neither is present.

    Why we ship this on day one:

    • Older Authentik versions had the well-known doc publish a token_endpoint that included a trailing slash but the actual endpoint rejected it (and vice-versa). The override unblocks the self-hoster without waiting for an IdP upgrade.
    • Keycloak realms reachable through a reverse-proxy sometimes have a discovery doc that returns the internal hostname for individual endpoints; pointing those at the public hostname needs a per-endpoint override, not an issuer change (the issuer claim must still match what the IdP signs into id_tokens).
    • Forward-compat: future IdP quirks land here as data, not as a code change.

    Overrides are validated the same way discovered values are (must be https://, must be same-origin as the issuer for the auth + token + userinfo endpoints; JWKS may be a different host since some IdPs publish JWKS from a CDN).

  • id_token validation (the load-bearing security piece, and why it was deferred):

    • JWKS fetch from the resolved jwks_uri (discovery or override), cached with TTL, refreshed on kid miss.
    • Verify the JWT signature against the JWKS entry matching the token header's kid.
    • Verify iss matches the configured issuer (no override — iss is the IdP's self-identification and must match the trust anchor we configured).
    • Verify aud includes the configured client_id.
    • Verify exp (not expired) and iat/nbf within tolerance.
    • Verify nonce matches a value Carol set at /start (new cookie in addition to state + PKCE).
    • Extract sub as providerUserId, email + email_verified for the email policy.
  • All existing security invariants from ADR-0015 hold.

    • "Do not auto-merge by email" — same policy applies (refuse signup when email_in_use; #72 unblocks the recovery flow).
    • email_verified === true required. The id_token carries the flag; refuse if false.
    • state, PKCE (S256), redirect-URI lock-in, mix-up defence, in-same-response cookie clear — all unchanged; just add the nonce cookie alongside state + PKCE.
  • Routes work without code changes. The existing /api/auth/oauth/start and /api/auth/oauth/callback/[provider] accept the dynamic provider id; the registry / resolution layer surfaces all enabled providers (static + each configured OIDC instance) via getEnabledProviders().

  • UI works without code changes. <OAuthButtons> already renders one button per enabled provider; OIDC instances appear automatically. The /account page lists per-instance linked identities the same way.

  • ADR-0016 records the load-bearing pieces:

    • JWKS handling: cache strategy + rotation policy + the kid-miss refresh.
    • Discovery caching policy.
    • Endpoint resolution order (env override → discovery → error) and the rationale for shipping the override path on day one rather than as a follow-up.
    • Why a single generic-oidc provider over named per-IdP entries (one code path, env-driven configuration, fits self-hoster workflow).
    • Why we still do id_token-only and not "userinfo as fallback if id_token lacks email" — keeps the path simple; if a provider doesn't put email/email_verified in the id_token's standard claims, it's not v1-supported.
    • Where Authentik / Keycloak / Zitadel quirks have to live — overrides for endpoint mismatches, a single ADR appendix list for known-bad discovery docs.
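
The env-driven instance discovery and the per-endpoint resolution order described in Scope could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (listOidcInstances, resolveEndpoints), the types, the error messages, and the restriction of <NAME> to letters and digits are hypothetical; only the env var names, the override → discovery → error order, and the https / same-origin rules come from the text above.

```typescript
// Hypothetical helper types; not taken from Carol's codebase.
type EndpointKey =
  | "authorization_endpoint"
  | "token_endpoint"
  | "userinfo_endpoint"
  | "jwks_uri";

type Env = Record<string, string | undefined>;
type DiscoveryDoc = Partial<Record<EndpointKey, string>>;
type ResolvedEndpoints = Record<EndpointKey, string>;

// Discovery-doc field -> env suffix, as in the example config above.
const ENV_SUFFIX: Record<EndpointKey, string> = {
  authorization_endpoint: "AUTH_ENDPOINT",
  token_endpoint: "TOKEN_ENDPOINT",
  userinfo_endpoint: "USERINFO_ENDPOINT",
  jwks_uri: "JWKS_URI",
};

// Find configured instances by scanning for OIDC_<NAME>_ISSUER.
// Assumption: <NAME> contains only A-Z and 0-9; the slug is its lowercase form.
function listOidcInstances(env: Env): string[] {
  const names: string[] = [];
  for (const key of Object.keys(env)) {
    const name = /^OIDC_([A-Z0-9]+)_ISSUER$/.exec(key)?.[1];
    if (name && env[key]) names.push(name.toLowerCase());
  }
  return names.sort();
}

// Resolve the four endpoints for one instance, throwing on any invalid or
// missing value so that misconfiguration can fail at startup.
function resolveEndpoints(name: string, env: Env, doc: DiscoveryDoc): ResolvedEndpoints {
  const prefix = `OIDC_${name.toUpperCase()}_`;
  const issuer = env[`${prefix}ISSUER`];
  if (!issuer) throw new Error(`${prefix}ISSUER is required`);
  const issuerOrigin = new URL(issuer).origin;

  const resolved = {} as ResolvedEndpoints;
  for (const key of Object.keys(ENV_SUFFIX) as EndpointKey[]) {
    // Resolution order: env override, then discovery doc, then error.
    // An empty override is treated as unset (an assumption).
    const value = env[prefix + ENV_SUFFIX[key]] || doc[key];
    if (!value) {
      throw new Error(`${name}: ${key} missing from both env override and discovery doc`);
    }
    const url = new URL(value);
    if (url.protocol !== "https:") {
      throw new Error(`${name}: ${key} must be https://`);
    }
    // JWKS may live on a different host (e.g. a CDN); the other three may not.
    if (key !== "jwks_uri" && url.origin !== issuerOrigin) {
      throw new Error(`${name}: ${key} must be same-origin as the issuer`);
    }
    resolved[key] = value;
  }
  return resolved;
}
```

Calling something like resolveEndpoints for every configured instance during startup is one way a missing endpoint could surface as a startup error rather than on the first sign-in attempt.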

Acceptance criteria

  • A self-hoster with an Authentik OIDC application can set the four required OIDC_AUTHENTIK_* env vars (ISSUER, CLIENT_ID, CLIENT_SECRET, LABEL) + register the callback URL <APP_URL>/api/auth/oauth/callback/authentik in Authentik, and "Sign in with Authentik" appears on /login, /register, and /account "Connect another".
  • Sign-in, sign-up, link, unlink flows all work against Authentik with the same five branches the existing decision tree models (login_existing / signup_new / linked_to_current / refused_email_in_use / refused_belongs_to_other).
  • Multiple OIDC instances configured simultaneously (e.g. OIDC_AUTHENTIK_* + OIDC_KEYCLOAK_*) all surface as separate buttons and work independently. A user can link both to one Carol account.
  • Endpoint overrides take precedence over the discovery doc. Tests demonstrate the resolution order with a deliberately mis-published discovery doc plus an override env var: the override wins, the discovery doc isn't re-fetched, and the sign-in flow completes.
  • Missing endpoint (neither override nor discovery) is a startup error, not a runtime failure on first sign-in attempt. Configuration validity is provable from npm run build / container start.
  • id_token validation refuses tokens with: bad signature, mismatched iss, missing aud, expired exp, mismatched nonce. Each refusal is a test case.
  • email_verified !== true in the id_token results in the existing no_verified_email flow.
  • Tests cover: discovery doc parsing, JWKS fetch + cache hit / cache miss / kid rotation, every id_token rejection path, the full sign-in flow against a mocked OIDC IdP, the override-precedence path, and the override + cross-origin JWKS host case (JWKS is allowed to live on a different host; auth/token/userinfo are not).
  • End-to-end against a local Authentik (manual probe documented in docs/oidc-self-hoster-guide.md).
  • ADR-0016 written and linked from docs/adr/README.md.
  • docs/oidc-self-hoster-guide.md (or a section in docs/ci.md) documents the Authentik + Keycloak setup recipes (one paragraph each — what to register where, which env vars to set, when to reach for an endpoint override).
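
The id_token rejection paths listed above lend themselves to a pure claims check that runs after signature verification, which makes each refusal easy to cover with its own test case. The sketch below is a hypothetical illustration, not Carol's implementation: the function name, the types, and every reason string except no_verified_email are assumptions, and signature verification against the JWKS is assumed to have happened already.

```typescript
// Claims of an id_token whose signature has ALREADY been verified
// against the JWKS; that step is out of scope for this sketch.
interface IdTokenClaims {
  iss?: string;
  aud?: string | string[];
  exp?: number;
  iat?: number;
  nbf?: number;
  nonce?: string;
  sub?: string;
  email?: string;
  email_verified?: boolean;
}

// Hypothetical container for the values the check compares against.
interface ExpectedValues {
  issuer: string; // OIDC_<NAME>_ISSUER, never overridden
  clientId: string; // OIDC_<NAME>_CLIENT_ID
  nonce: string; // value from the nonce cookie set at /start
  nowSeconds: number; // current time, injected for testability
  toleranceSeconds: number; // assumed clock-skew allowance for iat/nbf
}

type ClaimsResult =
  | { ok: true; providerUserId: string; email: string }
  | { ok: false; reason: string };

function checkIdTokenClaims(c: IdTokenClaims, e: ExpectedValues): ClaimsResult {
  if (c.iss !== e.issuer) return { ok: false, reason: "iss_mismatch" };

  // aud may be a single string or an array; it must include our client_id.
  const aud = Array.isArray(c.aud) ? c.aud : c.aud ? [c.aud] : [];
  if (!aud.includes(e.clientId)) return { ok: false, reason: "aud_missing" };

  // exp is required and must be in the future.
  if (typeof c.exp !== "number" || c.exp <= e.nowSeconds) {
    return { ok: false, reason: "expired" };
  }
  // iat / nbf must not lie further in the future than the tolerance.
  if (typeof c.iat === "number" && c.iat - e.toleranceSeconds > e.nowSeconds) {
    return { ok: false, reason: "iat_in_future" };
  }
  if (typeof c.nbf === "number" && c.nbf - e.toleranceSeconds > e.nowSeconds) {
    return { ok: false, reason: "not_yet_valid" };
  }

  if (!c.nonce || c.nonce !== e.nonce) return { ok: false, reason: "nonce_mismatch" };
  if (!c.sub) return { ok: false, reason: "sub_missing" };

  // email_verified must be exactly true, per the ADR-0015 invariant.
  if (c.email_verified !== true || !c.email) {
    return { ok: false, reason: "no_verified_email" };
  }
  return { ok: true, providerUserId: c.sub, email: c.email };
}
```

Injecting nowSeconds instead of reading the clock inside the function is a design choice made here so that expiry tests stay deterministic.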

Out of scope

  • Dynamic client registration (RFC 7591). Self-hosters register the OIDC client manually in their IdP; that's the standard self-hoster workflow.
  • Refresh-token support. Carol still discards the access token after profile read (ADR-0015 stance unchanged). If a future feature needs IdP-side group lookups or scope upgrades, that's its own ticket.
  • Auth0 / Okta / similar paid SSO with their own SDKs. Generic OIDC covers them at the protocol level; first-class branded providers are deferred.
  • Replacing the GitHub provider with OIDC. GitHub doesn't expose an OIDC discovery doc for the user-login flow; it stays as the plain-OAuth2 path. The OIDC code lives alongside, not above.
  • SCIM / user provisioning. Pure sign-in is in scope here.
  • Group/role claim mapping. If we later add roles beyond is_admin, mapping OIDC group claims onto Carol roles is its own ticket.
  • Overriding the issuer claim. iss stays anchored to OIDC_<NAME>_ISSUER and is checked against what the IdP signs into the id_token. A mismatch is fatal — adding an "expected_iss" override would let an attacker who controls a domain claim to be any other IdP.

Composes with

  • #12 / PR #73 — the OAuth backbone this PR rides on. The route handlers + the linking decision tree + the unlink action all stay; only the provider registry grows.
  • #72 — verified-email recovery for the email_in_use lockout. OIDC providers all expose email_verified, so the recovery flow becomes more reliably usable once both ship.
  • ADR-0015 — explicitly named this follow-up. Reading the "Threat model and responses" + "OpenID Connect deferred" sections is the starting point for ADR-0016.

Part of epic #1.

james closed this issue 2026-06-18 02:32:42 +00:00