feat(api): LlmProvider adapters — Anthropic + OpenAI-compatible inference layer #336

Closed
opened 2026-06-28 23:37:56 +00:00 by james · 0 comments
Owner

The inference layer ADR-0029 §1 calls for and #333 deferred: a thin provider abstraction with two adapters, consuming the per-user config + decrypted key from #333 and the #51 tool registry. Part of epic #47.

Bounded to non-streaming inference + provider selection. Streaming, the agent loop, conversations/messages, and the chat UI are later tickets that build on this.

Scope

  • Interface (apps/api/lib/llm/): a client abstraction — name it to avoid colliding with the existing LlmProvider enum type from #333 (e.g. LlmClient): generate(req): Promise<LlmResult>. Uses ADR-0029 §4 normalized messages:
    • LlmMessage { role: "user"|"assistant"|"tool", content: string|null, toolCalls?: {id,name,input}[], toolCallId?: string }
    • LlmGenerateRequest { system?: string, messages: LlmMessage[], tools: LlmToolDef[], model: string, maxTokens?: number }
    • LlmResult { text: string, toolCalls: {id,name,input}[], stopReason: "end_turn"|"tool_use"|"max_tokens"|"refusal"|"other" }
  • Tools from #51: convert each ToolListEntry.inputSchema (zod) → JSON Schema (zod v4 z.toJSONSchema, as the MCP endpoint #331 did) → Anthropic tool / OpenAI function. A helper builds LlmToolDef[] from allTools().
  • Anthropic adapter: @anthropic-ai/sdk. Normalized → Anthropic Messages API (system, messages with tool_use/tool_result blocks, tools as {name,description,input_schema}); messages.create; response content blocks (text + tool_use {id,name,input}) + stop_reason → LlmResult. Model from config. Inject the client (custom baseURL/fetch) for tests. Model-agnostic: omit thinking and sampling params for v1 (the loop ticket tunes effort/thinking/streaming).
  • OpenAI-compatible adapter — plain fetch to ${baseUrl}/chat/completions (OpenAI, OpenRouter, local Ollama). Normalized → OpenAI chat messages (assistant tool_calls, tool role with tool_call_id) + function-tools; parse → LlmResult. Inject fetch for tests. No openai dependency.
  • Provider selection: getLlmClientForUser(userId, db) reads the #333 config, decrypts the key (getDecryptedLlmKey), returns the right adapter, or throws a typed "not configured" / "key required" error.
  • Tests (unit, mocked transport — no real network): request translation (normalized → each provider shape); response normalization (text + tool_calls + stop reason); a tool-use round; error mapping (auth/rate-limit/overloaded); provider selection from a stored config (#333 storage, sqlite memory).
  • Deps: add @anthropic-ai/sdk (pinned, established — vet against the install-script allowlist + package-age policy). No new env var, no migration, no UI.
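A minimal sketch of the OpenAI-compatible path described in the scope above, assuming the normalized shapes exactly as listed. The LlmToolDef fields, the FetchLike type, the function names (toOpenAiBody, fromOpenAiResponse, createOpenAiCompatClient), and the finish_reason → stopReason mapping are illustrative assumptions, not the merged implementation.

```typescript
// Normalized shapes as listed in this issue (ADR-0029 §4).
type LlmToolCall = { id: string; name: string; input: unknown };
type LlmMessage = {
  role: "user" | "assistant" | "tool";
  content: string | null;
  toolCalls?: LlmToolCall[];
  toolCallId?: string;
};
// ASSUMPTION: the issue does not spell out LlmToolDef; this shape is a guess.
type LlmToolDef = { name: string; description: string; inputSchema: Record<string, unknown> };
type LlmGenerateRequest = {
  system?: string;
  messages: LlmMessage[];
  tools: LlmToolDef[];
  model: string;
  maxTokens?: number;
};
type StopReason = "end_turn" | "tool_use" | "max_tokens" | "refusal" | "other";
type LlmResult = { text: string; toolCalls: LlmToolCall[]; stopReason: StopReason };

// Normalized request → Chat Completions request body.
function toOpenAiBody(req: LlmGenerateRequest): Record<string, unknown> {
  const messages: Record<string, unknown>[] = [];
  if (req.system) messages.push({ role: "system", content: req.system });
  for (const m of req.messages) {
    const calls = m.toolCalls ?? [];
    if (m.role === "tool") {
      messages.push({ role: "tool", tool_call_id: m.toolCallId, content: m.content ?? "" });
    } else if (m.role === "assistant" && calls.length > 0) {
      messages.push({
        role: "assistant",
        content: m.content,
        // The wire format carries tool arguments as a JSON string.
        tool_calls: calls.map((c) => ({
          id: c.id,
          type: "function",
          function: { name: c.name, arguments: JSON.stringify(c.input) },
        })),
      });
    } else {
      messages.push({ role: m.role, content: m.content ?? "" });
    }
  }
  const body: Record<string, unknown> = { model: req.model, messages };
  if (req.tools.length > 0) {
    body.tools = req.tools.map((t) => ({
      type: "function",
      function: { name: t.name, description: t.description, parameters: t.inputSchema },
    }));
  }
  if (req.maxTokens !== undefined) body.max_tokens = req.maxTokens;
  return body;
}

type OpenAiChoice = {
  finish_reason: string | null;
  message: {
    content: string | null;
    tool_calls?: { id: string; function: { name: string; arguments: string } }[];
  };
};

// Chat Completions response → LlmResult.
function fromOpenAiResponse(json: { choices: OpenAiChoice[] }): LlmResult {
  const choice = json.choices[0];
  if (!choice) throw new Error("provider returned no choices");
  const toolCalls = (choice.message.tool_calls ?? []).map((c) => ({
    id: c.id,
    name: c.function.name,
    input: JSON.parse(c.function.arguments || "{}") as unknown,
  }));
  // ASSUMPTION: treating content_filter as "refusal" is a judgment call.
  const map: Record<string, StopReason> = {
    stop: "end_turn",
    tool_calls: "tool_use",
    length: "max_tokens",
    content_filter: "refusal",
  };
  return { text: choice.message.content ?? "", toolCalls, stopReason: map[choice.finish_reason ?? ""] ?? "other" };
}

// Hypothetical narrow fetch type so tests can pass a stub instead of the global.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function createOpenAiCompatClient(opts: { baseUrl: string; apiKey?: string; fetch: FetchLike }) {
  return {
    async generate(req: LlmGenerateRequest): Promise<LlmResult> {
      const headers: Record<string, string> = { "content-type": "application/json" };
      if (opts.apiKey) headers.authorization = `Bearer ${opts.apiKey}`; // omitted for keyless local servers
      const res = await opts.fetch(`${opts.baseUrl}/chat/completions`, {
        method: "POST",
        headers,
        body: JSON.stringify(toOpenAiBody(req)),
      });
      if (!res.ok) throw new Error(`provider HTTP ${res.status}`); // a real adapter would map auth/rate-limit/overloaded here
      return fromOpenAiResponse((await res.json()) as { choices: OpenAiChoice[] });
    },
  };
}
```

Keeping the two translation functions pure means the request-translation and response-normalization tests need no transport at all; only the thin generate wrapper needs the injected fetch.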

Acceptance criteria

  • LlmClient interface + Anthropic and OpenAI-compatible adapters; both translate normalized messages + #51 tools → a provider call and normalize the response (text + tool_calls + stop reason).
  • getLlmClientForUser selects the adapter from the user's stored config + decrypted key (#333/#334); keyless local (Ollama) supported.
  • Tests use injected transport (no live API calls); typecheck/lint/semgrep green.
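For the Anthropic half of the first criterion, the translation could look roughly like the sketch below. It uses plain types rather than importing @anthropic-ai/sdk so it stays self-contained; the function names (toAnthropicMessages, fromAnthropicResponse), the merging of consecutive tool results into one user turn, and mapping unlisted stop reasons such as stop_sequence to "other" are assumptions for illustration.

```typescript
// Normalized shapes as listed in this issue (ADR-0029 §4).
type ToolCall = { id: string; name: string; input: unknown };
type LlmMessage = {
  role: "user" | "assistant" | "tool";
  content: string | null;
  toolCalls?: ToolCall[];
  toolCallId?: string;
};
type StopReason = "end_turn" | "tool_use" | "max_tokens" | "refusal" | "other";
type LlmResult = { text: string; toolCalls: ToolCall[]; stopReason: StopReason };

// The subset of Messages API content blocks this sketch handles.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };
type AnthropicMessage = { role: "user" | "assistant"; content: Block[] };

// Normalized messages → Anthropic messages. The system prompt is not a
// message here; it travels as the separate top-level `system` parameter.
function toAnthropicMessages(messages: LlmMessage[]): AnthropicMessage[] {
  const out: AnthropicMessage[] = [];
  for (const m of messages) {
    const role = m.role;
    if (role === "tool") {
      const block: Block = { type: "tool_result", tool_use_id: m.toolCallId ?? "", content: m.content ?? "" };
      const last = out[out.length - 1];
      // ASSUMPTION: results for parallel tool calls are grouped into one user turn.
      if (last && last.role === "user" && last.content.every((b) => b.type === "tool_result")) {
        last.content.push(block);
      } else {
        out.push({ role: "user", content: [block] });
      }
      continue;
    }
    const blocks: Block[] = [];
    if (m.content) blocks.push({ type: "text", text: m.content });
    for (const c of m.toolCalls ?? []) {
      blocks.push({ type: "tool_use", id: c.id, name: c.name, input: c.input });
    }
    out.push({ role, content: blocks });
  }
  return out;
}

// Response content blocks + stop_reason → LlmResult.
function fromAnthropicResponse(res: { content: Block[]; stop_reason: string | null }): LlmResult {
  let text = "";
  const toolCalls: ToolCall[] = [];
  for (const b of res.content) {
    if (b.type === "text") text += b.text;
    else if (b.type === "tool_use") toolCalls.push({ id: b.id, name: b.name, input: b.input });
  }
  // These four provider values share names with the normalized ones; anything else becomes "other".
  const known: StopReason[] = ["end_turn", "tool_use", "max_tokens", "refusal"];
  const stopReason: StopReason = known.find((r) => r === res.stop_reason) ?? "other";
  return { text, toolCalls, stopReason };
}
```

Because messages.create requires max_tokens while the normalized maxTokens is optional, a real adapter would also need a default; that value is deliberately left out of the sketch.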

Out of scope

  • Streaming, the agent loop, conversations/messages, the SSE chat endpoint, the chat UI (next tickets, which consume this).

Depends on #333 (config + getDecryptedLlmKey) and #51 (registry). Implements ADR-0029.
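As a rough illustration of how provider selection might sit on top of the #333 config, the sketch below covers only the decision step after the config row and decrypted key have been loaded. Everything in it is hypothetical: the StoredLlmConfig fields (including the keyless flag), the error class names and codes, and the ClientSpec return value standing in for a constructed adapter. The actual #333 schema is not shown in this issue.

```typescript
// HYPOTHETICAL config shape; the real #333 schema may differ.
type StoredLlmConfig = {
  provider: "anthropic" | "openai_compatible";
  model: string;
  baseUrl?: string;
  keyless?: boolean; // placeholder for however a keyless local endpoint (e.g. Ollama) is marked
};

// Typed errors so callers can branch on `code` instead of parsing messages.
class LlmNotConfiguredError extends Error {
  readonly code = "llm_not_configured";
  constructor() {
    super("No LLM provider is configured for this user");
  }
}
class LlmKeyRequiredError extends Error {
  readonly code = "llm_key_required";
  constructor(provider: string) {
    super(`Provider ${provider} requires an API key`);
  }
}

// Stand-in for "the right adapter": a description of which one to build.
type ClientSpec =
  | { kind: "anthropic"; model: string; apiKey: string }
  | { kind: "openai_compatible"; model: string; baseUrl: string; apiKey?: string };

// Decision step only. In the real getLlmClientForUser(userId, db) the config
// would come from #333 storage and the key from getDecryptedLlmKey.
function selectLlmClient(config: StoredLlmConfig | null, apiKey: string | null): ClientSpec {
  if (!config) throw new LlmNotConfiguredError();
  if (config.provider === "anthropic") {
    if (!apiKey) throw new LlmKeyRequiredError(config.provider);
    return { kind: "anthropic", model: config.model, apiKey };
  }
  if (!config.baseUrl) throw new LlmNotConfiguredError();
  if (!apiKey && !config.keyless) throw new LlmKeyRequiredError(config.provider);
  return {
    kind: "openai_compatible",
    model: config.model,
    baseUrl: config.baseUrl,
    ...(apiKey ? { apiKey } : {}),
  };
}
```

Separating the pure decision from the database read keeps the "not configured" and "key required" branches testable without touching storage, while the sqlite-memory test can still exercise the full path.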

james closed this issue 2026-06-29 00:26:40 +00:00
Reference
james/carol#336