Gateway Service

The self-hosted /v1/messages inference gateway and Cowork third-party platform integration. Routes requests across Anthropic, Bedrock, Vertex AI, Azure Foundry, OpenAI, Gemini, and Groq with identity propagation, signed audit trail, and the sp-cowork-auth credential helper.

Last updated:

After Reading This

You will be able to:

Point Claude Cowork's third-party inference at your local systemprompt.io instance
Understand the /v1/messages request lifecycle through the gateway
Mint credentials via the sp-cowork-auth helper binary
Map helper-script response headers to core typed IDs (UserId, SessionId, TraceId, ClientId, TenantId)
Route inference across Anthropic, Bedrock, Vertex AI, Azure Foundry, OpenAI, Gemini, and Groq
Configure the Cowork MCP allowlist and org-plugins supply chain

On this page

TL;DR — systemprompt.io exposes a /v1/messages endpoint that is wire-compatible with Anthropic's API, plus a /v1/auth/cowork/* family for Cowork's credential helper script. Point Claude Cowork's third-party inference at it and every prompt, tool call, and cost line lands in your own database. The sp-cowork-auth binary is the one-click companion that mints credentials for the helper script across personal, team, and enterprise deployments — same binary, three tiers of identity.

Why It Matters

Anthropic documents Claude Cowork running on customer infrastructure in three support articles (14680729, 14680741, 14680753). Inference can route through Bedrock, Vertex AI, Azure Foundry, or any LLM gateway exposing /v1/messages. The last option is what makes identity propagation, cross-provider routing, and audit lineage possible — but only if that gateway exists.

The gateway service is that endpoint. It runs inside the same systemprompt.io binary as the governance pipeline, the MCP registry, and the audit log — so every Cowork request inherits the same policy enforcement and the same trace_id lineage as everything else.

Architecture

Claude Cowork
     │
     │  1. runs sp-cowork-auth (Credential helper script)
     ▼
sp-cowork-auth  ── probes mTLS → session → PAT, picks the first available
     │
     │  2. POST /v1/auth/cowork/{mtls|session|pat}
     ▼
systemprompt.io gateway
     │
     ├── mints short-lived JWT (UserId, SessionId, TraceId, ClientId, TenantId)
     ├── returns {"token", "ttl", "headers"}
     │
     │  3. Cowork issues /v1/messages with bearer JWT + merged headers
     ▼
/v1/messages middleware
     │
     ├── governance pipeline (scope, secrets, blocklist, rate limit)
     ├── resolve routing rule → provider
     ├── forward to Anthropic | Bedrock | Vertex | Azure | OpenAI | Gemini | Groq
     ├── stream response back
     └── write audit row (prompt, response, tokens, cost) keyed by trace_id

Every inference request carries the seven canonical headers core uses everywhere else (crates/shared/identifiers/src/headers.rs):

Header	Typed ID	Meaning
`x-user-id`	`UserId`	Who made the request
`x-session-id`	`SessionId`	Cowork session this turn belongs to (`sess_<uuid>`)
`x-trace-id`	`TraceId`	Per-request correlation ID; ties inference to tool calls and MCP invocations
`x-client-id`	`ClientId`	Which client emitted the request — `sp_cowork` for a Cowork install, `sp_desktop` / `sp_cli` / `sp_web` otherwise
`x-tenant-id`	`TenantId`	Tenant the user belongs to
`x-call-source`	`SessionSource`	Channel that issued the call (`cowork`, `api`, `cli`, `web`, `oauth`, `mcp`)
`x-policy-version`	(plain header)	Hash of the policy bundle applied at JWT mint time; `unversioned` when no policy bundle has been published

The gateway validates these match the JWT claims before forwarding. The provider never sees them; they are stripped at the outbound boundary.

Endpoints

`POST /v1/messages`

Anthropic-compatible inference endpoint. Accepts the same request body Cowork and Claude Code emit, streams responses via SSE, and preserves anthropic-version, anthropic-beta, and tool-use headers end-to-end.

Auth: Authorization: Bearer <jwt> (minted by the helper chain). Unauthenticated requests return 401.

Routing is resolved via the AI service configuration (services/ai/) — by model, agent, department, region, or custom predicate. See AI Services for provider configuration.

`POST /v1/auth/cowork/pat`

Personal access token exchange. Body: empty. Header: Authorization: Bearer sp_pat_<...>. Returns {"token", "ttl", "headers"}.

Used by sp-cowork-auth's PatProvider. The PAT is long-lived and stored in the platform keystore (macOS Keychain, Windows Credential Manager, Linux Secret Service) or — as a fallback — in ~/.config/systemprompt/cowork-auth.toml.

`POST /v1/auth/cowork/session`

Team-tier exchange. Body: {}. Header: session cookie from a logged-in dashboard browser. Returns the same JSON shape. Short-lived JWT scoped to the user the session belongs to.

`POST /v1/auth/cowork/mtls`

Enterprise-tier exchange. Requires mTLS client certificate (device identity) and an SSO assertion in the body. Returns a JWT scoped to (UserId, SessionId, ClientId, TenantId). The device cert is provisioned via MDM; the SSO assertion comes from the OS-native identity agent (Kerberos, Okta Verify, Azure AD SSO).

`GET /v1/auth/cowork/capabilities`

Unauthenticated probe. Returns {"modes": ["pat", "session", "mtls"]} — the auth modes this gateway accepts. sp-cowork-auth calls this once at install time to validate the gateway URL.

The `sp-cowork-auth` Helper Binary

Cowork's third-party inference panel has a field named Credential helper script — an absolute path to an executable that prints the bearer token (or {"token", "headers"} JSON) to stdout. Cowork caches the result for the TTL, re-invokes on expiry. The spec is strict: stdout must be the credential and nothing else; any banner, prompt, or log line breaks it.

sp-cowork-auth is the universal implementation. One binary, three tiers of identity, selected by capability probe:

mTLS — if a device cert is in the platform keystore, it wins.
Session — if a dashboard session cookie is available, it's next.
PAT — if neither is present, fall back to the personal access token.

Install it once. Credentials are layered on as the deployment matures. Cowork sees the same stdout contract every time.

Install

# Universal installer (macOS / Linux / Windows)
curl -sSf https://systemprompt.io/cowork-auth/install.sh | sh

The installer downloads the signed binary for the current platform, places it at /usr/local/bin/sp-cowork-auth (macOS/Linux) or %ProgramFiles%\systemprompt\sp-cowork-auth.exe (Windows), writes a default config, and opens http://localhost:3000/cowork-auth/setup to mint a credential.

Config

# ~/.config/systemprompt/cowork-auth.toml
gateway_url = "http://localhost:3000"
cache_dir   = "~/Library/Caches/com.systemprompt.cowork-auth"

[pat]
keystore_service = "systemprompt-cowork-pat"

[session]
keystore_service = "systemprompt-cowork-session"

[mtls]
cert_keystore_ref = "systemprompt-cowork-device"
ca_bundle = "/etc/systemprompt/ca.pem"

All three sections are optional. Absent sections are skipped during the probe.

Point Cowork at it

In Cowork: Help → Troubleshooting → Enable Developer mode, then Developer → Configure third-party inference:

Gateway URL: your systemprompt.io base URL (e.g. http://localhost:3000)
Gateway auth scheme: bearer
Credential helper script: the absolute path printed by the installer
Leave static key fields blank. The helper overrides them anyway.
Toggle Skip login-mode chooser on — users should not see Anthropic's sign-in screen.

Provider Routing

The gateway forwards /v1/messages to one of seven providers based on routing rules in services/ai/. Anthropic direct, AWS Bedrock, Google Vertex AI, Azure AI Foundry, OpenAI, Google Gemini, and Groq are all supported.

Routing rules match on:

model — route by requested model name
agent — route per-agent (x-agent-name header)
user / tenant — per-principal routing
cost — cheapest provider for the required model family
failover — primary + secondary per provider group

See AI Services for the full rule schema.

MCP Allowlist for Cowork

Cowork's Extend-on-third-party-platforms model lets admins specify a remote MCP allowlist and tool policies (allow / ask / block). The gateway exposes two routes that feed that allowlist:

GET /v1/cowork/mcp/allowlist — returns the user's scoped MCP servers (from the MCP registry) as a JSON array Cowork merges into its allowlist.
GET /v1/cowork/plugins/manifest — returns signed plugin manifests for the user's entitled set, deposited into Cowork's org-plugins/ mount by the companion sync agent.

Both routes key off the same JWT the helper minted. See MCP Service for registry configuration and Claude Cowork Integration for the gateway + plugin + allowlist story in one place.

Audit Trail

Every /v1/messages request produces one audit row with:

The trace_id (also emitted to Cowork via the stream)
Full prompt and response content (encrypted at rest with the configured key)
Token counts and microdollar cost computed from the provider's pricing
Resolved UserId, SessionId, ClientId, TenantId
The MCP server ID and tool name for any follow-up tool calls, linked by the same trace_id

Forward the JSON stream to Splunk / ELK / Datadog / Sumo Logic via the analytics service, or query it directly with systemprompt analytics requests list.

CLI

# Validate the gateway is reachable and capable
systemprompt gateway capabilities

# Mint a PAT for the current user
systemprompt gateway pat create --name "cowork laptop"

# List PATs
systemprompt gateway pat list

# Revoke a PAT
systemprompt gateway pat revoke <id>

# Install the helper binary locally
systemprompt gateway cowork-auth install

Verified against a live instance

Tested 2026-04-22 against a local gateway at http://localhost:8080 with sp-cowork-auth (debug build, PAT provider). The capability probe returned {"modes":["pat"]}, POST /v1/auth/cowork/pat returned a valid JWT, and the helper's stdout contract held: 937 bytes of JSON, first byte {, empty stderr on the happy path. Cache persisted at ~/.cache/systemprompt-cowork-auth/cache.json (0600) and was served without a network round trip on subsequent invocations. Bad PAT and unreachable gateway both returned exit 5 with 0 stdout bytes and a single-line diagnostic on stderr.

Why It Matters

Architecture

Endpoints

POST /v1/messages

POST /v1/auth/cowork/pat

POST /v1/auth/cowork/session

POST /v1/auth/cowork/mtls

GET /v1/auth/cowork/capabilities

The sp-cowork-auth Helper Binary