Gateway Service
The self-hosted /v1/messages inference gateway and Cowork third-party platform integration. Routes requests across Anthropic, Bedrock, Vertex AI, Azure Foundry, OpenAI, Gemini, and Groq with identity propagation, signed audit trail, and the sp-cowork-auth credential helper.
On this page
TL;DR — systemprompt.io exposes a
/v1/messagesendpoint that is wire-compatible with Anthropic's API, plus a/v1/auth/cowork/*family for Cowork's credential helper script. Point Claude Cowork's third-party inference at it and every prompt, tool call, and cost line lands in your own database. Thesp-cowork-authbinary is the one-click companion that mints credentials for the helper script across personal, team, and enterprise deployments — same binary, three tiers of identity.
Why It Matters
Anthropic documents Claude Cowork running on customer infrastructure in three support articles (14680729, 14680741, 14680753). Inference can route through Bedrock, Vertex AI, Azure Foundry, or any LLM gateway exposing /v1/messages. The last option is what makes identity propagation, cross-provider routing, and audit lineage possible — but only if that gateway exists.
The gateway service is that endpoint. It runs inside the same systemprompt.io binary as the governance pipeline, the MCP registry, and the audit log — so every Cowork request inherits the same policy enforcement and the same trace_id lineage as everything else.
Architecture
Claude Cowork
│
│ 1. runs sp-cowork-auth (Credential helper script)
▼
sp-cowork-auth ── probes mTLS → session → PAT, picks the first available
│
│ 2. POST /v1/auth/cowork/{mtls|session|pat}
▼
systemprompt.io gateway
│
├── mints short-lived JWT (UserId, SessionId, TraceId, ClientId, TenantId)
├── returns {"token", "ttl", "headers"}
│
│ 3. Cowork issues /v1/messages with bearer JWT + merged headers
▼
/v1/messages middleware
│
├── governance pipeline (scope, secrets, blocklist, rate limit)
├── resolve routing rule → provider
├── forward to Anthropic | Bedrock | Vertex | Azure | OpenAI | Gemini | Groq
├── stream response back
└── write audit row (prompt, response, tokens, cost) keyed by trace_id
Every inference request carries the seven canonical headers core uses everywhere else (crates/shared/identifiers/src/headers.rs):
| Header | Typed ID | Meaning |
|---|---|---|
x-user-id |
UserId |
Who made the request |
x-session-id |
SessionId |
Cowork session this turn belongs to (sess_<uuid>) |
x-trace-id |
TraceId |
Per-request correlation ID; ties inference to tool calls and MCP invocations |
x-client-id |
ClientId |
Which client emitted the request — sp_cowork for a Cowork install, sp_desktop / sp_cli / sp_web otherwise |
x-tenant-id |
TenantId |
Tenant the user belongs to |
x-call-source |
SessionSource |
Channel that issued the call (cowork, api, cli, web, oauth, mcp) |
x-policy-version |
(plain header) | Hash of the policy bundle applied at JWT mint time; unversioned when no policy bundle has been published |
The gateway validates these match the JWT claims before forwarding. The provider never sees them; they are stripped at the outbound boundary.
Endpoints
POST /v1/messages
Anthropic-compatible inference endpoint. Accepts the same request body Cowork and Claude Code emit, streams responses via SSE, and preserves anthropic-version, anthropic-beta, and tool-use headers end-to-end.
Auth: Authorization: Bearer <jwt> (minted by the helper chain). Unauthenticated requests return 401.
Routing is resolved via the AI service configuration (services/ai/) — by model, agent, department, region, or custom predicate. See AI Services for provider configuration.
POST /v1/auth/cowork/pat
Personal access token exchange. Body: empty. Header: Authorization: Bearer sp_pat_<...>. Returns {"token", "ttl", "headers"}.
Used by sp-cowork-auth's PatProvider. The PAT is long-lived and stored in the platform keystore (macOS Keychain, Windows Credential Manager, Linux Secret Service) or — as a fallback — in ~/.config/systemprompt/cowork-auth.toml.
POST /v1/auth/cowork/session
Team-tier exchange. Body: {}. Header: session cookie from a logged-in dashboard browser. Returns the same JSON shape. Short-lived JWT scoped to the user the session belongs to.
POST /v1/auth/cowork/mtls
Enterprise-tier exchange. Requires mTLS client certificate (device identity) and an SSO assertion in the body. Returns a JWT scoped to (UserId, SessionId, ClientId, TenantId). The device cert is provisioned via MDM; the SSO assertion comes from the OS-native identity agent (Kerberos, Okta Verify, Azure AD SSO).
GET /v1/auth/cowork/capabilities
Unauthenticated probe. Returns {"modes": ["pat", "session", "mtls"]} — the auth modes this gateway accepts. sp-cowork-auth calls this once at install time to validate the gateway URL.
The sp-cowork-auth Helper Binary
Cowork's third-party inference panel has a field named Credential helper script — an absolute path to an executable that prints the bearer token (or {"token", "headers"} JSON) to stdout. Cowork caches the result for the TTL, re-invokes on expiry. The spec is strict: stdout must be the credential and nothing else; any banner, prompt, or log line breaks it.
sp-cowork-auth is the universal implementation. One binary, three tiers of identity, selected by capability probe:
- mTLS — if a device cert is in the platform keystore, it wins.
- Session — if a dashboard session cookie is available, it's next.
- PAT — if neither is present, fall back to the personal access token.
Install it once. Credentials are layered on as the deployment matures. Cowork sees the same stdout contract every time.
Install
# Universal installer (macOS / Linux / Windows)
curl -sSf https://systemprompt.io/cowork-auth/install.sh | sh
The installer downloads the signed binary for the current platform, places it at /usr/local/bin/sp-cowork-auth (macOS/Linux) or %ProgramFiles%\systemprompt\sp-cowork-auth.exe (Windows), writes a default config, and opens http://localhost:3000/cowork-auth/setup to mint a credential.
Config
# ~/.config/systemprompt/cowork-auth.toml
gateway_url = "http://localhost:3000"
cache_dir = "~/Library/Caches/com.systemprompt.cowork-auth"
[pat]
keystore_service = "systemprompt-cowork-pat"
[session]
keystore_service = "systemprompt-cowork-session"
[mtls]
cert_keystore_ref = "systemprompt-cowork-device"
ca_bundle = "/etc/systemprompt/ca.pem"
All three sections are optional. Absent sections are skipped during the probe.
Point Cowork at it
In Cowork: Help → Troubleshooting → Enable Developer mode, then Developer → Configure third-party inference:
- Gateway URL: your systemprompt.io base URL (e.g.
http://localhost:3000) - Gateway auth scheme:
bearer - Credential helper script: the absolute path printed by the installer
- Leave static key fields blank. The helper overrides them anyway.
- Toggle Skip login-mode chooser on — users should not see Anthropic's sign-in screen.
Provider Routing
The gateway forwards /v1/messages to one of seven providers based on routing rules in services/ai/. Anthropic direct, AWS Bedrock, Google Vertex AI, Azure AI Foundry, OpenAI, Google Gemini, and Groq are all supported.
Routing rules match on:
model— route by requested model nameagent— route per-agent (x-agent-nameheader)user/tenant— per-principal routingcost— cheapest provider for the required model familyfailover— primary + secondary per provider group
See AI Services for the full rule schema.
MCP Allowlist for Cowork
Cowork's Extend-on-third-party-platforms model lets admins specify a remote MCP allowlist and tool policies (allow / ask / block). The gateway exposes two routes that feed that allowlist:
GET /v1/cowork/mcp/allowlist— returns the user's scoped MCP servers (from the MCP registry) as a JSON array Cowork merges into its allowlist.GET /v1/cowork/plugins/manifest— returns signed plugin manifests for the user's entitled set, deposited into Cowork'sorg-plugins/mount by the companion sync agent.
Both routes key off the same JWT the helper minted. See MCP Service for registry configuration and Claude Cowork Integration for the gateway + plugin + allowlist story in one place.
Audit Trail
Every /v1/messages request produces one audit row with:
- The
trace_id(also emitted to Cowork via the stream) - Full prompt and response content (encrypted at rest with the configured key)
- Token counts and microdollar cost computed from the provider's pricing
- Resolved
UserId,SessionId,ClientId,TenantId - The MCP server ID and tool name for any follow-up tool calls, linked by the same
trace_id
Forward the JSON stream to Splunk / ELK / Datadog / Sumo Logic via the analytics service, or query it directly with systemprompt analytics requests list.
CLI
# Validate the gateway is reachable and capable
systemprompt gateway capabilities
# Mint a PAT for the current user
systemprompt gateway pat create --name "cowork laptop"
# List PATs
systemprompt gateway pat list
# Revoke a PAT
systemprompt gateway pat revoke <id>
# Install the helper binary locally
systemprompt gateway cowork-auth install
Verified against a live instance
Tested 2026-04-22 against a local gateway at http://localhost:8080 with sp-cowork-auth (debug build, PAT provider). The capability probe returned {"modes":["pat"]}, POST /v1/auth/cowork/pat returned a valid JWT, and the helper's stdout contract held: 937 bytes of JSON, first byte {, empty stderr on the happy path. Cache persisted at ~/.cache/systemprompt-cowork-auth/cache.json (0600) and was served without a network round trip on subsequent invocations. Bad PAT and unreachable gateway both returned exit 5 with 0 stdout bytes and a single-line diagnostic on stderr.