Claude Cowork Deployment
Platform-agnostic reference for deploying Claude Cowork against a self-hosted /v1/messages gateway. Covers the five-move architecture, three auth tiers (PAT, session, mTLS), the signed manifest, the audit schema, and routes to platform-specific install guides for macOS and Windows.
TL;DR - Claude Cowork reads five managed-preference keys on launch that let an organisation replace Anthropic's hosted inference, auth, and plugin catalogue with its own. This page is the platform-agnostic reference: architecture, auth tiers, manifest shape, audit schema. For the platform mechanics see macOS and Windows. For the strategic case a CTO/CISO makes for this deployment, see the Cowork on your own infrastructure guide.
Anthropic's three extension points
Anthropic documents Cowork running on customer infrastructure in three support articles:
- 14680729 - Use Claude Cowork with third-party platforms - what the contract is
- 14680741 - Install and configure Claude Cowork with third-party platforms - how a client attaches to a gateway
- 14680753 - Extend Claude Cowork with third-party platforms - plugin, skill, and MCP extension surfaces
Three extension points come out of those articles:
- /v1/messages gateway. When inferenceProvider = "gateway" is set in managed preferences, Cowork routes every chat turn, tool call, and extended-thinking block through inferenceGatewayBaseUrl. The URL must speak the Anthropic Messages wire format.
- Managed MCP allowlist. Cowork reads a JSON file under the org-plugins mount to decide which MCP servers a user may invoke. Servers not in the list are hidden from the UI and refused at runtime.
- org-plugins mount. A directory Cowork scans on launch. Plugins, skills, and agents inside it are visible in the slash menu and the agent runtime. End users have no write access.
The five-move deployment
Five moves take a fleet from zero to a Cowork deployment that runs entirely on infrastructure you operate. The first three run once per organisation; the last two run per device.
- Publish the gateway. Stand up the /v1/messages endpoint on a host the fleet can reach. See the Gateway Service reference for endpoint and routing details.
- Mint the credential. Pick the auth tier that matches your compliance posture - PAT, session cookie, or mTLS. The gateway's /v1/auth/cowork/capabilities endpoint advertises which tiers it accepts.
- Ship the signed manifest. Publish the plugins, skills, and managed MCP servers the fleet should receive, signed with an Ed25519 private key held by the gateway.
- Distribute the MDM profile. Push the five managed-preference keys to every device. Platform specifics: macOS, Windows.
- Verify in the audit trail. One SQL query against audit_events joining prompt, tool calls, and MCP execs by trace_id. Empty result set means the deployment is not finished.
The five managed-preference keys
Cowork reads the same five keys on every platform. The meaning is identical; the transport differs.
| Key | Values | Purpose |
|---|---|---|
| inferenceProvider | "gateway" | Switches inference from api.anthropic.com to the configured gateway |
| inferenceGatewayBaseUrl | "https://cowork-gateway.example.com" | Where /v1/messages calls go. HTTPS required unless host is 127.0.0.1 |
| inferenceCredentialHelper | absolute path | Path to the sp-cowork-auth / systemprompt-cowork binary Cowork invokes on every call |
| inferenceCredentialHelperTtlSec | integer seconds (e.g. 3600) | How long a minted JWT is cached on disk before the helper re-authenticates |
| inferenceGatewayAuthScheme | "bearer" | Auth scheme on outbound calls. bearer is the only value today |
Platform-specific syntax lives in the platform docs. Linux developer workstations read the same values from CLAUDE_* environment variables.
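The constraints in the table can be checked before a profile is pushed to the fleet. The following is a minimal Python sketch, not part of Cowork or the helper: validate_prefs is a hypothetical name, and the "positive" requirement on the TTL is an assumption (the table only says "integer seconds").

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse

REQUIRED_KEYS = (
    "inferenceProvider",
    "inferenceGatewayBaseUrl",
    "inferenceCredentialHelper",
    "inferenceCredentialHelperTtlSec",
    "inferenceGatewayAuthScheme",
)


def validate_prefs(prefs: dict) -> list[str]:
    """Return a list of problems; an empty list means the profile looks sane."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in prefs]
    if problems:
        return problems

    if prefs["inferenceProvider"] != "gateway":
        problems.append('inferenceProvider must be "gateway"')

    # HTTPS is required unless the host is 127.0.0.1.
    url = urlparse(prefs["inferenceGatewayBaseUrl"])
    loopback_http = url.scheme == "http" and url.hostname == "127.0.0.1"
    if url.scheme != "https" and not loopback_http:
        problems.append("inferenceGatewayBaseUrl must be HTTPS unless host is 127.0.0.1")

    # Accept an absolute path in either POSIX or Windows syntax.
    helper = prefs["inferenceCredentialHelper"]
    if not (PurePosixPath(helper).is_absolute() or PureWindowsPath(helper).is_absolute()):
        problems.append("inferenceCredentialHelper must be an absolute path")

    # Assumption: a zero or negative TTL is treated as a configuration error.
    ttl = prefs["inferenceCredentialHelperTtlSec"]
    if isinstance(ttl, bool) or not isinstance(ttl, int) or ttl <= 0:
        problems.append("inferenceCredentialHelperTtlSec must be a positive integer")

    if prefs["inferenceGatewayAuthScheme"] != "bearer":
        problems.append('inferenceGatewayAuthScheme must be "bearer"')

    return problems
```

Running a check like this in the pipeline that builds the MDM profile catches a misconfigured key before it reaches any device.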
The three auth tiers
sp-cowork-auth exposes three authentication providers. On every call they run in a fixed order - mTLS first, session cookie second, PAT third. The first provider that produces a valid token wins.
+------------------------------------------------+
|  sp-cowork-auth on each inference call         |
+------------------------------------------------+
                        |
            cache hit?  |  yes -> return cached line
                        |
                        v  no
+------------------------------------------------+
|  Tier 1: mTLS device certificate               |
|  POST /v1/auth/cowork/mtls                     |
+------------------------------------------------+
                        |  no cert / disabled
                        v
+------------------------------------------------+
|  Tier 2: Session cookie (browser OAuth)        |
|  POST /v1/auth/cowork/session                  |
+------------------------------------------------+
                        |  no session / 401
                        v
+------------------------------------------------+
|  Tier 3: Personal Access Token                 |
|  POST /v1/auth/cowork/pat                      |
+------------------------------------------------+
                        |
                        v
    Write {token, ttl, headers} JSON to stdout
PAT is the simplest tier. An administrator mints a token at the gateway, the user runs sp-cowork-auth login sp_pat_... once, and the token lives in the OS keystore (macOS Keychain, Windows Credential Manager, Linux Secret Service). Revoke a PAT at the gateway and the next helper call that reaches the gateway returns 401; because the helper caches the minted JWT, revocation takes effect within one TTL window.
Session cookie is right when you already have a browser-based SSO flow and want Cowork enrolment to ride on it. The helper opens {gateway}/cowork/device-link?redirect=http://127.0.0.1:<ephemeral>, the user authenticates against the IdP, the browser returns the code to the loopback server, and a session cookie lands in the OS keystore.
mTLS is the right tier when the device itself is the identity. The helper loads a client certificate from an OS keystore reference, presents it to /v1/auth/cowork/mtls, and receives a JWT carrying the certificate's SHA-256 fingerprint as a claim. The cert is typically provisioned by MDM and backed by TPM (Windows), Secure Enclave (macOS), or a PKCS#11 token (Linux).
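The fixed-order fallback can be sketched in a few lines of Python. This is an illustration of the control flow only, not the helper's implementation: resolve_token, the provider callables, and the in-memory cache dictionary are all hypothetical stand-ins (the real helper caches on disk and talks to the gateway endpoints shown in the diagram).

```python
import time
from typing import Callable, Optional

# A provider returns a token string, or None when it cannot authenticate
# (no certificate, no session, no PAT). Hypothetical stand-ins for the real tiers.
Provider = Callable[[], Optional[str]]


def resolve_token(
    providers: list[tuple[str, Provider]],
    cache: dict,
    ttl_sec: int,
    now: Callable[[], float] = time.time,
) -> tuple[str, str]:
    """Return (tier, token): the cached pair if still fresh, else the first tier that succeeds."""
    if cache and cache.get("expires_at", 0) > now():
        return cache["tier"], cache["token"]

    # Providers are tried strictly in list order; the first non-None token wins.
    for tier, provider in providers:
        token = provider()
        if token is not None:
            cache.update(tier=tier, token=token, expires_at=now() + ttl_sec)
            return tier, token

    raise RuntimeError("no auth tier produced a token")
```

Passing the tiers as an ordered list keeps the precedence (mTLS, then session, then PAT) visible at the call site rather than buried in branching logic.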
The seven canonical headers
Every /v1/messages call that leaves Cowork carries seven headers the gateway validates against the JWT and stamps into audit_events:
| Header | Typed ID | Meaning |
|---|---|---|
| x-user-id | UserId | The human on the other side of the keyboard |
| x-session-id | SessionId | Cowork chat session - stable across tool calls within one conversation |
| x-trace-id | TraceId | This one inference call. Every tool call and MCP exec inherits it |
| x-client-id | ClientId | sp_cowork for Cowork; sp_cli / sp_desktop / sp_web elsewhere |
| x-tenant-id | TenantId | Tenant the user belongs to |
| x-policy-version | (plain) | Hash of the policy bundle in force at JWT mint time |
| x-call-source | SessionSource | Module that issued the call (cowork, subagent, job) |
The header constants live in systemprompt_identifiers::headers. The provider never sees them; the gateway strips them at the outbound boundary.
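A gateway-side view of those two steps (validate against the JWT, then strip before forwarding) might look like the Python sketch below. The real constants live in the Rust crate named above; the function names and, in particular, the header-to-claim mapping are assumptions made for illustration, since this page does not list the JWT claim names.

```python
CANONICAL_HEADERS = (
    "x-user-id",
    "x-session-id",
    "x-trace-id",
    "x-client-id",
    "x-tenant-id",
    "x-policy-version",
    "x-call-source",
)

# Assumption: hypothetical claim names, used only to show the comparison.
HEADER_TO_CLAIM = {
    "x-user-id": "sub",
    "x-tenant-id": "tenant_id",
    "x-policy-version": "policy_version",
}


def validate_headers(headers: dict, claims: dict) -> list[str]:
    """Check that all seven headers are present and agree with the verified JWT claims."""
    lowered = {name.lower(): value for name, value in headers.items()}
    errors = [f"missing {name}" for name in CANONICAL_HEADERS if name not in lowered]
    for header, claim in HEADER_TO_CLAIM.items():
        if header in lowered and lowered[header] != claims.get(claim):
            errors.append(f"{header} does not match JWT claim {claim}")
    return errors


def strip_for_upstream(headers: dict) -> dict:
    """Drop the seven governance headers so the upstream provider never sees them."""
    return {name: value for name, value in headers.items()
            if name.lower() not in CANONICAL_HEADERS}
```

Header names are compared case-insensitively because HTTP header names carry no case significance.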
The stdout contract
inferenceCredentialHelper is strict: on every run the binary prints exactly one JSON line to stdout. Anything else - a banner, a log preamble, two lines - breaks Cowork's parser and surfaces as "credential helper failed" in the UI. Diagnostics go to stderr. The example below is wrapped across several lines for readability only; the helper emits it as a single line.
{
  "token": "eyJhbGciOi...",
  "ttl": 3600,
  "headers": {
    "x-user-id": "u_29f8a3",
    "x-session-id": "s_01hy...",
    "x-trace-id": "t_a8bf...",
    "x-client-id": "sp_cowork",
    "x-tenant-id": "org_acme",
    "x-policy-version": "2026-04-22",
    "x-call-source": "cowork"
  }
}
The helper's output is cached at $XDG_CACHE_HOME/systemprompt/systemprompt-cowork.json (mode 0600). To invalidate the cache, delete the file.
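A strict reader for this contract is short. The Python sketch below shows one way a wrapper script or test harness could enforce it; it is not Cowork's parser, and parse_helper_stdout and HelperOutputError are hypothetical names. Treating a non-positive ttl as an error is an assumption.

```python
import json


class HelperOutputError(ValueError):
    """Raised when helper stdout violates the one-JSON-line contract."""


def parse_helper_stdout(stdout: str) -> dict:
    """Accept exactly one line of JSON with token, ttl, and headers; reject anything else."""
    lines = stdout.splitlines()
    if len(lines) != 1:
        raise HelperOutputError(f"expected exactly one line, got {len(lines)}")
    try:
        doc = json.loads(lines[0])
    except json.JSONDecodeError as exc:
        raise HelperOutputError(f"not valid JSON: {exc}") from exc
    if not isinstance(doc, dict):
        raise HelperOutputError("top-level value must be a JSON object")

    token = doc.get("token")
    if not isinstance(token, str) or not token:
        raise HelperOutputError("token must be a non-empty string")

    # Assumption: ttl must be a positive integer number of seconds.
    ttl = doc.get("ttl")
    if isinstance(ttl, bool) or not isinstance(ttl, int) or ttl <= 0:
        raise HelperOutputError("ttl must be a positive integer")

    headers = doc.get("headers")
    if not isinstance(headers, dict) or not all(isinstance(v, str) for v in headers.values()):
        raise HelperOutputError("headers must be an object of string values")
    return doc
```

A check like this is useful in CI for the helper itself: any stray banner or log line on stdout fails the test long before it fails in a user's UI.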
The signed manifest
The manifest is a JSON document served by the gateway at /v1/cowork/manifest and signed with an Ed25519 private key per RFC 8032. Every client verifies the signature against a pubkey pinned at install time.
{
  "version": "2026-04-22T09:30:00Z",
  "user": {
    "id": "u_29f8a3",
    "roles": ["engineering", "senior"]
  },
  "plugins": [
    {
      "id": "devops-plugin",
      "version": "1.4.2",
      "files": [
        {"path": "plugin.toml", "sha256": "..."},
        {"path": "handlers/deploy.sh", "sha256": "..."}
      ]
    }
  ],
  "skills": [
    {
      "name": "review-terraform-plan",
      "description": "Audit a Terraform plan for destructive changes",
      "file_path": ".systemprompt-cowork/skills/review-terraform-plan.md",
      "instructions": "..."
    }
  ],
  "agents": [
    {
      "id": "pr-reviewer",
      "endpoint": "/v1/messages",
      "model": "claude-sonnet-4-6",
      "skills": ["review-terraform-plan"],
      "mcp_servers": ["gh-readonly"],
      "enabled": true
    }
  ],
  "managed_mcp_servers": [
    {
      "name": "gh-readonly",
      "url": "https://mcp-internal.example.com/gh-readonly",
      "tool_policy": {"allow": ["search_code", "read_file"]}
    }
  ],
  "revocations": [
    {"kind": "skill", "name": "leaked-api-probe"}
  ],
  "signature": {
    "alg": "ed25519",
    "sig": "base64(ed25519(canonical_json(manifest_body)))"
  }
}
Five details carry the governance story:
- Per-user. The user block is resolved server-side from the JWT; two engineers in the same tenant can receive different manifests.
- managed_mcp_servers is the allowlist. Servers not in the list are refused at runtime.
- revocations is the kill switch. The binary removes revoked files atomically on the next sync.
- Skills and plugins are separate. Plugins execute; skills are text context. Agents tie them together.
- Signature covers canonical JSON. Pubkey is pinned at install --gateway time; mismatched signature aborts the sync before touching the filesystem.
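Three of these details translate directly into small lookups over the manifest. The Python sketch below is illustrative only and all function names are hypothetical. Two points are assumptions: a server entry without a tool_policy is treated as allowing nothing, and the canonical form (sorted keys, compact separators, UTF-8, signature block excluded) is a guess at what canonical_json means, since this page does not define it.

```python
import json


def is_tool_allowed(manifest: dict, server: str, tool: str) -> bool:
    """A tool call passes only if its server is listed and the tool is in that server's allow list."""
    for entry in manifest.get("managed_mcp_servers", []):
        if entry["name"] == server:
            # Assumption: a missing tool_policy means nothing is allowed.
            return tool in entry.get("tool_policy", {}).get("allow", [])
    return False  # server not in the allowlist: refuse


def is_revoked(manifest: dict, kind: str, name: str) -> bool:
    """True when the manifest's revocations list names this item."""
    return any(
        item.get("kind") == kind and item.get("name") == name
        for item in manifest.get("revocations", [])
    )


def canonical_body(manifest: dict) -> bytes:
    """Serialise everything except the signature block in a deterministic form.

    Assumption: the gateway's real canonicalisation rules may differ.
    """
    body = {key: value for key, value in manifest.items() if key != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")
```

Whatever the exact canonical form is, signer and verifier must produce byte-identical output for the same manifest, which is why key order and whitespace are pinned down.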
The sync flow
- Fetch /v1/cowork/manifest with the cached JWT. A 401/403/404 short-circuits the sync; the existing mount is untouched.
- Verify the Ed25519 signature against the pinned pubkey. Verification failure aborts before any filesystem write.
- Stage every file referenced in plugins[].files[] under org-plugins/.staging/. Each file is SHA-256-hashed and compared to the manifest. Mismatch aborts.
- Rename atomically into place (MoveFileEx(MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH) on Windows; rename(2) on Unix).
- Write .systemprompt-cowork/managed-mcp.json and .systemprompt-cowork/last-sync.json.
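The verify, stage, hash, and rename steps can be sketched in Python as follows. This is not the sync agent's code: sync_plugin_files is a hypothetical name, fetch and verify_signature are caller-supplied stand-ins (Ed25519 verification needs a cryptography library outside the standard library), and the on-disk layout of one directory per plugin id is an assumption.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable


def sync_plugin_files(
    manifest: dict,
    mount: Path,
    fetch: Callable[[str, str], bytes],
    verify_signature: Callable[[dict], bool],
) -> list[Path]:
    """Verify, stage, hash-check, then move plugin files into the mount."""
    # Signature check comes first: nothing is written if it fails.
    if not verify_signature(manifest):
        raise PermissionError("manifest signature verification failed")

    staging = mount / ".staging"
    staged: list[tuple[Path, Path]] = []
    for plugin in manifest.get("plugins", []):
        for entry in plugin["files"]:
            data = fetch(plugin["id"], entry["path"])
            if hashlib.sha256(data).hexdigest() != entry["sha256"]:
                # Abort before any live file has been replaced.
                raise ValueError(f"hash mismatch: {plugin['id']}/{entry['path']}")
            tmp = staging / plugin["id"] / entry["path"]
            tmp.parent.mkdir(parents=True, exist_ok=True)
            tmp.write_bytes(data)
            # Assumption: live layout is <mount>/<plugin id>/<path>.
            staged.append((tmp, mount / plugin["id"] / entry["path"]))

    installed = []
    for tmp, final in staged:
        final.parent.mkdir(parents=True, exist_ok=True)
        os.replace(tmp, final)  # atomic when source and target share a filesystem
        installed.append(final)
    return installed
```

Staging inside the mount keeps the temporary files on the same filesystem as their destinations, which is what makes the final rename atomic. A production version would also reject manifest paths that escape the mount (for example, paths containing ..).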
Sign the manifest from a build pipeline that retains every version, and keep the Ed25519 signing key in an HSM, YubiHSM, or cloud KMS - never on an admin laptop.
The audit schema
Every pass through the governance pipeline writes one row to audit_events before the response returns to the caller, so there is no path where a successful call is not audited.
CREATE TABLE audit_events (
    id           BIGSERIAL PRIMARY KEY,
    occurred_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    kind         TEXT NOT NULL,      -- 'inference' | 'tool_call' | 'mcp_exec' | 'cost'
    user_id      TEXT NOT NULL,      -- from x-user-id
    session_id   TEXT NOT NULL,      -- from x-session-id
    trace_id     TEXT NOT NULL,      -- from x-trace-id
    client_id    TEXT NOT NULL,      -- from x-client-id
    tenant_id    TEXT NOT NULL,      -- from x-tenant-id
    policy_ver   TEXT NOT NULL,      -- from x-policy-version
    call_source  TEXT NOT NULL,      -- from x-call-source
    model        TEXT,
    provider     TEXT,               -- 'anthropic', 'openai', 'bedrock', ...
    tokens_in    INTEGER,
    tokens_out   INTEGER,
    cost_micro   BIGINT,             -- microdollars
    latency_ms   INTEGER,
    outcome      TEXT NOT NULL,      -- 'allowed' | 'denied' | 'error'
    payload      JSONB NOT NULL
);
CREATE INDEX ON audit_events (tenant_id, occurred_at DESC);
CREATE INDEX ON audit_events (trace_id);
CREATE INDEX ON audit_events (user_id, occurred_at DESC);
The one-shot lineage query
Every event triggered by one user prompt shares a trace_id, and all event kinds live in the same table, so no join is needed: filtering audit_events on a single trace_id returns the full lineage:
SELECT
    e.occurred_at, e.kind, e.outcome, e.model, e.provider,
    e.tokens_in, e.tokens_out, e.cost_micro,
    e.payload ->> 'tool'   AS tool_name,
    e.payload ->> 'server' AS mcp_server
FROM audit_events e
WHERE e.tenant_id = 'org_acme'
  AND e.user_id   = 'u_29f8a3'
  AND e.trace_id  = 't_a8bf...'
ORDER BY e.occurred_at ASC;
Returns a chronological list: one inference row, zero or more tool_call rows, zero or more mcp_exec rows, and a final cost row. Every row is tied back to the JWT-verified user, tenant, and policy version.
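The expected shape of that result can be asserted automatically, which turns the "verify in the audit trail" move into a repeatable check. The Python sketch below assumes the rows have already been fetched in occurred_at order and converted to dictionaries; check_lineage is a hypothetical helper, not part of the product.

```python
KNOWN_KINDS = {"inference", "tool_call", "mcp_exec", "cost"}


def check_lineage(rows: list[dict]) -> list[str]:
    """Return problems with one trace's rows; an empty list means the shape is as expected."""
    if not rows:
        return ["no audit rows for this trace_id: the deployment is not finished"]

    problems = []
    kinds = [row["kind"] for row in rows]

    if len({row["trace_id"] for row in rows}) != 1:
        problems.append("rows span more than one trace_id")
    unknown = sorted(set(kinds) - KNOWN_KINDS)
    if unknown:
        problems.append(f"unknown kinds: {unknown}")
    if kinds.count("inference") != 1:
        problems.append("expected exactly one inference row")
    if kinds[-1] != "cost":
        problems.append("last row is not a cost row")
    return problems
```

An empty row list is reported as a failure rather than silently passing, matching the rule that an empty result set means the deployment is not finished.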
SIEM export
Every row is also emitted as a JSON event on a separate topic for Splunk / ELK / Datadog / Sumo Logic ingestion. The shape mirrors the table row.
Cost attribution
cost_micro is microdollars. SUM(cost_micro) GROUP BY user_id is per-user cost. GROUP BY (tenant_id, provider) compares spend across upstreams - "what did we spend on Bedrock vs direct Anthropic in April 2026" is a one-line query.
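Both aggregations can be tried out locally. The Python sketch below uses an in-memory SQLite table as a stand-in for the production audit_events table, with a reduced column set and made-up sample rows; only the two GROUP BY queries mirror the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, illustrative schema: SQLite stands in for the production database.
conn.execute(
    "CREATE TABLE audit_events ("
    "user_id TEXT, tenant_id TEXT, provider TEXT, cost_micro INTEGER, occurred_at TEXT)"
)
conn.executemany(
    "INSERT INTO audit_events VALUES (?, ?, ?, ?, ?)",
    [
        ("u_1", "org_acme", "anthropic", 1_200_000, "2026-04-03"),
        ("u_1", "org_acme", "bedrock", 300_000, "2026-04-10"),
        ("u_2", "org_acme", "bedrock", 500_000, "2026-04-12"),
        ("u_2", "org_acme", "bedrock", 900_000, "2026-05-01"),  # outside April
    ],
)

# Per-user cost in microdollars.
per_user = conn.execute(
    "SELECT user_id, SUM(cost_micro) FROM audit_events "
    "GROUP BY user_id ORDER BY user_id"
).fetchall()

# Spend per upstream for April 2026, using a half-open date range.
april_by_provider = conn.execute(
    "SELECT tenant_id, provider, SUM(cost_micro) FROM audit_events "
    "WHERE occurred_at >= '2026-04-01' AND occurred_at < '2026-05-01' "
    "GROUP BY tenant_id, provider ORDER BY provider"
).fetchall()

for tenant, provider, micro in april_by_provider:
    print(f"{tenant} {provider}: ${micro / 1_000_000:.2f}")
```

Keeping sums in integer microdollars and converting to dollars only for display avoids floating-point drift in the aggregation itself.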
Provider routing
The gateway's /v1/messages router forwards to one of the built-in provider tags (anthropic, openai, moonshot, qwen, minimax, gemini, bedrock, vertex, azure, groq) or a custom tag registered via the GatewayUpstream trait and the inventory crate. Routes match on model_pattern, agent, user/tenant, cost, or failover. See the AI Services reference for the full rule schema and the Gateway Service reference for the route config block.
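To make the matching idea concrete, here is a deliberately simplified Python sketch covering only two of the listed criteria: model_pattern and tenant, plus failover across an ordered upstream list. The rule shape, the glob semantics for model_pattern, and the pick_upstream function are all assumptions for illustration; the actual rule schema is the one in the AI Services reference.

```python
from fnmatch import fnmatchcase

# Hypothetical rule shape, first match wins. Not the gateway's real schema.
ROUTES = [
    {"model_pattern": "claude-*", "tenant": "org_acme", "upstreams": ["bedrock", "anthropic"]},
    {"model_pattern": "claude-*", "upstreams": ["anthropic"]},
    {"model_pattern": "*", "upstreams": ["openai"]},
]


def pick_upstream(model: str, tenant: str, healthy: set[str], routes: list[dict] = ROUTES) -> str:
    """Return the first healthy upstream tag from the first matching route."""
    for route in routes:
        # Assumption: model_pattern is a case-sensitive glob.
        if not fnmatchcase(model, route["model_pattern"]):
            continue
        if "tenant" in route and route["tenant"] != tenant:
            continue
        # Failover: walk the route's upstreams in order of preference.
        for tag in route["upstreams"]:
            if tag in healthy:
                return tag
        # No healthy upstream in this route: fall through to the next matching route.
    raise LookupError(f"no healthy upstream for model {model!r}")
```

Ordering rules from most to least specific keeps tenant-level overrides ahead of the catch-all route.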
Air-gapped deployment
Nothing in the runtime flow requires outbound traffic to anthropic.com, a telemetry endpoint, or a licence server. The only network calls the binary makes are the ones explicitly pointed at your gateway.
An air-gapped deployment shifts three responsibilities inward:
- The gateway runs inside the egress boundary.
- The upstream provider sits on the same side - self-hosted vLLM / Ollama / sglang, a private Bedrock VPC endpoint, or Azure OpenAI over private link.
- The Ed25519 signing key and its HSM never leave.
For pure-offline deployments where clients have no network path to any gateway, the sync agent supports pre-seeded mode: the admin produces a signed manifest and plugin payload on a build machine, copies to a read-only share, and points the binary at a file:// URL. Signature verification still runs; there is no trust shortcut for local files.
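The "no trust shortcut" rule means a loader treats file:// and https:// sources identically. A minimal Python sketch of that idea follows; load_manifest is a hypothetical function and verify_signature is a caller-supplied stand-in for the real Ed25519 check against the pinned pubkey.

```python
import json
from typing import Callable
from urllib.parse import urlparse
from urllib.request import urlopen


def load_manifest(url: str, verify_signature: Callable[[dict], bool]) -> dict:
    """Load a manifest from an https:// or file:// URL and verify it either way."""
    if urlparse(url).scheme not in ("https", "file"):
        raise ValueError(f"unsupported manifest URL scheme: {url}")
    with urlopen(url) as response:
        manifest = json.load(response)
    # Local files get exactly the same check as network responses.
    if not verify_signature(manifest):
        raise PermissionError("manifest signature verification failed")
    return manifest
```

Because the source of the bytes never changes the verification path, a tampered file on the read-only share fails in the same way as a tampered network response.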
Migration from direct API usage
Teams running Cowork against api.anthropic.com with individual OAuth need a soft cutover in four evidence-gated phases:
- Shadow mode. The gateway accepts requests and writes to audit_events; upstream of record stays Anthropic. Advance when shadow audit matches production call shape.
- Pilot cohort. MDM profile to a small volunteer group. Advance when the cohort reports no regressions and outcome = 'allowed' distribution looks right.
- Ring rollout. Push department by department. Spike in outcome = 'error' rows in any ring blocks the next ring until triaged.
- Retire direct access. Revoke individual OAuth tokens at the Anthropic side. Gateway is the only path.
The gate between phases is evidence, not calendar.
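An evidence gate of this kind can be automated from the outcome column. The Python sketch below shows one possible rule for the ring rollout; ring_may_advance, the baseline rate, and the spike factor of 2.0 are illustrative assumptions, not thresholds taken from the product.

```python
def ring_may_advance(
    outcomes: list[str],
    baseline_error_rate: float,
    spike_factor: float = 2.0,
) -> bool:
    """Allow the next ring only if this ring's error rate has not spiked above baseline."""
    if not outcomes:
        return False  # no audit rows yet means no evidence, so do not advance
    error_rate = outcomes.count("error") / len(outcomes)
    # Assumption: a "spike" is an error rate above spike_factor times the baseline.
    return error_rate <= baseline_error_rate * spike_factor
```

For example, with a 2% baseline, a ring showing 2 errors in 100 calls may advance, while a ring showing 10 errors in 100 calls is held until triaged.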
CLI
# Probe the gateway - what auth tiers does it accept?
systemprompt gateway capabilities
# Mint a PAT for the current user
systemprompt gateway pat create --name "cowork laptop"
# List PATs
systemprompt gateway pat list
# Revoke a PAT (takes effect on next helper call, i.e. within one TTL window)
systemprompt gateway pat revoke <id>
# Install the helper locally
systemprompt gateway cowork-auth install
Where to go next
- Install on Mac: Cowork on macOS
- Install on Windows: Cowork on Windows
- Endpoint reference: Gateway Service
- Strategic case for this deployment: Cowork on your own infrastructure