
SECRETS MANAGEMENT. LAST MILE DELIVERY, NEVER IN THE PROMPT.

Provider credentials reach the MCP subprocess as environment variables at spawn. ANTHROPIC_API_KEY, OPENAI_API_KEY, database URLs and OAuth tokens never enter a prompt, a completion, a tool argument, or an audit row.

Last Mile Secrets Delivery

In logistics, the last mile is the leg where a package actually reaches the customer, the failure-prone final delivery. For an AI agent, the last mile is the tool call. The credential has to reach the downstream API without ever entering a prompt, a completion, a tool argument, or an audit row. That is what this section handles for a Claude agent.

When a Claude agent calls a tool, the system launches the tool as a separate subprocess and hands the provider credentials (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, GITHUB_TOKEN) to it as environment variables. The credential lives inside that subprocess, outside the model's view. The agent names the tool, the tool returns a result, the key never appears in between.

Custom credentials travel the same path. User-supplied secrets are handed to the subprocess as environment variables. An explicit allowlist named SYSTEMPROMPT_CUSTOM_SECRETS tells the subprocess which variables it is authorised to read. The tool execution log records tool name, server name, input arguments, status, and an execution id, and the credential is deliberately absent from every column.

  • Subprocess-Bound Credentials — Provider keys live inside the tool subprocess, not in the prompt the model builds from. The model receives the tool result, never the key used to authenticate the call.
  • Custom Secrets Without a Rebuild — User-supplied credentials reach the subprocess as environment variables under a published allowlist. New credentials ship as config. No host binary rebuild, no engineering release in the critical path.
  • Audit Without Disclosure — The tool execution log records tool name, server name, input arguments, status, and execution id. The credential is deliberately outside the schema, so the audit table that proves which user called which tool cannot also leak the key they used.
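The delivery path described above can be sketched with the Rust standard library alone. This is a minimal illustration under stated assumptions, not the product's implementation: the comma-separated format for SYSTEMPROMPT_CUSTOM_SECRETS, the host-side filtering, and the names `parse_allowlist`, `subprocess_env`, `spawn_tool`, and `ToolExecution` are all hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::process::{Child, Command, Stdio};

/// Provider variables named in the text above.
const PROVIDER_KEYS: [&str; 4] =
    ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY", "GITHUB_TOKEN"];

/// Assumption: the allowlist is a comma-separated list of variable names.
fn parse_allowlist(raw: &str) -> BTreeSet<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect()
}

/// Keep only provider keys and allowlisted custom secrets.
fn subprocess_env(
    secrets: &BTreeMap<String, String>,
    allowlist: &BTreeSet<String>,
) -> BTreeMap<String, String> {
    secrets
        .iter()
        .filter(|(name, _)| {
            PROVIDER_KEYS.iter().any(|k| *k == name.as_str())
                || allowlist.contains(name.as_str())
        })
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect()
}

/// Spawn the tool with an environment that holds only the filtered variables.
/// The inherited environment (including PATH) is cleared, so an absolute
/// program path is the safest choice.
fn spawn_tool(program: &str, env: &BTreeMap<String, String>) -> std::io::Result<Child> {
    Command::new(program)
        .env_clear()
        .envs(env)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
}

/// Audit row with the columns the text lists. No field can hold a credential.
#[derive(Debug)]
struct ToolExecution {
    execution_id: String,
    tool_name: String,
    server_name: String,
    input_arguments: String,
    status: String,
}
```

Because the credential only ever exists in the map passed to `spawn_tool`, nothing that builds a prompt or writes a `ToolExecution` row has a value to leak.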

HS256 JWT Sessions

A stolen session cookie should not become a long-lived API key. Every systemprompt.io session is a JWT the server verifies locally in under a microsecond, without a lookup to an identity provider. The token is the session, not a pointer to a session stored somewhere else.

Every token is HS256-signed with a profile-scoped secret the deployment owns. Startup refuses to boot with a signing secret shorter than the configured minimum, so a misconfigured profile fails loud instead of silently issuing weak tokens. The claims payload carries the user's scope, roles, session_id, rate-limit tier, audience, issuer, and expiry, plus a unique jti per session. Handlers read typed fields directly. Nothing re-parses the raw token, and nothing queries a user table on every request.
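A boot-time guard of this kind might look like the sketch below. The 32-byte default is an assumption, chosen because RFC 7518 requires an HS256 key at least as long as the 256-bit hash output; the text only says the minimum is configurable. The type names and claim field types are hypothetical.

```rust
/// Assumed default minimum; 32 bytes matches the HS256 hash output size.
pub const DEFAULT_MIN_SECRET_BYTES: usize = 32;

#[derive(Debug, PartialEq)]
pub enum BootError {
    WeakSigningSecret { got: usize, min: usize },
}

/// Refuse to start when the signing secret is shorter than the minimum.
pub fn validate_signing_secret(secret: &[u8], min: usize) -> Result<(), BootError> {
    if secret.len() < min {
        Err(BootError::WeakSigningSecret { got: secret.len(), min })
    } else {
        Ok(())
    }
}

/// Typed claims payload with the fields the text lists.
/// Field types are assumptions made for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Claims {
    pub scope: String,
    pub roles: Vec<String>,
    pub session_id: String,
    pub rate_limit_tier: String,
    pub aud: String,
    pub iss: String,
    pub exp: u64, // expiry as seconds since the Unix epoch
    pub jti: String,
}
```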

Verification has three explicit modes. Required is the production default: extract the bearer, decode the HS256 claims, validate issuer and audience, check expiry against wall-clock, return 401 on any failure. Optional falls back to an anonymous request context so public endpoints stay reachable. Disabled is reserved for tests. A network partition between the API and an external auth service cannot cause spurious logouts, because the server is the auth service.

  • Local Verification — HS256 signing lets the server validate tokens against an in-memory secret. No round-trip to an identity provider, no external network hop on the request path.
  • Three Modes, One Validator — Required is fail-fast. Optional falls back to anonymous. Disabled is test only. The same validator handles all three via a mode flag, so test infrastructure never diverges from production.
  • Typed Claims, No DB Round-Trip — Every token carries scope, user_type, roles, session_id, and rate_limit_tier as typed fields. Handlers read claims directly rather than re-querying user and role tables on every request.
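The three modes can be sketched as one validator with a mode flag. To keep the example free of external crates, HS256 signature verification is stubbed behind a `decode` function pointer, which is an assumption of this sketch. Two further assumptions: Optional falls back to anonymous on any failure, and Disabled returns an anonymous context without checking anything. All type names are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Claims {
    pub iss: String,
    pub aud: String,
    pub exp: u64, // seconds since the Unix epoch
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Required,
    Optional,
    Disabled,
}

#[derive(Debug, PartialEq)]
pub enum RequestContext {
    Authenticated(Claims),
    Anonymous,
}

/// Each variant would map to a 401 response in Required mode.
#[derive(Debug, PartialEq)]
pub enum AuthError {
    MissingBearer,
    InvalidToken,
    WrongIssuer,
    WrongAudience,
    Expired,
}

pub struct Validator {
    pub mode: Mode,
    pub issuer: String,
    pub audience: String,
    /// Stand-in for HS256 verification: returns claims only for a valid token.
    pub decode: fn(&str) -> Option<Claims>,
}

impl Validator {
    /// Extract the bearer, decode, then check issuer, audience, and expiry.
    fn strict(&self, authorization: Option<&str>, now: u64) -> Result<Claims, AuthError> {
        let token = authorization
            .and_then(|header| header.strip_prefix("Bearer "))
            .ok_or(AuthError::MissingBearer)?;
        let claims = (self.decode)(token).ok_or(AuthError::InvalidToken)?;
        if claims.iss != self.issuer {
            return Err(AuthError::WrongIssuer);
        }
        if claims.aud != self.audience {
            return Err(AuthError::WrongAudience);
        }
        if claims.exp <= now {
            return Err(AuthError::Expired);
        }
        Ok(claims)
    }

    /// One entry point for all three modes.
    pub fn authenticate(
        &self,
        authorization: Option<&str>,
        now: u64,
    ) -> Result<RequestContext, AuthError> {
        match self.mode {
            Mode::Required => self.strict(authorization, now).map(RequestContext::Authenticated),
            Mode::Optional => Ok(self
                .strict(authorization, now)
                .map(RequestContext::Authenticated)
                .unwrap_or(RequestContext::Anonymous)),
            Mode::Disabled => Ok(RequestContext::Anonymous),
        }
    }
}
```

Passing `now` in as an argument keeps the expiry check deterministic in tests; a production caller would supply the wall-clock time.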

Scanner Detection at the Edge

Credential theft almost always starts with a probe. Mass scanners sweep the internet looking for exposed .env files, leftover .git directories, phpMyAdmin installs, and known webshell paths. Finding one is the attacker's first move. The scanner detector rejects those requests inline, before any application handler receives them, turning the outer edge of your deployment into a triage layer.

The detector inspects three signals in order. A requested path is matched against tables of scanner extensions and known-bad paths, covering .env, .git, phpMyAdmin, and the webshell endpoints mass scanners walk for. The user-agent is matched against a list of named scanner tools (masscan, nmap, nikto, sqlmap, and more) plus length thresholds that block empty or obviously forged agents. Request rate is normalised to requests-per-minute and compared against a burst ceiling sized to catch runaway scripts without throttling normal traffic.

All three checks run inline on the request path, not as post-hoc log analysis. A positive signal on any check rejects the request before routing. The detector emits a structured denial event with the path, user-agent, velocity, and which check fired, so a SIEM can tune thresholds from real traffic rather than guesswork.

  • Path Table at the Door — Scanner extensions and known-bad paths are matched against a fixed table before any handler runs. Application handlers never see the garbage traffic.
  • User-Agent Triage — Named scanner tools, overly short agents, and stale browser versions are all rejected. Generic HTTP clients are gated on per-client length thresholds. A new scanner family is a config update, not a code change.
  • Velocity as a Signal — Request count over duration is normalised to requests-per-minute and compared against a burst ceiling. A script hammering the endpoint trips the limit regardless of path or agent, so a probe that slips past the path and user-agent tables is still caught on volume.
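The three checks, run in order with the first positive signal rejecting the request, can be sketched as follows. The table contents, the 10-character agent minimum, and the 120 requests-per-minute ceiling are illustrative assumptions, as are the names `Detector`, `Denial`, and `requests_per_minute`.

```rust
/// Which check fired, as it would appear in a structured denial event.
#[derive(Debug, PartialEq)]
pub enum Denial {
    Path,
    UserAgent,
    Velocity,
}

pub struct Detector {
    pub bad_paths: Vec<&'static str>,
    pub bad_extensions: Vec<&'static str>,
    pub scanner_agents: Vec<&'static str>,
    pub min_agent_len: usize,
    pub burst_ceiling_rpm: f64,
}

/// Normalise a request count over a window to requests per minute.
pub fn requests_per_minute(requests: u32, window_secs: f64) -> f64 {
    requests as f64 * 60.0 / window_secs
}

impl Detector {
    /// Example tables; every value here is a placeholder, not a shipped default.
    pub fn example() -> Self {
        Detector {
            bad_paths: vec!["/.env", "/.git", "/phpmyadmin"],
            bad_extensions: vec![".bak", ".sql"],
            scanner_agents: vec!["masscan", "nmap", "nikto", "sqlmap"],
            min_agent_len: 10,
            burst_ceiling_rpm: 120.0,
        }
    }

    /// Path, then user-agent, then velocity. Returns the first check that fires.
    pub fn check(
        &self,
        path: &str,
        user_agent: &str,
        requests: u32,
        window_secs: f64,
    ) -> Option<Denial> {
        let path = path.to_ascii_lowercase();
        if self.bad_paths.iter().any(|bad| path.contains(*bad))
            || self.bad_extensions.iter().any(|ext| path.ends_with(*ext))
        {
            return Some(Denial::Path);
        }
        let agent = user_agent.to_ascii_lowercase();
        if agent.len() < self.min_agent_len
            || self.scanner_agents.iter().any(|name| agent.contains(*name))
        {
            return Some(Denial::UserAgent);
        }
        if window_secs > 0.0
            && requests_per_minute(requests, window_secs) > self.burst_ceiling_rpm
        {
            return Some(Denial::Velocity);
        }
        None
    }
}
```

Returning the check that fired, rather than a bare boolean, is what lets a denial event record the reason alongside the path, user-agent, and velocity.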

Transport-Agnostic Bearer

A browser sends a session in a cookie. A CLI sends a bearer token in the Authorization header. A trusted MCP proxy in front of the server sends a verified identity in a custom header. A single hardcoded extraction strategy forces every caller to re-architect around it. A transport-agnostic bearer chain lets every caller use its native idiom without the server caring which transport carried the token.

The token extractor holds an ordered list of extraction methods and walks them on every request. The first method that returns a token wins. If every method fails, the request is rejected with a specific error naming the methods tried. Preset chains cover the common cases. Standard walks the Authorization header, then MCP proxy, then cookie. Browser-only drops MCP proxy. Api-only keeps the Authorization header alone, for strict machine-to-machine traffic.

The MCP middleware adds one more layer on top of the extractor. Proxy-verified identity is tried first. A corporate load balancer doing SSO termination can sign a verified identity into a custom header, and the middleware accepts it without needing to speak JWT at the proxy. If proxy auth is absent, the middleware falls back to direct bearer validation, running audience, scope, and permission checks in turn. The result is either a fully authenticated request context or an anonymous one, with no ambiguous middle state.

  • Ordered Method Chain — Authorization header, MCP proxy header, cookie, walked in explicit order, first match wins. Configuration happens once, not in every handler, so the three transports cannot drift apart.
  • Proxy-Verified Auth First — A signed proxy header is validated ahead of direct bearer tokens. An SSO-terminating edge proxy does not need to embed JWT logic, so TLS-terminating proxies stay simple.
  • Named Failure Modes — Token extraction fails with a specific error naming the method that missed, from missing header to malformed cookie. Callers see a specific failure, not a generic 401, so auth debugging is never a black box.
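An ordered extraction chain with the three presets could be sketched like this. The proxy header name `x-mcp-proxy-auth`, the cookie name `session`, and the assumption that header names arrive lowercased are all hypothetical, as are the type names.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Method {
    AuthorizationHeader,
    McpProxyHeader,
    Cookie,
}

// Hypothetical names, chosen only for this sketch.
const PROXY_HEADER: &str = "x-mcp-proxy-auth";
const COOKIE_NAME: &str = "session";

/// Error that names every method the chain tried.
#[derive(Debug, PartialEq)]
pub struct ExtractError {
    pub tried: Vec<Method>,
}

pub struct Extractor {
    pub chain: Vec<Method>,
}

impl Extractor {
    pub fn standard() -> Self {
        Self { chain: vec![Method::AuthorizationHeader, Method::McpProxyHeader, Method::Cookie] }
    }
    pub fn browser_only() -> Self {
        Self { chain: vec![Method::AuthorizationHeader, Method::Cookie] }
    }
    pub fn api_only() -> Self {
        Self { chain: vec![Method::AuthorizationHeader] }
    }

    /// Walk the chain in order; the first method that yields a token wins.
    pub fn extract(
        &self,
        headers: &HashMap<String, String>,
    ) -> Result<(Method, String), ExtractError> {
        for method in &self.chain {
            if let Some(token) = try_method(*method, headers) {
                return Ok((*method, token));
            }
        }
        Err(ExtractError { tried: self.chain.clone() })
    }
}

/// Assumes header names are already lowercased by the HTTP layer.
fn try_method(method: Method, headers: &HashMap<String, String>) -> Option<String> {
    let token = match method {
        Method::AuthorizationHeader => headers.get("authorization")?.strip_prefix("Bearer ")?,
        Method::McpProxyHeader => headers.get(PROXY_HEADER)?.as_str(),
        Method::Cookie => headers
            .get("cookie")?
            .split(';')
            .map(str::trim)
            .find_map(|pair| pair.strip_prefix(COOKIE_NAME)?.strip_prefix('='))?,
    };
    if token.is_empty() {
        None
    } else {
        Some(token.to_string())
    }
}
```

Because the chain is data rather than control flow, adding or reordering a transport is a change to one `Vec`, not to every handler.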

Permission-Bound Scope

When a single credential is compromised, the question is what the attacker can reach with it. In a flat credential model, the answer is everything. When every credential's scope is bound to a typed identity, the answer is bounded at the tier the identity holds. Every session's reach is defined by the identity that issued it, and no identity can resolve to more than its tier allows.

The permission model tiers are Admin, User, Anonymous, A2a, Mcp, and Service. Every session's JWT carries the tier at issuance. The tier determines which MCP servers the caller can discover, which tools load into the session, and which rate limit applies. Loading tools for a caller runs a permission check against each server before any tool appears in the manifest. Servers the caller cannot access are silently dropped, so a compromised Anonymous token cannot even enumerate admin-only tools, let alone invoke them.

Rate-limit tier follows the same identity. Admin sessions get a higher multiplier on base rates because they operate at scale with consent. Anonymous sessions get a fractional multiplier because a compromised anonymous token should not exhaust capacity. Secret scope, permission tier, and rate tier all key off the same authenticated identity, and the audit trail binds each tool execution to the request context that authorised it. The question "what did this identity touch" resolves to a single indexed query, not a cross-system hunt.

  • Typed Secret Storage — The secret store holds the JWT signing secret, the database URL, and named provider keys alongside a dictionary of user-defined credentials. Startup validation rejects a profile that boots with a weak JWT secret, so a secret is never an unchecked string at boot.
  • Permission Tiers — Tiers named Admin, User, Anonymous, A2a, Mcp, and Service, each an enforcement boundary. The session's JWT carries typed fields handlers read for authorisation decisions without a database lookup. A compromised low-tier session cannot enumerate admin tools by name.
  • Server Scoping at Load Time — Before MCP tools load for a caller, a permission check runs against the caller's tier for each server. Servers the caller cannot access are dropped from the manifest, so tools outside your scope are not discoverable by name.
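Tier-bound tool discovery and rate limiting can be sketched as below. The tier names come from the text; the multiplier values (10x for Admin, 0.5x for Anonymous, 1x otherwise), the `server/tool` naming, and the `McpServer` shape are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Admin,
    User,
    Anonymous,
    A2a,
    Mcp,
    Service,
}

/// Hypothetical server record: which tiers may see it, and the tools it exposes.
pub struct McpServer {
    pub name: &'static str,
    pub allowed: Vec<Tier>,
    pub tools: Vec<&'static str>,
}

/// Build the tool manifest for a caller. Servers the tier cannot access are
/// dropped before any of their tool names are emitted.
pub fn visible_tools(tier: Tier, servers: &[McpServer]) -> Vec<String> {
    servers
        .iter()
        .filter(|server| server.allowed.contains(&tier))
        .flat_map(|server| {
            server.tools.iter().map(move |tool| format!("{}/{}", server.name, tool))
        })
        .collect()
}

/// Assumed multipliers; the text only says Admin is higher and Anonymous fractional.
pub fn rate_multiplier(tier: Tier) -> f64 {
    match tier {
        Tier::Admin => 10.0,
        Tier::Anonymous => 0.5,
        Tier::User | Tier::A2a | Tier::Mcp | Tier::Service => 1.0,
    }
}

/// Effective requests-per-minute limit for a tier, rounded down.
pub fn effective_limit(base_rpm: u32, tier: Tier) -> u32 {
    (base_rpm as f64 * rate_multiplier(tier)).floor() as u32
}
```

Filtering at manifest-build time means an Anonymous caller never receives the names of admin-only tools, which is a stronger guarantee than rejecting the call at invocation.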

Founder-led. Self-service first.

No sales team. No demo theatre. The template is free to evaluate — if it solves your problem, we talk.

Who we are

One founder, one binary, full IP ownership. Every line of Rust, every governance rule, every MCP integration — written in-house. Two years of building AI governance infrastructure from first principles. No venture capital dictating roadmap. No advisory board approving features.
