ANY AI AGENT. ONE RBAC MIDDLEWARE ACROSS EVERY PROVIDER.
One AiProvider trait fronts your own inference cluster, self-hosted open-weight models, Anthropic, OpenAI, and Gemini. RBAC, audit_events, and microdollar cost stay identical whichever upstream answers.
One AiProvider Trait
Swapping Claude for Gemini, fronting an internal Llama deployment, or adding OpenAI for one workflow usually breaks the governance layer first. SDKs differ. Token counters differ. Audit schemas differ. A month of security work gets discarded for what should have been a config change. systemprompt.io treats the upstream as the variable and governance as the constant.
Every upstream in the codebase implements the AiProvider trait. The trait defines the same methods (generate text, generate with tools, stream, generate against a JSON schema, return per-model pricing, declare capabilities, and the rest) whether it fronts your own inference cluster, a self-hosted open-weight model, Anthropic, OpenAI, or Gemini. A call that hit Anthropic yesterday and a call that hits your internal Qwen today each write an audit_events row of the same shape, trace_id included, because the binary only ever speaks to the trait.
Provider selection happens once at startup, driven by YAML. A factory reads the provider config block and hands the binary an object conforming to the trait. No vendor SDK leaks into the rest of the code, so the same RBAC middleware gates every tool call regardless of which upstream answers. JWT, scope, and audience checks apply identically. A migration between upstreams is not a compliance re-review.
- Same trait for self-hosted and SaaS — Your own inference cluster, a self-hosted Llama or Qwen, Anthropic, OpenAI, and Gemini all implement one AiProvider trait. A new upstream is a new implementation of the same shape, not a new governance layer.
- Provider is a YAML block — Enable an upstream by adding a config block with enabled, api_key, endpoint, and default_model. The factory routes by name at startup, so an evaluation is a config change, not a project.
- One audit_events row shape — Every request writes a record with provider, model, cost_microdollars, input and output token counts, latency, trace_id, session_id. Auditors read one table for every upstream. SIEM queries survive provider migrations.
- provider_trait.rs (lines 141-261) AiProvider trait with 19 methods every provider implements.
- provider_factory.rs ProviderFactory::create() instantiates providers from AiProviderConfig.
- ai_request_record.rs AiRequestRecord carries provider, model, cost, tokens, latency, trace id.
- rbac.rs enforce_rbac_from_registry gates every tool call regardless of provider.
- request_logging.rs Structured request logging that feeds the shared audit schema.
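The idea of one trait in front of every upstream can be sketched in a few lines of Rust. This is a minimal illustration, not the trait in provider_trait.rs: it has three methods rather than the full set, and HostedProvider, SelfHostedProvider, Completion, call_and_record, and the reduced AiRequestRecord fields are all hypothetical stand-ins. It shows why calling code that only knows the trait produces the same record shape for any upstream.

```rust
// Illustrative sketch only. Every name here except AiProvider,
// AiRequestRecord, cost_microdollars, and trace_id is an assumption.

/// Per-request record with the same shape for every upstream.
#[derive(Debug, Clone, PartialEq)]
pub struct AiRequestRecord {
    pub provider: String,
    pub model: String,
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub cost_microdollars: i64,
    pub trace_id: String,
}

/// What an upstream hands back after a completion (hypothetical type).
pub struct Completion {
    pub text: String,
    pub input_tokens: u32,
    pub output_tokens: u32,
}

/// Reduced stand-in for the shared provider trait.
pub trait AiProvider {
    fn name(&self) -> &str;
    fn default_model(&self) -> &str;
    fn generate(&self, prompt: &str) -> Completion;
}

/// Stands in for a SaaS upstream.
pub struct HostedProvider;
/// Stands in for an internal inference cluster.
pub struct SelfHostedProvider;

impl AiProvider for HostedProvider {
    fn name(&self) -> &str { "hosted" }
    fn default_model(&self) -> &str { "hosted-model" }
    fn generate(&self, prompt: &str) -> Completion {
        // A real implementation would call the vendor API and copy the
        // token counts from its response; this stub fakes them.
        Completion {
            text: format!("hosted: {}", prompt),
            input_tokens: prompt.split_whitespace().count() as u32,
            output_tokens: 2,
        }
    }
}

impl AiProvider for SelfHostedProvider {
    fn name(&self) -> &str { "self-hosted" }
    fn default_model(&self) -> &str { "local-model" }
    fn generate(&self, prompt: &str) -> Completion {
        Completion {
            text: format!("local: {}", prompt),
            input_tokens: prompt.split_whitespace().count() as u32,
            output_tokens: 2,
        }
    }
}

/// The caller depends only on the trait, so the record never varies in shape.
pub fn call_and_record(provider: &dyn AiProvider, prompt: &str, trace_id: &str) -> AiRequestRecord {
    let completion = provider.generate(prompt);
    AiRequestRecord {
        provider: provider.name().to_string(),
        model: provider.default_model().to_string(),
        input_tokens: completion.input_tokens,
        output_tokens: completion.output_tokens,
        cost_microdollars: 0, // pricing is left out of this sketch
        trace_id: trace_id.to_string(),
    }
}
```

Passing &HostedProvider or &SelfHostedProvider to call_and_record yields records that differ only in the provider and model values, which is the property the audit table relies on.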
Provider Is A YAML Block
A team shopping for a cheaper model, or an infra lead moving inference in-house, should not rewrite the control plane. In a typical stack a new upstream means a new SDK, a new audit adapter, a new rate limiter, a new cost model. Each is an engineering project and a compliance surface. systemprompt.io makes the upstream a compile-time module behind the shared trait.
Adding an upstream takes three steps. Implement the AiProvider trait for the new target (your inference cluster, an open-weight server, a new SaaS). Add a provider block to the profile YAML with enabled, api_key, endpoint, default_model, and per-model pricing. Restart. The factory routes by name. Governance applies automatically because governance lives in the middleware in front of every request, not inside the provider.
The same config shape supports per-environment routing, so dev, staging, and production point at different upstreams without a code change. A staff engineer verifying this reads the trait definition, the factory, and the three in-tree implementations (Anthropic, OpenAI, Gemini) to model a fourth. Building this in-house means one adapter per vendor plus a governance harness. systemprompt.io ships the harness and three worked examples.
- Three steps to a new upstream — Implement the AiProvider trait. Add the config block. Restart. An upstream evaluation stays inside a compile window instead of a multi-week integration.
- Per-environment routing — The provider config takes enabled, api_key, endpoint, default_model, and per-model pricing. Dev points at a test endpoint while production uses your inference cluster. One profile flag replaces parallel governance stacks.
- Middleware owns governance — Every MCP tool call passes the RBAC middleware before reaching a provider. JWT claims, OAuth2 scopes, and audience validation apply identically whether the upstream is your own cluster or a SaaS model. No provider gets a bypass route.
- provider_trait.rs (lines 141-261) AiProvider trait definition every new provider implements.
- ai.rs AiProviderConfig and ModelDefinition with per-model pricing and limits.
- rbac.rs RBAC middleware that runs before any provider is invoked.
- anthropic/ Anthropic provider, a worked example of the shared trait.
- openai/ OpenAI provider, a second worked example.
- gemini/ Gemini provider, a third worked example.
- profile/mod.rs Profile configuration where provider blocks live.
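Routing by name at startup might look roughly like the sketch below. It is not ProviderFactory::create(): the function create_provider, the FactoryError enum, the two provider structs, and the reduced two-method trait are hypothetical, and the config is built in code instead of being parsed from YAML. The ProviderConfig fields mirror the keys named above.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of one provider block from the profile.
#[derive(Debug, Clone)]
pub struct ProviderConfig {
    pub enabled: bool,
    pub api_key: String,
    pub endpoint: String,
    pub default_model: String,
}

/// Reduced stand-in for the shared trait.
pub trait AiProvider {
    fn name(&self) -> &str;
    fn endpoint(&self) -> &str;
}

struct AnthropicProvider { cfg: ProviderConfig }
struct InternalClusterProvider { cfg: ProviderConfig }

impl AiProvider for AnthropicProvider {
    fn name(&self) -> &str { "anthropic" }
    fn endpoint(&self) -> &str { &self.cfg.endpoint }
}

impl AiProvider for InternalClusterProvider {
    fn name(&self) -> &str { "internal" }
    fn endpoint(&self) -> &str { &self.cfg.endpoint }
}

#[derive(Debug, PartialEq)]
pub enum FactoryError {
    UnknownProvider(String),
    Disabled(String),
}

/// Look up the named block, refuse disabled ones, and return a trait object.
/// Nothing outside this function needs to know which concrete type was built.
pub fn create_provider(
    name: &str,
    configs: &BTreeMap<String, ProviderConfig>,
) -> Result<Box<dyn AiProvider>, FactoryError> {
    let cfg = configs
        .get(name)
        .ok_or_else(|| FactoryError::UnknownProvider(name.to_string()))?;
    if !cfg.enabled {
        return Err(FactoryError::Disabled(name.to_string()));
    }
    match name {
        "anthropic" => Ok(Box::new(AnthropicProvider { cfg: cfg.clone() })),
        "internal" => Ok(Box::new(InternalClusterProvider { cfg: cfg.clone() })),
        other => Err(FactoryError::UnknownProvider(other.to_string())),
    }
}
```

In this shape, a fourth upstream is one new struct, one trait implementation, and one match arm. Per-environment routing falls out of loading a different config map for dev, staging, and production.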
Registry Inheritance
Most organisations run Claude Code alongside Cursor, a custom CLI the platform team built, and a half-finished internal agent someone is wiring up this week. Each surface either re-implements governance or inherits it. systemprompt.io is built for the second answer.
Any HTTP client that reaches the MCP endpoints passes the same middleware as Claude Desktop. Cursor, Windsurf, VS Code with Copilot, a bespoke CLI, and a custom agent process all hit the same JWT validation, scope check, and audit row. A staff engineer verifying this reads the agent registry module and the RBAC middleware that gates every handler.
Agents register through the shared registry, which holds configuration, lifecycle state, and port allocation. A custom agent added to the registry inherits RBAC, rate limits, secret scanning, and the audit pipeline without writing any of them. The evidence a CISO cares about (who called what, when, under what permission, against which upstream) lands in the same tables whether the caller is a vendor tool or internal code shipped last week.
- Custom agents ride shared rails — A registered agent passes through the same RBAC, rate limiting, and audit logging as Claude. Internal agents inherit the compliance boundary instead of opening a new audit gap.
- A2A discovery by agent card — Each agent publishes an agent card describing capabilities, skills, and auth requirements. Agents find each other through a registered discovery endpoint, so multi-agent workflows stay inside one audit surface.
- One enforcement point, every client — Claude Desktop via Cowork, Cursor, Windsurf, VS Code, and any custom CLI reach handlers through the same RBAC middleware. Governance is not whatever each tool remembered to implement.
- registry/mod.rs AgentRegistry with get_agent and list_enabled_agents for discovery.
- a2a/mod.rs A2A models, AgentCard, AgentSkill, Task, and TaskState.
- handlers/card.rs handle_agent_card endpoint that serves discovery data.
- agent_orchestration/ Agent lifecycle, health monitoring, and reconciler.
- rbac.rs Governance middleware every HTTP client inherits.
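As a rough picture of what "one enforcement point" means, the sketch below checks expiry, audience, and scope on claims that are assumed to have been signature-verified already. It is not enforce_rbac_from_registry: Claims, Denied, authorize, and the scope strings are hypothetical. The point is that the check takes no argument describing which client sent the request, so Claude Desktop, Cursor, or a custom agent cannot be treated differently.

```rust
/// Hypothetical claims extracted from an already-verified JWT.
pub struct Claims {
    pub subject: String,
    pub audience: String,
    pub scopes: Vec<String>,
    /// Expiry as seconds since the Unix epoch.
    pub expires_at: u64,
}

#[derive(Debug, PartialEq)]
pub enum Denied {
    Expired,
    WrongAudience,
    MissingScope(String),
}

/// One decision function for every caller. There is deliberately no
/// "client type" parameter, so no client can be given a bypass.
pub fn authorize(
    claims: &Claims,
    expected_audience: &str,
    required_scope: &str,
    now: u64,
) -> Result<(), Denied> {
    if claims.expires_at <= now {
        return Err(Denied::Expired);
    }
    if claims.audience != expected_audience {
        return Err(Denied::WrongAudience);
    }
    if !claims.scopes.iter().any(|s| s == required_scope) {
        return Err(Denied::MissingScope(required_scope.to_string()));
    }
    Ok(())
}
```

A handler would call authorize before doing any work and write the outcome, allowed or denied, to the audit table either way.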
A2A Through The Same Gate
Multi-agent workflows routinely turn governance into Swiss cheese. Agent A calls Agent B, Agent B calls an MCP tool, and somewhere in the chain a scope check gets skipped because the caller is another agent on the same host. Auditors find out months later. systemprompt.io treats an agent-to-agent call as another HTTP request that passes the same middleware.
Discovery starts with the registry. The registry lists every active agent and builds a discovery payload covering capabilities, skills, security scheme, and transport. Agents register at startup and deregister on shutdown through the orchestrator. The contract is one JSON endpoint per agent. If an agent is reachable, it is in the registry and auditable.
Delegation passes the same gate as any other call. When Agent A sends a task to Agent B, the request hits the RBAC middleware at the receiving server. JWT claims, OAuth2 scopes, and audience validation run identically to a user request. The A2A request envelope wraps message parameters and task identifiers. The task state machine tracks execution across named states (submitted, working, input-required, completed, failed, canceled), so a delegated task writes the same row shape as a direct call. "Did Agent A have the scope to ask Agent B to call that tool" is one query against the audit table.
- Registry is the discovery surface — Every reachable agent and its security scheme publish through the registry. Coordination happens through the registered endpoint, so audit does not fragment across ad-hoc wiring.
- Delegation passes the same gate — An agent-to-agent call hits the RBAC middleware like a user call. Scope, audience, and JWT checks apply the same way. Multi-agent chains cannot become a permission laundering route.
- One task state machine — The A2A task state machine fits inter-agent work into one row shape. One audit query covers the whole delegation chain instead of a format per flow.
- a2a/mod.rs AgentCard, AgentCapabilities, TaskState, and the A2A request envelope.
- a2a_server/ A2A server with handlers, auth, and streaming.
- registry/mod.rs AgentRegistry discovery and list_enabled_agents.
- handlers/card.rs Agent card endpoint that publishes the security scheme.
- auth/ A2A authentication middleware.
- rbac.rs RBAC enforcement for inter-agent requests.
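The named task states can be modelled as a small enum. The six states and their wire names come from the text above; the transition rules in can_transition_to are an assumption made for illustration and may not match the TaskState logic in a2a/mod.rs.

```rust
/// The six task states named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Submitted,
    Working,
    InputRequired,
    Completed,
    Failed,
    Canceled,
}

impl TaskState {
    /// Wire name as written in the text above.
    pub fn as_str(self) -> &'static str {
        match self {
            TaskState::Submitted => "submitted",
            TaskState::Working => "working",
            TaskState::InputRequired => "input-required",
            TaskState::Completed => "completed",
            TaskState::Failed => "failed",
            TaskState::Canceled => "canceled",
        }
    }

    /// Terminal states accept no further transitions.
    pub fn is_terminal(self) -> bool {
        matches!(self, TaskState::Completed | TaskState::Failed | TaskState::Canceled)
    }

    /// Assumed transition rules, for illustration only.
    pub fn can_transition_to(self, next: TaskState) -> bool {
        match self {
            TaskState::Submitted => matches!(
                next,
                TaskState::Working | TaskState::Failed | TaskState::Canceled
            ),
            TaskState::Working => matches!(
                next,
                TaskState::InputRequired
                    | TaskState::Completed
                    | TaskState::Failed
                    | TaskState::Canceled
            ),
            TaskState::InputRequired => matches!(
                next,
                TaskState::Working | TaskState::Failed | TaskState::Canceled
            ),
            TaskState::Completed | TaskState::Failed | TaskState::Canceled => false,
        }
    }
}
```

Because a delegated task and a direct call move through the same enum, a single state column in the audit table can describe both.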
Microdollar Ledger
Cross-upstream spend is where finance loses the plot. An Anthropic dashboard does not talk to an OpenAI one, which does not talk to a Gemini one, which definitely does not talk to an internal inference cluster's billing. Month-end reconciliation becomes a spreadsheet someone rebuilds by hand. systemprompt.io writes every request into one PostgreSQL table with the same shape whatever the upstream.
One row per request. The record carries provider, model, cost_microdollars as a 64-bit integer so rounding cannot accumulate, input and output token counts, latency, user_id, session_id, task_id, and trace_id. Cost comes from the per-model pricing the provider trait returns, applied against token counts the upstream itself reported. Finance reads real numbers rather than estimates rounded to the nearest cent.
Three breakdowns over one query surface. The analytics repository exposes cost by model, by provider, and by agent, plus a time-series view, a summary endpoint, and a period-over-period comparison. Finance sees spend by upstream, by model, and by agent from one table. A CTO answers "what does this team's AI budget look like across every upstream" with one query instead of three exports and a reconciliation step.
- Microdollar precision — cost_microdollars is a 64-bit integer, so totals do not drift. Token counts come from the upstream response, not estimates. Cost reports carry no reconciliation error.
- Model, provider, agent, one table — Three breakdown methods over the same request table. Finance sees upstream spend. The CTO sees agent spend. The model owner sees model spend. One ledger feeds every view.
- Trends in Postgres — Trend, summary, and period-over-period views run against PostgreSQL. Period comparisons are a SQL query, so cost observability stays inside the binary you already operate.
- costs.rs (lines 63-150) CostAnalyticsRepository with breakdowns by model, provider, and agent.
- ai_request_record.rs AiRequestRecord with cost_microdollars as i64 and typed token counts.
- provider_trait.rs (lines 13-26) ModelPricing struct and get_pricing() method on the provider trait.
- request_storage/ Async request persistence into the shared ai_requests table.
- request_logging.rs Structured request and cost logging.
- ai.rs ModelDefinition with per-model pricing configuration.
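Integer cost arithmetic of this kind can be sketched as follows. The pricing unit (microdollars per one million tokens), the ModelPricing field names, the Row struct, and both function names are assumptions for illustration. The real ModelPricing struct and CostAnalyticsRepository may differ, and the real breakdowns run as queries in PostgreSQL, not in application code.

```rust
use std::collections::BTreeMap;

/// Assumed unit: microdollars per one million tokens.
/// Example: $3.00 per million tokens is 3_000_000.
pub struct ModelPricing {
    pub input_per_mtok: i64,
    pub output_per_mtok: i64,
}

/// Cost of one request in whole microdollars.
/// Both products are summed in i128 before a single division, so the only
/// truncation is below one microdollar per request.
pub fn cost_microdollars(pricing: &ModelPricing, input_tokens: u32, output_tokens: u32) -> i64 {
    let total = pricing.input_per_mtok as i128 * input_tokens as i128
        + pricing.output_per_mtok as i128 * output_tokens as i128;
    (total / 1_000_000) as i64
}

/// Hypothetical slice of the per-request row.
pub struct Row {
    pub provider: String,
    pub model: String,
    pub cost_microdollars: i64,
}

/// In-memory equivalent of a "cost by provider" breakdown.
pub fn cost_by_provider(rows: &[Row]) -> BTreeMap<String, i64> {
    let mut totals = BTreeMap::new();
    for row in rows {
        *totals.entry(row.provider.clone()).or_insert(0) += row.cost_microdollars;
    }
    totals
}
```

With input priced at $3 and output at $15 per million tokens, a request with 1,200 input tokens and 350 output tokens costs (3,000,000 × 1,200 + 15,000,000 × 350) / 1,000,000 = 8,850 microdollars, or $0.00885. Summing integers like this across rows gives the same total in any order.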
Founder-led. Self-service first.
No sales team. No demo theatre. The template is free to evaluate — if it solves your problem, we talk.
Who we are
One founder, one binary, full IP ownership. Every line of Rust, every governance rule, every MCP integration — written in-house. Two years of building AI governance infrastructure from first principles. No venture capital dictating roadmap. No advisory board approving features.
How to engage
Evaluate
Clone the template from GitHub. Run it locally with Docker or compile from source. The template includes the full governance pipeline.
Talk
Once you have seen the governance pipeline running, book a meeting to discuss your specific requirements — technical implementation, enterprise licensing, or custom integrations.
Deploy
The binary and extension code run on your infrastructure. Perpetual licence, source-available under BSL-1.1, with support and update agreements tailored to your compliance requirements.
Keep your governance when your upstream changes.
Your own inference cluster, a self-hosted Llama or Qwen, Anthropic, OpenAI, and Gemini share one AiProvider trait. Enabling a new upstream is a YAML block, not a replatforming. The audit_events table, RBAC rules, and cost reports stay identical.