ONE BINARY. ONE DATABASE.

A single Rust binary plus PostgreSQL is the entire stack. No Redis, no Kafka, no Elasticsearch, no sidecars. The same artifact runs in Docker, on bare metal, in the cloud, or behind an air gap. Copy it anywhere Linux runs.

50MB Binary, PostgreSQL, Nothing Else

An ops team that wants to run governed Claude in production usually inherits a stack: a web tier, a session cache, a rate-limit store, a job runner, a search index, a log shipper, an identity service, an orchestrator. Each one is a deploy, an upgrade path, and an on-call surface. systemprompt.io collapses that stack into one process.

The HTTP server (Axum), the job scheduler (tokio-cron-scheduler), the template engine, the JWT middleware, the tiered rate limiter (TieredRateLimiter, in-process via the governor crate), analytics, and cost tracking all link into the same Rust binary. PostgreSQL is the only runtime dependency. Sessions live in the database. Rate-limit state lives in process memory. There is no Redis, no Kafka, no Elasticsearch in the dependency graph.

The application entry point is nine lines. main.rs calls __force_extension_link() to keep the linker from stripping inventory registrations under LTO, then hands off to cli::run(). Extensions, routes, schemas, jobs, health checks, and the CLI come up from that single call.

  • Rate limiting in process — TieredRateLimiter holds keyed rate limiters from the governor crate, one per auth tier. State lives in process memory. No Redis cluster to operate.
  • No sidecars in the deployment — No service mesh, no proxy container, no init container, no log forwarder. The job scheduler runs in process via tokio-cron-scheduler. One process per replica is the unit of deployment.
  • Nine-line entry point — main.rs calls __force_extension_link() to keep inventory registrations alive under LTO, then dispatches to cli::run(). Bootstrap is one function call.
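The entry-point shape described above can be sketched in plain Rust. This is a std-only stand-in, not the real crate internals: the function bodies are placeholders, and `cli::run()` here does none of the actual bootstrap work.

```rust
// Std-only sketch of the nine-line entry point described above. The names
// mirror the text; the bodies are stand-ins, not the real crate internals.
fn __force_extension_link() {
    // In the real binary this keeps inventory registrations from being
    // stripped by the linker under LTO.
}

mod cli {
    // Stand-in for cli::run(), which parses the subcommand and brings up
    // extensions, routes, schemas, jobs, and health checks.
    pub fn run() -> Result<(), String> {
        Ok(())
    }
}

fn main() {
    __force_extension_link();
    cli::run().expect("startup failed");
}
```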

Extensions Compiled In, Not Bolted On

A team adding a custom capability to a typical AI stack writes a separate service, a Dockerfile, a deploy pipeline, and an interface contract, and then carries the contract drift forever. systemprompt.io takes the other path: a custom capability is a Rust crate that links into the same binary as everything else.

An extension is a type that implements the Extension trait (around thirty methods covering schemas, jobs, routes, providers, renderers, and dependencies) and registers itself with the register_extension! macro. The macro expands to inventory::submit!, which writes an ExtensionRegistration factory into a linker section at compile time. At startup, ExtensionRegistry::discover() walks inventory::iter::<ExtensionRegistration>, calls each factory, and inserts the resulting Arc<dyn Extension> into the registry. There is no classpath scan, no dynamic loader, and no host/plugin ABI to version.

The same pattern is what ships in the box. The template links three library extensions into the binary: Web, Marketplace, and Email. The host provides the infrastructure extensions for database, logging, analytics, files, users, AI, MCP, OAuth, content, agents, and the scheduler. A custom extension follows the same recipe: implement the trait, call the macro, recompile, redeploy the same artifact.

  • Compile-time discovery — register_extension! expands to inventory::submit!, which writes a factory into a linker section. ExtensionRegistry::discover() walks inventory::iter at startup and instantiates each one.
  • Typed extension surface — The Extension trait defines around thirty methods covering schemas, jobs, routes, providers, and renderers. Each extension contributes only the surfaces it implements; the rest fall through to default impls.
  • Three library extensions in the template — The systemprompt-web template links the Web, Marketplace, and Email extensions into the same binary as the host. A custom extension follows the same recipe.
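The registration pattern above can be illustrated without the real `inventory` machinery. The sketch below is std-only and the surfaces are made up (the actual `Extension` trait has around thirty methods): it shows only the factory-walk shape, with the factories passed in explicitly where the real `discover()` would walk `inventory::iter`.

```rust
use std::sync::Arc;

// Minimal std-only sketch of compile-time-style registration. The real code
// uses the `inventory` crate; this stand-in shows the factory-walk shape.
trait Extension: Send + Sync {
    fn name(&self) -> &'static str;
    // Default impls mean an extension contributes only what it overrides.
    fn routes(&self) -> Vec<&'static str> {
        Vec::new()
    }
}

struct EmailExtension;
impl Extension for EmailExtension {
    fn name(&self) -> &'static str {
        "email"
    }
}

// Factory function standing in for what register_extension! would submit.
fn make_email() -> Arc<dyn Extension> {
    Arc::new(EmailExtension)
}

struct ExtensionRegistry {
    extensions: Vec<Arc<dyn Extension>>,
}

impl ExtensionRegistry {
    // The real discover() walks inventory::iter::<ExtensionRegistration>;
    // here the factories are passed in to keep the sketch self-contained.
    fn discover(factories: &[fn() -> Arc<dyn Extension>]) -> Self {
        Self {
            extensions: factories.iter().map(|f| f()).collect(),
        }
    }
}

fn main() {
    let registry = ExtensionRegistry::discover(&[make_email]);
    assert_eq!(registry.extensions[0].name(), "email");
    // Surfaces the extension did not implement fall through to defaults.
    assert!(registry.extensions[0].routes().is_empty());
}
```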

Profile-Based Environments Without Rebuilding

Environment drift is the slow killer of self-hosted infrastructure. Local config diverges from staging, staging diverges from production, and a config-only change ships an outage. systemprompt.io makes the environment a first-class file in the repo and binds the binary to it through one variable.

At startup, ProfileBootstrap::init() reads SYSTEMPROMPT_PROFILE, loads profile.yaml, and stores the resulting Profile in a static OnceLock<Profile>. Every subsystem reads from that single value: rate-limit tiers, JWT issuer and audience, log level, database connection, storage paths. Switching environments is changing the variable and restarting the process; the binary does not change.

A profile is a directory checked into version control. The local profile disables rate limits and turns logging up; the production profile sets tiered rate limits, JSON logging, and a JWT configuration with issuer, audiences, and expiration. Per-region or per-tenant profiles follow the same shape: a directory, a YAML file, a tenant id. Environment differences are diffable in git instead of buried in a config server.
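A production profile of the kind described above might look like the following. This is a hypothetical shape: the field names are assembled from the settings the text names (JWT issuer, audiences, expiration, per-tier rate limits, log level), not taken from the actual schema.

```yaml
# Hypothetical profile.yaml for a production profile — field names are
# illustrative, assembled from the settings named in the text.
environment: production
log_level: info
log_format: json
security:
  jwt:
    issuer: https://auth.example.com
    audiences: [api, mcp]
    expiration_secs: 3600
rate_limits:
  admin: { per_minute: 600, burst: 2.0 }
  user: { per_minute: 120, burst: 1.5 }
  anonymous: { per_minute: 20, burst: 1.0 }
```

A local profile would differ only in this file: rate limits off, log level up, the same binary on both sides of the diff.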

  • One variable, one source of truth — ProfileBootstrap::init() reads SYSTEMPROMPT_PROFILE and stores the loaded Profile in a static OnceLock. Every subsystem reads from the same value.
  • Profiles live in git — A profile is a YAML file with security (JWT issuer, audiences, expiration), rate limits (per-tier), and runtime config (log level, environment). Diffs are visible at review time, not at incident time.
  • Per-tenant, per-region — Each profile carries its own tenant id, database connection, and rate-limit tiers. The same binary serves a directory of them.
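The load-once, read-everywhere shape can be sketched with a `OnceLock` in plain Rust. The field names below are illustrative (the real `Profile` carries JWT, rate-limit, logging, and database configuration), and this stand-in only records which profile was requested instead of loading `profile.yaml`.

```rust
use std::sync::OnceLock;

// Std-only sketch of the bootstrap shape: load once, store in a static
// OnceLock, read everywhere. Field names are illustrative.
#[derive(Debug)]
struct Profile {
    name: String,
    log_level: String,
}

static PROFILE: OnceLock<Profile> = OnceLock::new();

fn profile() -> &'static Profile {
    PROFILE.get_or_init(|| {
        // The real ProfileBootstrap::init() loads profile.yaml from the
        // directory named by SYSTEMPROMPT_PROFILE; this stand-in only
        // records which profile was requested.
        let name = std::env::var("SYSTEMPROMPT_PROFILE")
            .unwrap_or_else(|_| "local".to_string());
        Profile { name, log_level: "info".to_string() }
    })
}

fn main() {
    // Every subsystem reads from the same static value.
    assert!(std::ptr::eq(profile(), profile()));
    println!("profile: {} (log level {})", profile().name, profile().log_level);
}
```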

CLI and Server in One Binary

Most stacks ship one binary for the server, another for the CLI, and a third dashboard image to operate it. Three things to package, three things to version, three things to keep in sync. systemprompt.io is one artifact playing all three roles.

The same binary parses subcommands with clap and dispatches: systemprompt services start --foreground brings up the Axum HTTP server, systemprompt admin agents list runs against the database, systemprompt infra logs view queries the log store. The CLI is not a wrapper around the server; it is the server, invoked differently.
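The dispatch described above can be sketched without clap. The subcommand names come from the text; the enum, the function, and the hand-rolled matching are stand-ins for what clap derives in the real binary.

```rust
// Std-only sketch of one-binary subcommand dispatch; the real binary parses
// with clap. Command names are from the text, the types are made up.
#[derive(Debug, PartialEq)]
enum Command {
    ServicesStart { foreground: bool },
    AdminAgentsList,
    InfraLogsView,
}

fn dispatch(args: &[&str]) -> Option<Command> {
    match args {
        ["services", "start", rest @ ..] => Some(Command::ServicesStart {
            foreground: rest.contains(&"--foreground"),
        }),
        ["admin", "agents", "list"] => Some(Command::AdminAgentsList),
        ["infra", "logs", "view"] => Some(Command::InfraLogsView),
        _ => None,
    }
}

fn main() {
    assert_eq!(
        dispatch(&["services", "start", "--foreground"]),
        Some(Command::ServicesStart { foreground: true })
    );
    assert_eq!(dispatch(&["admin", "agents", "list"]), Some(Command::AdminAgentsList));
    assert_eq!(dispatch(&["bogus"]), None);
}
```

Whichever arm matches, execution stays in the same process and the same code paths as the server.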

The production container reflects that. The Dockerfile starts from debian:bookworm-slim, installs libpq5 and libssl3, copies the pre-built binary into /app/bin/, sets a HEALTHCHECK against /api/v1/health, and runs the entrypoint. There is no Rust toolchain in the image and no multi-stage compile. A developer runs the same binary locally that runs in production, with a different profile.

  • CLI and server share a binary — cli::run() parses the subcommand with clap and dispatches into either the API server or a CLI operation. Same binary, same code paths, same profile.
  • Slim production image — FROM debian:bookworm-slim. Install libpq5 and libssl3. COPY the pre-built binary. HEALTHCHECK against /api/v1/health. No Rust toolchain in the image.
  • Readiness signal — The server broadcasts API_READY when it is accepting connections, so readiness probes do not have to guess.
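Reconstructing the Dockerfile from those steps gives roughly the following. This is a hypothetical reconstruction: the binary path and steps are from the text, but the listen port, the use of curl for the health check, the profile variable, and the exact entrypoint arguments are assumptions.

```dockerfile
# Hypothetical reconstruction of the Dockerfile described in the text.
# Assumptions: port 8080, curl for the health check, entrypoint arguments.
FROM debian:bookworm-slim
RUN apt-get update \
 && apt-get install -y --no-install-recommends libpq5 libssl3 curl \
 && rm -rf /var/lib/apt/lists/*
COPY systemprompt /app/bin/systemprompt
ENV SYSTEMPROMPT_PROFILE=production
HEALTHCHECK --interval=30s --timeout=5s \
  CMD curl -fsS http://localhost:8080/api/v1/health || exit 1
ENTRYPOINT ["/app/bin/systemprompt", "services", "start", "--foreground"]
```

Note there is no build stage: the binary is compiled outside the image and copied in, which is why no Rust toolchain ships in production.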

Scale Horizontally, No Distributed Infrastructure

Scaling a typical AI gateway means scaling the cache and the session store with it. Add a replica, add capacity to Redis, watch for hot keys. systemprompt.io takes those moving parts off the table by keeping request handling stateless and per-process state local.

JWT validation runs inside the request, not against a database or an external auth service. JwtService::new constructs a local DecodingKey from the profile secret; validate_token verifies signature and expiry without touching I/O. Any replica can serve any request, so horizontal scaling is N processes behind a load balancer.
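The no-I/O property can be made concrete with a std-only stand-in. The real JwtService verifies the HS256 signature against a DecodingKey built from the profile secret; the sketch below omits the cryptography entirely and keeps only an expiry check, to show that validation is pure computation over the token's claims, with no database or network call in the request path.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

// Std-only stand-in for stateless validation. Signature verification
// (HS256 against the profile secret) is elided; only the expiry check
// remains, to show the no-I/O shape of the request path.
struct Claims {
    sub: String, // subject — illustrative field
    exp: u64,    // expiry, seconds since the Unix epoch
}

fn validate_claims(claims: &Claims, now_secs: u64) -> bool {
    // In the real service the signature is verified before the claims
    // are trusted; both steps are local computation.
    claims.exp > now_secs
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before Unix epoch")
        .as_secs();
    let live = Claims { sub: "admin".into(), exp: now + 3600 };
    let stale = Claims { sub: "expired".into(), exp: now - 1 };
    assert!(validate_claims(&live, now));
    assert!(!validate_claims(&stale, now));
}
```

Because nothing here touches shared state, every replica validates independently and the load balancer needs no session affinity.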

Rate limiting is local too. TieredRateLimiter holds one keyed limiter per auth tier (admin, user, MCP, A2A, service, anonymous), each constructed from the governor crate with a quota and burst multiplier read from the profile. Limits do not need to be coordinated across replicas: each process meters its own share of incoming traffic, sized by the deployment topology.

Health and readiness ride on the same binary. /api/v1/health verifies database connectivity for liveness probes. The readiness layer flips an atomic API_READY flag and broadcasts when the server starts accepting connections. The database layer manages its own connection pool; no external pooler is required.

  • Stateless JWT validation — JwtService verifies signature and expiry against a local DecodingKey from the profile secret. No database lookup per request, no session store, no external auth call.
  • Per-tier rate limiting in process — TieredRateLimiter holds one governor-backed limiter per auth tier (admin, user, MCP, A2A, service, anonymous), sized by the profile.
  • Health and readiness in the binary — /api/v1/health checks the database for liveness. The readiness layer broadcasts API_READY when the server is accepting connections.
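The per-tier, in-process metering can be sketched without the `governor` crate. The stand-in below uses a fixed-window counter instead of governor's GCRA, and the tier set and quotas are trimmed to keep it short; what it preserves is the shape: one quota per tier, state keyed per tier and client, all in local memory.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Std-only sketch of per-tier in-process rate limiting. The real
// TieredRateLimiter uses keyed GCRA limiters from the `governor` crate;
// this stand-in uses a simple fixed-window counter.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Tier { User, Anonymous }

struct Window { start: Instant, count: u32 }

struct TierLimiter {
    quotas: HashMap<Tier, u32>,               // requests per window, per tier
    windows: HashMap<(Tier, String), Window>, // state keyed per tier + client
    window: Duration,
}

impl TierLimiter {
    fn new(window: Duration, quotas: HashMap<Tier, u32>) -> Self {
        Self { quotas, windows: HashMap::new(), window }
    }

    fn check(&mut self, tier: Tier, key: &str) -> bool {
        // A tier with no configured quota is denied outright.
        let quota = *self.quotas.get(&tier).unwrap_or(&0);
        let now = Instant::now();
        let w = self
            .windows
            .entry((tier, key.to_string()))
            .or_insert(Window { start: now, count: 0 });
        if now.duration_since(w.start) >= self.window {
            w.start = now; // window elapsed: reset the counter
            w.count = 0;
        }
        if w.count < quota { w.count += 1; true } else { false }
    }
}

fn main() {
    let mut quotas = HashMap::new();
    quotas.insert(Tier::Anonymous, 2);
    quotas.insert(Tier::User, 1);
    let mut limiter = TierLimiter::new(Duration::from_secs(60), quotas);
    assert!(limiter.check(Tier::Anonymous, "10.0.0.1"));
    assert!(limiter.check(Tier::Anonymous, "10.0.0.1"));
    assert!(!limiter.check(Tier::Anonymous, "10.0.0.1")); // quota exhausted
    assert!(limiter.check(Tier::Anonymous, "10.0.0.2"));  // separate key
    assert!(limiter.check(Tier::User, "10.0.0.1"));       // separate tier
}
```

Everything lives in one process; no cross-replica coordination is needed, which is exactly why the limits are sized per replica by the deployment topology.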

Deploy Anywhere Linux Runs

A regulated team picking AI infrastructure asks one question early: can this run somewhere we trust, including a network with no internet route? systemprompt.io answers by being nothing more than a binary, a database, and a profile.

The binary runs on any Linux x86_64 host with libpq5 and libssl3 available. The container image is the same binary on debian:bookworm-slim. There is no cloud-specific packaging and no environment-specific build path. Copy the artifact, point it at PostgreSQL, set the profile, run it.

Air-gapped deployment is a configuration, not a fork. The binary is its own token issuer: JwtService::generate_admin_token signs tokens locally with HS256 using the profile secret. No Auth0, no Okta, no external identity service is required. Logging writes to the PostgreSQL database. The only outbound network calls are to PostgreSQL and to whichever AI providers a profile explicitly configures.

  • Docker and Kubernetes — debian:bookworm-slim image, in-image HEALTHCHECK against /api/v1/health every 30 seconds. No init containers, no service mesh.
  • Bare metal and VMs — Copy the binary to a Linux host with libpq5 and libssl3. Run ./systemprompt services start --foreground. No container runtime required.
  • Air-gapped networks — Self-issued JWT tokens via local HS256 signing. Logging to PostgreSQL. Outbound network calls only to PostgreSQL and to AI providers a profile explicitly configures.

Founder-led. Self-service first.

No sales team. No demo theatre. The template is free to evaluate — if it solves your problem, we talk.

Who we are

One founder, one binary, full IP ownership. Every line of Rust, every governance rule, every MCP integration — written in-house. Two years of building AI governance infrastructure from first principles. No venture capital dictating roadmap. No advisory board approving features.

How to engage

One binary. One database. Your infrastructure.

Clone the template, link your extensions into the same binary, and deploy the same artifact to every environment.