Architecture & Why Rust
50MB single binary with zero runtime dependencies. Memory-safe multi-tenancy on Tokio. Learn why SystemPrompt is built in Rust and how the agentic loop works.
SystemPrompt compiles to a 50MB single binary that deploys anywhere with zero runtime dependencies. No Python virtual environments. No Node modules. No Docker-in-Docker. One file runs your entire AI infrastructure.
Source: 33 crates organized into five dependency layers
Why Rust for AI Infrastructure
AI infrastructure has unique requirements that Rust handles exceptionally well:
Memory Safety for Multi-Tenancy
Multi-tenant AI systems process requests from hundreds of users simultaneously. Memory bugs in this context are catastrophic:
- Buffer overflows could leak User A's data to User B
- Use-after-free bugs could crash the entire system
- Data races could corrupt shared state
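Safe Rust rules out these classes of bugs. Ownership and borrowing are checked at compile time, and an out-of-bounds access panics instead of reading another user's memory. Code that compiles without `unsafe` blocks cannot contain these vulnerabilities.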
// The borrow checker rejects mutation while a shared reference is still live.
// This code won't compile - Rust catches the error
let mut data = vec![1, 2, 3];
let reference = &data[0];
data.push(4); // Compile error: cannot borrow `data` as mutable because it is also borrowed as immutable
println!("{}", reference);
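The same rules extend across threads. As a minimal sketch, assuming a hypothetical `TenantId` type and `count_requests` helper (neither is a SystemPrompt API), the example below shares a per-tenant counter between threads. The compiler accepts it only because the map is wrapped in `Arc<Mutex<..>>`; handing a plain `&mut HashMap` to several threads would be rejected.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical tenant identifier, for illustration only.
type TenantId = u32;

/// Spawns one thread per request and counts requests per tenant.
fn count_requests(requests: &[TenantId]) -> HashMap<TenantId, u64> {
    let counts: Arc<Mutex<HashMap<TenantId, u64>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = requests
        .iter()
        .map(|&tenant| {
            let counts = Arc::clone(&counts);
            thread::spawn(move || {
                // The lock guard grants exclusive access for this scope.
                let mut map = counts.lock().unwrap();
                *map.entry(tenant).or_insert(0u64) += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // All threads have finished; copy the final counts out of the mutex.
    let result = counts.lock().unwrap().clone();
    result
}

fn main() {
    let counts = count_requests(&[1, 2, 1, 3, 1]);
    println!("tenant 1 made {} requests", counts[&1]);
}
```

The design point is that thread safety is part of the type: removing the `Mutex` turns a potential runtime data race into a build failure.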
Async-First on Tokio
AI workloads involve lots of waiting:
- Waiting for LLM API responses (seconds)
- Waiting for database queries (milliseconds)
- Waiting for file I/O (milliseconds)
Rust's async/await with Tokio handles thousands of concurrent connections efficiently:
// Handle thousands of concurrent AI requests
// `?` propagates failures, so the handler returns a `Result`
async fn handle_request(req: Request) -> Result<Response, Error> {
    let user = authenticate(&req).await?;
    let response = call_llm(&req.prompt).await?;
    log_request(&user, &response).await?;
    Ok(Response::ok(response))
}
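Because Tokio schedules tasks across all CPU cores within a single process, one SystemPrompt instance can handle load that would typically be spread across multiple Node.js processes or Python workers.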
Zero-Cost Abstractions
High-level code with low-level performance. Extension traits, generics, and iterators are resolved at compile time, so in optimized builds they typically compile to machine code comparable to hand-written loops:
// This high-level code...
let active_users: Vec<_> = users
.iter()
.filter(|u| u.is_active())
.map(|u| u.id)
.collect();
// ...typically compiles to the same machine code as a manual loop in release builds
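To make the comparison concrete, here is a small runnable sketch that puts the iterator chain next to an equivalent hand-written loop. The `User` struct and both function names are assumptions made for illustration; they mirror only the fields the snippet above uses.

```rust
// Hypothetical `User` type with just the fields the snippet above relies on.
struct User {
    id: u32,
    active: bool,
}

impl User {
    fn is_active(&self) -> bool {
        self.active
    }
}

/// Iterator-chain version.
fn active_ids_iter(users: &[User]) -> Vec<u32> {
    users
        .iter()
        .filter(|u| u.is_active())
        .map(|u| u.id)
        .collect()
}

/// Hand-written loop with the same behavior.
fn active_ids_loop(users: &[User]) -> Vec<u32> {
    let mut ids = Vec::new();
    for user in users {
        if user.is_active() {
            ids.push(user.id);
        }
    }
    ids
}

fn main() {
    let users = vec![
        User { id: 1, active: true },
        User { id: 2, active: false },
        User { id: 3, active: true },
    ];
    // Both versions produce the same result.
    assert_eq!(active_ids_iter(&users), active_ids_loop(&users));
    println!("{:?}", active_ids_iter(&users));
}
```

To compare the generated code yourself, build with optimizations and emit assembly, for example with `cargo rustc --release -- --emit asm`. The exact output depends on the compiler version and target.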
Compile-Time Guarantees
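Rust's type system catches whole categories of errors before they reach production. Logic bugs are still possible, but in safe Rust the failures below surface at build time rather than at runtime: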
| Error Type | Python/Node | Rust |
|---|---|---|
| Type mismatches | Runtime crash | Compile error |
| Null pointer access | Runtime crash | Compile error |
| Unhandled errors | Silent failure | Compile error or `unused_must_use` warning |
| Data races between threads | Race conditions | Compile error |
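The null-access and unhandled-error rows can be illustrated with `Option` and `Result` from the standard library. This is a sketch only: the rate-limit scenario and the function names `parse_limit`, `find_limit`, and `effective_limit` are invented for the example.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

/// Parsing can fail, and the signature says so: callers must unwrap the
/// `Result` before they can reach the number.
fn parse_limit(raw: &str) -> Result<u32, ParseIntError> {
    raw.trim().parse::<u32>()
}

/// A lookup that may find nothing returns `Option`, not a nullable pointer.
fn find_limit(limits: &HashMap<String, u32>, user: &str) -> Option<u32> {
    limits.get(user).copied()
}

/// Uses the stored limit if present, otherwise parses a default value.
fn effective_limit(
    limits: &HashMap<String, u32>,
    user: &str,
    default_raw: &str,
) -> Result<u32, ParseIntError> {
    match find_limit(limits, user) {
        Some(limit) => Ok(limit),
        None => parse_limit(default_raw),
    }
}

fn main() {
    let mut limits = HashMap::new();
    limits.insert("alice".to_string(), 100);

    println!("{:?}", effective_limit(&limits, "alice", "10"));
    println!("{:?}", effective_limit(&limits, "bob", "10"));
    println!("{:?}", effective_limit(&limits, "bob", "ten").is_err());
}
```

Leaving out the `None` arm of the `match` is a compile error, which is how the "null pointer access" case gets caught before the code ships.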
The 50MB Binary
Everything you need in one file:
# That's it. One file. Run anywhere.
./systemprompt infra services start --all
What's included:
- HTTP/HTTPS server (Axum)
- OAuth2/OIDC authorization server
- WebAuthn authentication
- MCP server hosting
- Agent runtime (A2A protocol)
- Job scheduler
- Database migrations
- Static file server
- All extensions
What's NOT required:
- Runtime interpreters (Python, Node)
- External web servers (nginx, Apache)
- Separate auth services (Keycloak, Auth0)
- Message queues (Redis, RabbitMQ)
Five-Layer Architecture
Dependencies flow downward only. Each layer can only import from layers below it:
┌───────────────────────────────────────────────────────────────┐
│ ENTRY: api, cli                                               │
│ HTTP endpoints, CLI commands, request handling                │
├───────────────────────────────────────────────────────────────┤
│ APP: runtime, scheduler, generator, sync                      │
│ Application orchestration, job scheduling, content gen        │
├───────────────────────────────────────────────────────────────┤
│ DOMAIN: users, oauth, ai, agent, mcp, files, content          │
│ Business logic, domain models, core functionality             │
├───────────────────────────────────────────────────────────────┤
│ INFRA: database, events, security, config, logging            │
│ Infrastructure concerns, persistence, cross-cutting           │
├───────────────────────────────────────────────────────────────┤
│ SHARED: models, traits, identifiers, extension                │
│ Common types, traits, identifiers used everywhere             │
└───────────────────────────────────────────────────────────────┘
This layering ensures:
- Testability: Domain logic has no infrastructure dependencies
- Maintainability: Changes are isolated to appropriate layers
- Clarity: Easy to understand where code belongs
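The testability point can be sketched in a few lines. In this hypothetical example (the `UserStore` trait, `greeting` function, and `InMemoryStore` type are illustrative names, not SystemPrompt's actual crates), domain logic depends only on a trait, so a test can supply an in-memory implementation in place of a database-backed one.

```rust
use std::collections::HashMap;

// Lower layer: a trait that upper layers depend on (hypothetical name).
trait UserStore {
    fn display_name(&self, user_id: u32) -> Option<String>;
}

// Domain layer: business logic written against the trait only.
fn greeting(store: &dyn UserStore, user_id: u32) -> String {
    match store.display_name(user_id) {
        Some(name) => format!("Hello, {name}!"),
        None => "Hello, guest!".to_string(),
    }
}

// Test double standing in for a database-backed implementation.
struct InMemoryStore {
    names: HashMap<u32, String>,
}

impl UserStore for InMemoryStore {
    fn display_name(&self, user_id: u32) -> Option<String> {
        self.names.get(&user_id).cloned()
    }
}

fn main() {
    let mut names = HashMap::new();
    names.insert(1, "Ada".to_string());
    let store = InMemoryStore { names };

    println!("{}", greeting(&store, 1));
    println!("{}", greeting(&store, 2));
}
```

Because `greeting` never names a concrete store, swapping the in-memory version for a real one requires no change to the domain code.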
The Agentic Loop
SystemPrompt implements a complete agentic loop with memory, retention, and self-learning:
┌─────────────────────────────────────────────────────────────┐
│ AGENTIC LOOP │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ INPUT │───▶│ PROCESS │───▶│ OUTPUT │ │
│ │ Request │ │ + LLM │ │ Response │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ ▲ │ │ │
│ │ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ CONTEXT │◀───│ MEMORY │◀───│ANALYTICS │ │
│ │ Retrieval│ │ Storage │ │ Tracking │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │ │
│ ▼ │
│ ┌──────────┐ │
│ │ LEARN │ │
│ │ Optimize │ │
│ └──────────┘ │
└─────────────────────────────────────────────────────────────┘
Memory
Every interaction is stored:
- Session context and history
- User preferences and patterns
- Agent performance data
- Tool call results
Retention
Context persists across sessions:
- Shared contexts between agents
- User-specific memory
- Project-level knowledge bases
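As a rough sketch of how scoped memory and retention could be modeled, the example below keeps entries per scope and returns the most recent ones on request. Everything here is an assumption for illustration: `Scope`, `ContextStore`, `remember`, and `recall` are hypothetical names, and the store is in-memory only, whereas a real deployment would persist to a database.

```rust
use std::collections::HashMap;

// Hypothetical scopes mirroring the retention levels listed above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Scope {
    User(u32),
    Project(String),
    Shared,
}

#[derive(Default)]
struct ContextStore {
    entries: HashMap<Scope, Vec<String>>,
}

impl ContextStore {
    /// Appends one interaction to the given scope.
    fn remember(&mut self, scope: Scope, entry: &str) {
        self.entries.entry(scope).or_default().push(entry.to_string());
    }

    /// Returns up to `limit` of the most recent entries, oldest first.
    fn recall(&self, scope: &Scope, limit: usize) -> Vec<&str> {
        match self.entries.get(scope) {
            Some(entries) => {
                let start = entries.len().saturating_sub(limit);
                entries[start..].iter().map(String::as_str).collect()
            }
            None => Vec::new(),
        }
    }
}

fn main() {
    let mut store = ContextStore::default();

    // First session stores user-level and project-level context.
    store.remember(Scope::User(7), "prefers concise answers");
    store.remember(Scope::Project("docs".to_string()), "uses British spelling");
    store.remember(Scope::Shared, "maintenance window on Friday");

    // A later session adds to the same user scope and sees earlier entries.
    store.remember(Scope::User(7), "asked about deployment");
    println!("{:?}", store.recall(&Scope::User(7), 2));
}
```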
Self-Learning
Continuous optimization through:
- Usage analytics and patterns
- Cost tracking and optimization
- Performance monitoring
- Feedback-driven improvements
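Cost tracking is the easiest of these to sketch. The example below is a hypothetical aggregator, not SystemPrompt's implementation: `Usage`, `CostTracker`, and the model names are invented. It accumulates per-model totals and reports the model with the lowest average cost per request.

```rust
use std::collections::HashMap;

/// Running totals for one model (hypothetical shape).
#[derive(Default, Debug, Clone, Copy, PartialEq)]
struct Usage {
    requests: u64,
    // Cost in millionths of a dollar, stored as an integer to avoid float drift.
    cost_micros: u64,
}

#[derive(Default)]
struct CostTracker {
    by_model: HashMap<String, Usage>,
}

impl CostTracker {
    /// Records one completed request and its cost.
    fn record(&mut self, model: &str, cost_micros: u64) {
        let usage = self.by_model.entry(model.to_string()).or_default();
        usage.requests += 1;
        usage.cost_micros += cost_micros;
    }

    /// Model with the lowest average cost per request, if any were recorded.
    /// Every stored entry has at least one request, so the division is safe.
    fn cheapest(&self) -> Option<&str> {
        self.by_model
            .iter()
            .min_by_key(|(_, usage)| usage.cost_micros / usage.requests)
            .map(|(name, _)| name.as_str())
    }
}

fn main() {
    let mut tracker = CostTracker::default();
    tracker.record("small-model", 200);
    tracker.record("small-model", 400);
    tracker.record("large-model", 5_000);
    println!("{:?}", tracker.cheapest());
}
```

A feedback loop could use a signal like this to route routine requests to cheaper models, subject to whatever quality checks the deployment requires.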
Extension System
SystemPrompt is a library, not a platform. You compile it into YOUR binary:
// Assumes the prelude re-exports axum's `Router` and `get`.
use systemprompt::prelude::*;

struct MyExtension;

impl Extension for MyExtension {
    fn id(&self) -> &'static str { "my-extension" }
    fn name(&self) -> &'static str { "My Extension" }
}

impl ApiExtension for MyExtension {
    fn router(&self, ctx: &ExtensionContext) -> Option<Router> {
        Some(Router::new()
            .route("/my-api", get(my_handler))
            .with_state(ctx.clone()))
    }
}

// Placeholder handler so the example is complete; replace with your own.
async fn my_handler() -> &'static str {
    "ok"
}

register_extension!(MyExtension);
Extensions are registered at compile time via the `inventory` crate and gathered into a registry when the binary starts. No runtime reflection. No configuration files for extension discovery. If you import it, it's included.
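To show what such a registry amounts to once the extensions have been gathered, here is a hand-rolled stand-in that uses a plain `Vec` instead of `inventory`. It is a simplification under stated assumptions: the `Extension` trait below copies only the two methods shown above, and `Registry`, `register`, and `find` are hypothetical names rather than SystemPrompt APIs.

```rust
// Mirrors the two methods shown on `Extension` above; anything else is assumed.
trait Extension {
    fn id(&self) -> &'static str;
    fn name(&self) -> &'static str;
}

struct MyExtension;

impl Extension for MyExtension {
    fn id(&self) -> &'static str { "my-extension" }
    fn name(&self) -> &'static str { "My Extension" }
}

/// Manual stand-in for the list that automatic registration would build.
struct Registry {
    extensions: Vec<Box<dyn Extension>>,
}

impl Registry {
    fn new() -> Self {
        Registry { extensions: Vec::new() }
    }

    /// Explicit registration; with `inventory` this step happens automatically.
    fn register(&mut self, ext: Box<dyn Extension>) {
        self.extensions.push(ext);
    }

    /// Looks up an extension by its id.
    fn find(&self, id: &str) -> Option<&dyn Extension> {
        self.extensions
            .iter()
            .find(|ext| ext.id() == id)
            .map(|ext| ext.as_ref())
    }
}

fn main() {
    let mut registry = Registry::new();
    registry.register(Box::new(MyExtension));

    if let Some(ext) = registry.find("my-extension") {
        println!("found: {}", ext.name());
    }
}
```

The difference in the real system is that no `register` call appears in user code: the registration macro submits the extension, and the host collects every submitted item.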
Deployment Options
The same binary runs everywhere:
| Environment | Command |
|---|---|
| Local dev | ./systemprompt infra services start --all |
| Docker | docker run systemprompt/systemprompt |
| Kubernetes | Standard deployment manifest |
| Bare metal | Copy binary, run |
| Cloud (managed) | One-click deploy |
No special runtime requirements. PostgreSQL is the only external dependency.
Performance Characteristics
| Metric | Value |
|---|---|
| Binary size | ~50MB |
| Startup time | <1 second |
| Memory baseline | ~30MB |
| Concurrent connections | 10,000+ |
| Request latency (p99) | <5ms (excluding LLM) |
Related
- Core Concepts Overview — SystemPrompt fundamentals
- Extensions — Building on the core
- Technical Specs — Detailed specifications