AI Services
Configure and manage AI providers in systemprompt.io. Supports Anthropic, OpenAI, and Google Gemini with streaming, web search grounding, image generation, smart routing, and MCP tool integration.
TL;DR: The AI service is the unified interface between your agents and language models. Configure one or more providers (Anthropic, OpenAI, Google Gemini), set a default, and systemprompt.io handles provider selection, fallback, MCP tool integration, web search grounding, and image generation. All provider details are abstracted behind AiService so agents and MCP tools work identically regardless of which model is active.
Why It Matters
Agents need access to language models for reasoning, generation, and tool use. Different providers have different strengths: Anthropic Claude excels at complex reasoning and safety, OpenAI GPT models offer broad compatibility, and Google Gemini provides native Google Search grounding and multimodal image generation.
Rather than coupling agent code to a single provider, the AI service gives you a single configuration surface. You enable providers, set a default, and every agent and MCP server in your deployment uses the same AiService instance. Switching providers is a one-line YAML change with no code modifications.
Architecture Overview
The AI service sits between agents (or MCP tool handlers) and the external provider APIs. The flow is:
```
Agent / MCP Tool
       |
       v
   AiService   <-- unified interface
       |
       +---> Anthropic API (Claude models)
       +---> OpenAI API    (GPT models)
       +---> Gemini API    (Gemini models)
       |
       v
 History / Logging
```
When an agent or MCP tool calls `ai_service.generate()`, the service:
- Selects the target provider based on the default (or per-request override).
- Formats the request for that provider's API.
- Sends the request and streams or collects the response.
- Logs the interaction to conversation history.
- Returns a unified `AiResponse` with the generated content.
MCP servers initialize `AiService` from the shared configuration at startup:

```rust
let ai_service = Arc::new(
    AiService::new(db_pool.clone(), &services_config.ai, tool_provider, None)?,
);
```
This means every MCP server and agent in your deployment shares the same provider configuration.
Configuration
All AI settings live in `services/ai/config.yaml`, which is included by the top-level `services/config/config.yaml` aggregation file.
```yaml
# services/ai/config.yaml
ai:
  default_provider: gemini
  default_max_output_tokens: 8192

  sampling:
    enable_smart_routing: false
    fallback_enabled: true

  providers:
    anthropic:
      enabled: true
      api_key: ${ANTHROPIC_API_KEY}
      default_model: claude-sonnet-4-20250514
      google_search_enabled: true
    openai:
      enabled: true
      api_key: ${OPENAI_API_KEY}
      default_model: gpt-4-turbo
      google_search_enabled: true
    gemini:
      enabled: true
      api_key: ${GEMINI_API_KEY}
      endpoint: https://generativelanguage.googleapis.com/v1beta
      default_model: gemini-2.5-flash
      google_search_enabled: true

  mcp:
    auto_discover: true
    connect_timeout_ms: 5000
    execution_timeout_ms: 30000
    retry_attempts: 3

  history:
    retention_days: 30
    log_tool_executions: true
```
Key fields:
- `default_provider` -- The provider used when no override is specified. Valid values: `anthropic`, `openai`, `gemini`.
- `default_max_output_tokens` -- Global cap on response length. Individual requests can override this (e.g., `4096` for cross-provider compatibility).
- `sampling.fallback_enabled` -- When `true`, if the primary provider fails, the service tries other enabled providers.
- `google_search_enabled` -- Enables web search grounding for providers that support it.
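A config loader typically fails fast when `default_provider` names a provider that is missing or disabled. The check can be sketched as follows; `validate_default` is a hypothetical helper for illustration, not part of the systemprompt.io codebase:

```rust
use std::collections::HashMap;

/// Illustrative startup check: the default provider must exist in the
/// providers map and be enabled, otherwise configuration loading should fail.
fn validate_default(default: &str, providers: &HashMap<&str, bool>) -> Result<(), String> {
    match providers.get(default).copied() {
        Some(true) => Ok(()),
        Some(false) => Err(format!("default provider '{default}' is disabled")),
        None => Err(format!("unknown default provider '{default}'")),
    }
}

fn main() {
    // Mirrors the sample config: gemini is the default and is enabled.
    let providers = HashMap::from([("anthropic", true), ("openai", true), ("gemini", true)]);
    assert!(validate_default("gemini", &providers).is_ok());
    assert!(validate_default("mistral", &providers).is_err());
}
```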
Supported Providers
Anthropic
Claude models from Anthropic, known for strong reasoning and safety.
```yaml
anthropic:
  enabled: true
  api_key: ${ANTHROPIC_API_KEY}
  default_model: claude-sonnet-4-20250514
```
Available models:
| Model | Characteristics |
|---|---|
| `claude-opus-4-20250514` | Most capable, best for complex multi-step tasks |
| `claude-sonnet-4-20250514` | Balanced performance and cost |
| `claude-haiku-3-20240307` | Fast and economical for simple tasks |
OpenAI
GPT models from OpenAI with broad ecosystem compatibility.
```yaml
openai:
  enabled: true
  api_key: ${OPENAI_API_KEY}
  default_model: gpt-4-turbo
```
Available models:
| Model | Characteristics |
|---|---|
| `gpt-4-turbo` | Latest GPT-4 with large context window |
| `gpt-4o` | Optimized for speed |
| `gpt-3.5-turbo` | Fast and economical |
Google Gemini
Google's Gemini models with native Google Search grounding and image generation.
```yaml
gemini:
  enabled: true
  api_key: ${GEMINI_API_KEY}
  endpoint: https://generativelanguage.googleapis.com/v1beta
  default_model: gemini-2.5-flash
```
Available models:
| Model | Characteristics |
|---|---|
| `gemini-2.5-flash` | Fast multimodal processing |
| `gemini-2.5-pro` | Advanced reasoning |
| `gemini-2.5-flash-image` | Image generation |
| `gemini-3-pro-image-preview` | Higher quality image generation |
Features
Streaming
Agents declare streaming support in their card configuration:

```yaml
capabilities:
  streaming: true
```
When streaming is enabled, responses are delivered incrementally via Server-Sent Events (SSE) rather than waiting for the full response. The admin dashboard uses SSE for real-time updates, and agents can stream partial results to clients as they are generated.
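On the wire, an SSE stream is a sequence of `data:` lines separated by blank lines. As an illustration of the framing only (the generic SSE format, not a documented systemprompt.io event payload), a minimal data-line parser looks like this:

```rust
/// Extract the payload from one Server-Sent Events line.
/// Returns None for comments (": ..."), blank separators, and other fields.
fn sse_data(line: &str) -> Option<&str> {
    // Per the SSE spec, a data line is "data:" optionally followed by one space.
    let rest = line.strip_prefix("data:")?;
    Some(rest.strip_prefix(' ').unwrap_or(rest))
}

fn main() {
    let stream = "data: {\"delta\": \"Hel\"}\n\ndata: {\"delta\": \"lo\"}\n: keep-alive\n";
    // Collect only the data payloads, skipping comments and blank lines.
    let chunks: Vec<&str> = stream.lines().filter_map(sse_data).collect();
    assert_eq!(chunks, vec!["{\"delta\": \"Hel\"}", "{\"delta\": \"lo\"}"]);
}
```

A real client would additionally handle `event:` and `id:` fields and reassemble multi-line data, but the incremental-delivery idea is the same.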
Web Search Grounding
Some providers support web search for grounded responses with real-time information. Enable it per-provider with `google_search_enabled: true`.

| Provider | Implementation | Notes |
|---|---|---|
| Gemini | Google Search grounding | Native integration, returns sources with citations |
| OpenAI | `/responses` API with `web_search` tool | Uses web search tool type |
| Anthropic | `web_search_20250305` tool | API support available |
Web search is used through `AiService::generate_with_google_search()`, which accepts a `GoogleSearchParams` struct:

```rust
let search_params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None, // uses provider default
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(search_params).await?;
// response.content -- the generated text
// response.sources -- list of SourceCitation with title, uri, relevance
// response.web_search_queries -- queries the provider executed
```
Tools like `research_blog` in the content-manager MCP server use this to produce research artifacts backed by real search results.
Image Generation
The AI service supports image generation through `ImageService`, which shares the same provider credentials.
| Provider | Models | Resolutions | Aspect Ratios | Batch |
|---|---|---|---|---|
| Gemini | `gemini-2.5-flash-image`, `gemini-3-pro-image-preview` | 1K, 2K, 4K | Square, 16:9, 9:16, 4:3, 3:4, UltraWide | Yes |
| OpenAI | `dall-e-3`, `dall-e-2` | 1K | Square, 16:9, 9:16 | No |
The image service automatically selects the best resolution the provider supports (4K > 2K > 1K). Provider selection follows the same logic as text generation: use the default provider if it supports images, otherwise fall back to the first available image-capable provider.
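The "best resolution" rule can be sketched as a simple preference search; `best_resolution` below is a hypothetical helper illustrating the 4K > 2K > 1K ordering, not the actual `ImageService` API:

```rust
/// Pick the highest resolution the provider supports, preferring 4K > 2K > 1K,
/// mirroring the selection rule described above (illustrative only).
fn best_resolution(supported: &[&str]) -> Option<&'static str> {
    ["4K", "2K", "1K"]
        .into_iter()
        .find(|r| supported.contains(r))
}

fn main() {
    // Gemini offers all three tiers, so 4K wins.
    assert_eq!(best_resolution(&["1K", "2K", "4K"]), Some("4K"));
    // OpenAI only offers 1K.
    assert_eq!(best_resolution(&["1K"]), Some("1K"));
    // No image-capable provider at all.
    assert_eq!(best_resolution(&[]), None);
}
```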
```bash
# Generate an image through the content-manager MCP server
systemprompt plugins mcp call content-manager generate_featured_image -a '{
  "skill_id": "blog_image_generation",
  "topic": "AI Development",
  "title": "Building with AI",
  "summary": "A guide to AI development"
}' --timeout 120
```
Generated images are stored in `/files/images/generated/` and served through the configured URL prefix.
Smart Routing and Fallback
When `enable_smart_routing` is true, the service selects the best provider for each request based on task characteristics and provider availability. When `fallback_enabled` is true, if the primary provider returns an error, the service retries with other enabled providers.

```yaml
sampling:
  enable_smart_routing: true
  fallback_enabled: true
```
The retry logic uses exponential backoff. For example, the `research_blog` tool retries up to 3 times with delays of 1s, 2s, and 4s between attempts.
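The fallback ordering and the backoff schedule described above can be sketched like this; `Provider`, `try_order`, and `backoff_delay` are illustrative names, not the actual service types:

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Anthropic,
    OpenAi,
    Gemini,
}

/// The default provider is tried first; with fallback enabled, the remaining
/// enabled providers follow in configuration order.
fn try_order(default: Provider, enabled: &[Provider], fallback: bool) -> Vec<Provider> {
    let mut order = vec![default];
    if fallback {
        order.extend(enabled.iter().copied().filter(|p| *p != default));
    }
    order
}

/// Exponential backoff schedule: 1s, 2s, 4s for attempts 0, 1, 2.
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt)
}

fn main() {
    let enabled = [Provider::Anthropic, Provider::OpenAi, Provider::Gemini];
    // With fallback on, gemini (the default) is tried first, then the rest.
    assert_eq!(
        try_order(Provider::Gemini, &enabled, true),
        vec![Provider::Gemini, Provider::Anthropic, Provider::OpenAi]
    );
    // With fallback off, only the default is attempted.
    assert_eq!(try_order(Provider::Gemini, &enabled, false), vec![Provider::Gemini]);

    let delays: Vec<u64> = (0..3).map(|a| backoff_delay(a).as_secs()).collect();
    assert_eq!(delays, vec![1, 2, 4]);
}
```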
MCP Tool Integration
The AI service auto-discovers MCP servers and registers their tools so language models can call them during conversations.
```yaml
mcp:
  auto_discover: true
  connect_timeout_ms: 5000
  execution_timeout_ms: 30000
  retry_attempts: 3
```
When `auto_discover` is enabled, the AI service finds all configured MCP servers at startup and makes their tools available. During a conversation, the language model can invoke any registered tool. The service handles serialization, timeout enforcement, and retry logic.
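Timeout enforcement around a tool call can be sketched with a worker thread and a bounded wait. The real service is async, so `call_with_timeout` below only illustrates the `execution_timeout_ms` behavior under stated assumptions:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run a (simulated) tool call on a worker thread and give up if no result
/// arrives within the timeout window. Illustrative sketch only.
fn call_with_timeout<F>(tool: F, timeout: Duration) -> Result<String, &'static str>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the caller already timed out, the send fails harmlessly.
        let _ = tx.send(tool());
    });
    rx.recv_timeout(timeout).map_err(|_| "tool execution timed out")
}

fn main() {
    // A fast tool completes within the window.
    let ok = call_with_timeout(|| "result".to_string(), Duration::from_millis(200));
    assert_eq!(ok.as_deref(), Ok("result"));

    // A slow tool exceeds the window and is reported as a timeout.
    let slow = call_with_timeout(
        || {
            thread::sleep(Duration::from_millis(100));
            "late".to_string()
        },
        Duration::from_millis(10),
    );
    assert!(slow.is_err());
}
```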
Conversation History
The AI service logs conversations and tool executions for debugging and analytics.
```yaml
history:
  retention_days: 30
  log_tool_executions: true
```
History data feeds the admin dashboard (timeline, popular skills, tool success rates) and is available through the analytics CLI commands.
Using AiService in Code
MCP tool handlers receive `AiService` as a dependency. There are two primary methods.

Standard generation with `ai_service.generate()`:

```rust
let request = AiRequest::builder(
    messages,
    ai_service.default_provider(),
    ai_service.default_model(),
    4096, // max output tokens
    ctx,
)
.build();

let response = ai_service.generate(&request).await?;
// response.content contains the generated text
```
Web search grounding with `ai_service.generate_with_google_search()`:

```rust
let params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None,
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(params).await?;
```
Both methods respect the configured provider, model, and token limits from `services/ai/config.yaml`.
Environment Variables
Store API keys securely using environment variables. Never commit keys to configuration files.
```bash
# Set via CLI
systemprompt cloud secrets set ANTHROPIC_API_KEY "sk-ant-..."
systemprompt cloud secrets set OPENAI_API_KEY "sk-..."
systemprompt cloud secrets set GEMINI_API_KEY "AIza..."

# List configured secrets
systemprompt cloud secrets list
```
Use the `${VAR_NAME}` syntax in YAML to reference environment variables at runtime.
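The substitution step can be approximated by a small expander. `expand` below is a hypothetical helper illustrating the `${VAR_NAME}` pattern, not the actual config loader; it reads from an explicit map rather than the process environment to keep the example deterministic:

```rust
use std::collections::HashMap;

/// Expand a `${VAR_NAME}` placeholder against a map of variables,
/// approximating the YAML substitution described above (illustrative only).
/// Missing variables expand to an empty string here; a real loader
/// would more likely report an error.
fn expand(value: &str, vars: &HashMap<&str, &str>) -> String {
    if let Some(name) = value.strip_prefix("${").and_then(|v| v.strip_suffix('}')) {
        vars.get(name).copied().unwrap_or_default().to_string()
    } else {
        value.to_string()
    }
}

fn main() {
    let vars = HashMap::from([("GEMINI_API_KEY", "AIza-example")]);
    // A placeholder is replaced by the variable's value.
    assert_eq!(expand("${GEMINI_API_KEY}", &vars), "AIza-example");
    // Plain values pass through unchanged.
    assert_eq!(expand("plain-value", &vars), "plain-value");
    // An unknown variable expands to empty in this sketch.
    assert_eq!(expand("${MISSING}", &vars), "");
}
```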
Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
| `default_provider` | string | -- | Primary provider: `anthropic`, `openai`, or `gemini` |
| `default_max_output_tokens` | number | `8192` | Global maximum tokens for responses |
| `sampling.enable_smart_routing` | boolean | `false` | Enable intelligent provider selection per request |
| `sampling.fallback_enabled` | boolean | `true` | Try alternative providers on failure |
| `providers.<name>.enabled` | boolean | -- | Whether this provider is active |
| `providers.<name>.api_key` | string | -- | API key (use `${VAR}` syntax) |
| `providers.<name>.default_model` | string | -- | Default model for this provider |
| `providers.<name>.endpoint` | string | -- | Custom API endpoint (Gemini only) |
| `providers.<name>.google_search_enabled` | boolean | `false` | Enable web search grounding |
| `mcp.auto_discover` | boolean | `true` | Auto-discover MCP servers and register tools |
| `mcp.connect_timeout_ms` | number | `5000` | MCP server connection timeout |
| `mcp.execution_timeout_ms` | number | `30000` | MCP tool execution timeout |
| `mcp.retry_attempts` | number | `3` | Retries for failed MCP tool calls |
| `history.retention_days` | number | `30` | Days to retain conversation history |
| `history.log_tool_executions` | boolean | `true` | Log tool calls to history |
CLI Reference
Provider Management
| Command | Description |
|---|---|
| `systemprompt admin config provider list` | View all providers with status |
| `systemprompt admin config provider set <PROVIDER>` | Set the default provider |
| `systemprompt admin config provider enable <PROVIDER>` | Enable a provider |
| `systemprompt admin config provider disable <PROVIDER>` | Disable a provider |
Secrets Management
| Command | Description |
|---|---|
| `systemprompt cloud secrets set ANTHROPIC_API_KEY <key>` | Set Anthropic API key |
| `systemprompt cloud secrets set OPENAI_API_KEY <key>` | Set OpenAI API key |
| `systemprompt cloud secrets set GEMINI_API_KEY <key>` | Set Gemini API key |
| `systemprompt cloud secrets list` | List configured secrets |
Other AI Commands
| Command | Description |
|---|---|
| `systemprompt admin config show` | Show current configuration including AI settings |
| `systemprompt plugins mcp list` | List MCP servers integrated with AI |
| `systemprompt analytics costs` | View AI usage costs |
See `systemprompt admin config provider --help` for detailed options.
Service Relationships
The AI service connects to several other services in systemprompt.io:
- Agents -- Provides LLM capabilities for agent reasoning and generation.
- MCP Servers -- Auto-discovers tools that language models can call during conversations.
- Skills -- Skills load system prompts that shape AI behavior for specific tasks.
- Config -- AI configuration is included through the service aggregation pattern in `services/config/config.yaml`.
- Analytics -- Conversation history and tool execution logs feed the analytics dashboard.
- Scheduler -- Scheduled jobs can trigger AI-powered workflows.
Troubleshooting
**Provider authentication failed** -- Verify the API key is set correctly. Run `systemprompt cloud secrets list` to confirm the variable exists, then check that the provider's `api_key` field references it with `${VAR_NAME}` syntax.

**Tool execution timeout** -- The MCP tool took longer than `execution_timeout_ms`. Increase the timeout in the `mcp` section or optimize the tool handler. Check tool-specific logs with `systemprompt plugins mcp logs <server-name>`.

**No providers available** -- At least one provider must be enabled with valid credentials. Run `systemprompt admin config provider list` to see which providers are configured and their status.

**Empty response from web search** -- Some queries trigger safety or recitation filters. The service retries automatically (up to `retry_attempts` times with exponential backoff). If the issue persists, try rephrasing the query or switching providers.

**Image generation failed** -- Confirm the provider supports image generation (Gemini or OpenAI). Check that `ImageProviderFactory` initialized successfully in the MCP server logs. Run `systemprompt infra logs view --level error --since 1h` to find specific errors.