AI Services

Configure and manage AI providers in systemprompt.io. Supports Anthropic, OpenAI, and Google Gemini with streaming, web search grounding, image generation, smart routing, and MCP tool integration.

TL;DR: The AI service is the unified interface between your agents and language models. Configure one or more providers (Anthropic, OpenAI, Google Gemini), set a default, and systemprompt.io handles provider selection, fallback, MCP tool integration, web search grounding, and image generation. All provider details are abstracted behind AiService so agents and MCP tools work identically regardless of which model is active.

Why It Matters

Agents need access to language models for reasoning, generation, and tool use. Different providers have different strengths: Anthropic Claude excels at complex reasoning and safety, OpenAI GPT models offer broad compatibility, and Google Gemini provides native Google Search grounding and multimodal image generation.

Rather than coupling agent code to a single provider, the AI service gives you a single configuration surface. You enable providers, set a default, and every agent and MCP server in your deployment uses the same AiService instance. Switching providers is a one-line YAML change with no code modifications.

Architecture Overview

The AI service sits between agents (or MCP tool handlers) and the external provider APIs. The flow is:

Agent / MCP Tool
      |
      v
  AiService        <-- unified interface
      |
      +---> Anthropic API  (Claude models)
      +---> OpenAI API     (GPT models)
      +---> Gemini API     (Gemini models)
      |
      v
  History / Logging

When an agent or MCP tool calls ai_service.generate(), the service:

  1. Selects the target provider based on the default (or per-request override).
  2. Formats the request for that provider's API.
  3. Sends the request and streams or collects the response.
  4. Logs the interaction to conversation history.
  5. Returns a unified AiResponse with the generated content.

MCP servers initialize AiService from the shared configuration at startup:

let ai_service = Arc::new(
    AiService::new(db_pool.clone(), &services_config.ai, tool_provider, None)?,
);

This means every MCP server and agent in your deployment shares the same provider configuration.

Configuration

All AI settings live in services/ai/config.yaml, which is included by the top-level services/config/config.yaml aggregation file.

# services/ai/config.yaml
ai:
  default_provider: gemini
  default_max_output_tokens: 8192

  sampling:
    enable_smart_routing: false
    fallback_enabled: true

  providers:
    anthropic:
      enabled: true
      api_key: ${ANTHROPIC_API_KEY}
      default_model: claude-sonnet-4-20250514
      google_search_enabled: true

    openai:
      enabled: true
      api_key: ${OPENAI_API_KEY}
      default_model: gpt-4-turbo
      google_search_enabled: true

    gemini:
      enabled: true
      api_key: ${GEMINI_API_KEY}
      endpoint: https://generativelanguage.googleapis.com/v1beta
      default_model: gemini-2.5-flash
      google_search_enabled: true

  mcp:
    auto_discover: true
    connect_timeout_ms: 5000
    execution_timeout_ms: 30000
    retry_attempts: 3

  history:
    retention_days: 30
    log_tool_executions: true

Key fields:

  • default_provider -- The provider used when no override is specified. Valid values: anthropic, openai, gemini.
  • default_max_output_tokens -- Global cap on response length. Individual requests can override this (e.g., 4096 for cross-provider compatibility).
  • sampling.fallback_enabled -- When true, if the primary provider fails, the service tries other enabled providers.
  • google_search_enabled -- Enables web search grounding for providers that support it.
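Because every agent and MCP server reads this shared configuration, switching the whole deployment to another provider is a single-field edit. For example, to move from Gemini to Claude:

```yaml
# services/ai/config.yaml
ai:
  default_provider: anthropic   # was: gemini
```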

Supported Providers

Anthropic

Claude models from Anthropic, known for strong reasoning and safety.

anthropic:
  enabled: true
  api_key: ${ANTHROPIC_API_KEY}
  default_model: claude-sonnet-4-20250514

Available models:

  • claude-opus-4-20250514 -- Most capable, best for complex multi-step tasks
  • claude-sonnet-4-20250514 -- Balanced performance and cost
  • claude-3-haiku-20240307 -- Fast and economical for simple tasks

OpenAI

GPT models from OpenAI with broad ecosystem compatibility.

openai:
  enabled: true
  api_key: ${OPENAI_API_KEY}
  default_model: gpt-4-turbo

Available models:

  • gpt-4-turbo -- Latest GPT-4 with large context window
  • gpt-4o -- Optimized for speed
  • gpt-3.5-turbo -- Fast and economical

Google Gemini

Google's Gemini models with native Google Search grounding and image generation.

gemini:
  enabled: true
  api_key: ${GEMINI_API_KEY}
  endpoint: https://generativelanguage.googleapis.com/v1beta
  default_model: gemini-2.5-flash

Available models:

  • gemini-2.5-flash -- Fast multimodal processing
  • gemini-2.5-pro -- Advanced reasoning
  • gemini-2.5-flash-image -- Image generation
  • gemini-3-pro-image-preview -- Higher-quality image generation

Features

Streaming

Agents declare streaming support in their card configuration:

capabilities:
  streaming: true

When streaming is enabled, responses are delivered incrementally via Server-Sent Events (SSE) rather than waiting for the full response. The admin dashboard uses SSE for real-time updates, and agents can stream partial results to clients as they are generated.
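On the wire, SSE frames are plain text lines prefixed with `data: `. As an illustration of that framing (not the product's actual event schema), a minimal extractor of the payload lines might look like:

```rust
/// Collect the payload of every `data:` line in an SSE stream body.
/// Illustrative only; the real event names and payloads are defined by
/// the systemprompt.io API and are not shown here.
fn sse_data_lines(body: &str) -> Vec<&str> {
    body.lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .collect()
}
```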

Web Search Grounding

Some providers support web search for grounded responses with real-time information. Enable it per-provider with google_search_enabled: true.

  • Gemini -- Native Google Search grounding; returns sources with citations
  • OpenAI -- /responses API with the web_search tool type
  • Anthropic -- web_search_20250305 tool; API support available

Web search is invoked through AiService.generate_with_google_search(), which accepts a GoogleSearchParams struct:

let search_params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None,       // uses provider default
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(search_params).await?;
// response.content   -- the generated text
// response.sources   -- list of SourceCitation with title, uri, relevance
// response.web_search_queries -- queries the provider executed

Tools like research_blog in the content-manager MCP server use this to produce research artifacts backed by real search results.
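The `sources` field can be rendered directly into citations. A minimal sketch, assuming a `SourceCitation` shape with the `title` and `uri` fields listed above (the real struct may carry more, such as `relevance`):

```rust
/// Hypothetical stand-in for the SourceCitation type described above.
struct SourceCitation {
    title: String,
    uri: String,
}

/// Format each citation as "Title <uri>" for display or logging.
fn format_citations(sources: &[SourceCitation]) -> Vec<String> {
    sources
        .iter()
        .map(|s| format!("{} <{}>", s.title, s.uri))
        .collect()
}
```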

Image Generation

The AI service supports image generation through ImageService, which shares the same provider credentials.

  • Gemini -- models gemini-2.5-flash-image and gemini-3-pro-image-preview; resolutions 1K, 2K, 4K; aspect ratios Square, 16:9, 9:16, 4:3, 3:4, UltraWide; batch generation supported
  • OpenAI -- models dall-e-3 and dall-e-2; resolution 1K; aspect ratios Square, 16:9, 9:16; no batch generation

The image service automatically selects the best resolution the provider supports (4K > 2K > 1K). Provider selection follows the same logic as text generation: use the default provider if it supports images, otherwise fall back to the first available image-capable provider.
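The selection rule can be sketched as a simple preference scan. This illustrates the 4K > 2K > 1K ordering described above; it is not the actual implementation:

```rust
/// Pick the highest resolution a provider supports, preferring 4K,
/// then 2K, then 1K. Returns None if the provider supports none of them.
fn best_resolution(supported: &[&str]) -> Option<&'static str> {
    ["4K", "2K", "1K"].into_iter().find(|r| supported.contains(r))
}
```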

# Generate an image through the content-manager MCP server
systemprompt plugins mcp call content-manager generate_featured_image -a '{
  "skill_id": "blog_image_generation",
  "topic": "AI Development",
  "title": "Building with AI",
  "summary": "A guide to AI development"
}' --timeout 120

Generated images are stored in /files/images/generated/ and served through the configured URL prefix.

Smart Routing and Fallback

When enable_smart_routing is true, the service selects the best provider for each request based on task characteristics and provider availability. When fallback_enabled is true, if the primary provider returns an error, the service retries with other enabled providers.

sampling:
  enable_smart_routing: true
  fallback_enabled: true

The retry logic uses exponential backoff. For example, the research_blog tool retries up to 3 times with delays of 1s, 2s, and 4s between attempts.
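That schedule reduces to a one-line doubling rule. A sketch of the delays described above (1s, 2s, 4s), not the service's actual retry code:

```rust
/// Delay before retry `attempt` (0-based): 1s, 2s, 4s, ...
fn backoff_delay_ms(attempt: u32) -> u64 {
    1000u64 << attempt
}
```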

MCP Tool Integration

The AI service auto-discovers MCP servers and registers their tools so language models can call them during conversations.

mcp:
  auto_discover: true
  connect_timeout_ms: 5000
  execution_timeout_ms: 30000
  retry_attempts: 3

When auto_discover is enabled, the AI service finds all configured MCP servers at startup and makes their tools available. During a conversation, the language model can invoke any registered tool. The service handles serialization, timeout enforcement, and retry logic.

Conversation History

The AI service logs conversations and tool executions for debugging and analytics.

history:
  retention_days: 30
  log_tool_executions: true

History data feeds the admin dashboard (timeline, popular skills, tool success rates) and is available through the analytics CLI commands.

Using AiService in Code

MCP tool handlers receive AiService as a dependency. There are two primary methods:

Standard generation with ai_service.generate():

let request = AiRequest::builder(
    messages,
    ai_service.default_provider(),
    ai_service.default_model(),
    4096,    // max output tokens
    ctx,
)
.build();

let response = ai_service.generate(&request).await?;
// response.content contains the generated text

Web search grounding with ai_service.generate_with_google_search():

let params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None,
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(params).await?;

Both methods respect the configured provider, model, and token limits from services/ai/config.yaml.

Environment Variables

Store API keys securely using environment variables. Never commit keys to configuration files.

# Set via CLI
systemprompt cloud secrets set ANTHROPIC_API_KEY "sk-ant-..."
systemprompt cloud secrets set OPENAI_API_KEY "sk-..."
systemprompt cloud secrets set GEMINI_API_KEY "AIza..."

# List configured secrets
systemprompt cloud secrets list

Use the ${VAR_NAME} syntax in YAML to reference environment variables at runtime.
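How the loader resolves ${VAR_NAME} is internal to systemprompt.io, but the behavior can be sketched with the standard library alone (a hypothetical helper, not the real loader):

```rust
/// Resolve a `${VAR_NAME}` placeholder against the process environment.
/// Values without the placeholder syntax are returned unchanged.
fn expand_env(value: &str) -> String {
    match value.strip_prefix("${").and_then(|v| v.strip_suffix('}')) {
        Some(name) => std::env::var(name).unwrap_or_default(),
        None => value.to_string(),
    }
}
```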

Configuration Reference

  • default_provider (string) -- Primary provider: anthropic, openai, or gemini
  • default_max_output_tokens (number, default 8192) -- Global maximum tokens for responses
  • sampling.enable_smart_routing (boolean, default false) -- Enable intelligent provider selection per request
  • sampling.fallback_enabled (boolean, default true) -- Try alternative providers on failure
  • providers.<name>.enabled (boolean) -- Whether this provider is active
  • providers.<name>.api_key (string) -- API key (use ${VAR} syntax)
  • providers.<name>.default_model (string) -- Default model for this provider
  • providers.<name>.endpoint (string) -- Custom API endpoint (Gemini only)
  • providers.<name>.google_search_enabled (boolean, default false) -- Enable web search grounding
  • mcp.auto_discover (boolean, default true) -- Auto-discover MCP servers and register tools
  • mcp.connect_timeout_ms (number, default 5000) -- MCP server connection timeout
  • mcp.execution_timeout_ms (number, default 30000) -- MCP tool execution timeout
  • mcp.retry_attempts (number, default 3) -- Retries for failed MCP tool calls
  • history.retention_days (number, default 30) -- Days to retain conversation history
  • history.log_tool_executions (boolean, default true) -- Log tool calls to history

CLI Reference

Provider Management

  • systemprompt admin config provider list -- View all providers with status
  • systemprompt admin config provider set <PROVIDER> -- Set the default provider
  • systemprompt admin config provider enable <PROVIDER> -- Enable a provider
  • systemprompt admin config provider disable <PROVIDER> -- Disable a provider

Secrets Management

  • systemprompt cloud secrets set ANTHROPIC_API_KEY <key> -- Set Anthropic API key
  • systemprompt cloud secrets set OPENAI_API_KEY <key> -- Set OpenAI API key
  • systemprompt cloud secrets set GEMINI_API_KEY <key> -- Set Gemini API key
  • systemprompt cloud secrets list -- List configured secrets

Other AI Commands

  • systemprompt admin config show -- Show current configuration including AI settings
  • systemprompt plugins mcp list -- List MCP servers integrated with AI
  • systemprompt analytics costs -- View AI usage costs

See systemprompt admin config provider --help for detailed options.

Service Relationships

The AI service connects to several other services in systemprompt.io:

  • Agents -- Provides LLM capabilities for agent reasoning and generation.
  • MCP Servers -- Auto-discovers tools that language models can call during conversations.
  • Skills -- Skills load system prompts that shape AI behavior for specific tasks.
  • Config -- AI configuration is included through the service aggregation pattern in services/config/config.yaml.
  • Analytics -- Conversation history and tool execution logs feed the analytics dashboard.
  • Scheduler -- Scheduled jobs can trigger AI-powered workflows.

Troubleshooting

Provider authentication failed -- Verify the API key is set correctly. Run systemprompt cloud secrets list to confirm the variable exists, then check that the provider's api_key field references it with ${VAR_NAME} syntax.

Tool execution timeout -- The MCP tool took longer than execution_timeout_ms. Increase the timeout in the mcp section or optimize the tool handler. Check tool-specific logs with systemprompt plugins mcp logs <server-name>.

No providers available -- At least one provider must be enabled with valid credentials. Run systemprompt admin config provider list to see which providers are configured and their status.

Empty response from web search -- Some queries trigger safety or recitation filters. The service retries automatically (up to retry_attempts times with exponential backoff). If the issue persists, try rephrasing the query or switching providers.

Image generation failed -- Confirm the provider supports image generation (Gemini or OpenAI). Check that ImageProviderFactory initialized successfully in the MCP server logs. Run systemprompt infra logs view --level error --since 1h to find specific errors.