AI Services
Configure and manage AI providers in systemprompt.io. Supports Anthropic, OpenAI, and Google Gemini with streaming, web search grounding, image generation, smart routing, and MCP tool integration.
TL;DR: The AI service is the unified interface between your agents and language models. Configure one or more providers (Anthropic, OpenAI, Google Gemini), set a default, and systemprompt.io handles provider selection, fallback, MCP tool integration, web search grounding, and image generation. All provider details are abstracted behind AiService so agents and MCP tools work identically regardless of which model is active.
Why It Matters
Agents need access to language models for reasoning, generation, and tool use. Different providers have different strengths: Anthropic Claude excels at complex reasoning and safety, OpenAI GPT models offer broad compatibility, and Google Gemini provides native Google Search grounding and multimodal image generation.
Rather than coupling agent code to a single provider, the AI service gives you a single configuration surface. You enable providers, set a default, and every agent and MCP server in your deployment uses the same AiService instance. Switching providers is a one-line YAML change with no code modifications.
Architecture Overview
The AI service sits between agents (or MCP tool handlers) and the external provider APIs. The flow is:
```
Agent / MCP Tool
       |
       v
   AiService   <-- unified interface
       |
       +---> Anthropic API (Claude models)
       +---> OpenAI API    (GPT models)
       +---> Gemini API    (Gemini models)
       |
       v
 History / Logging
```
When an agent or MCP tool calls `ai_service.generate()`, the service:
- Selects the target provider based on the default (or per-request override).
- Formats the request for that provider's API.
- Sends the request and streams or collects the response.
- Logs the interaction to conversation history.
- Returns a unified `AiResponse` with the generated content.
MCP servers initialize `AiService` from the shared configuration at startup:

```rust
let ai_service = Arc::new(
    AiService::new(db_pool.clone(), &services_config.ai, tool_provider, None)?,
);
```
This means every MCP server and agent in your deployment shares the same provider configuration.
Configuration
All AI settings live in `services/ai/config.yaml`, which is included by the top-level `services/config/config.yaml` aggregation file.
```yaml
# services/ai/config.yaml
ai:
  default_provider: gemini
  default_max_output_tokens: 8192

  sampling:
    enable_smart_routing: false
    fallback_enabled: true

  providers:
    anthropic:
      enabled: true
      api_key: ${ANTHROPIC_API_KEY}
      default_model: claude-sonnet-4-20250514
      google_search_enabled: true
    openai:
      enabled: true
      api_key: ${OPENAI_API_KEY}
      default_model: gpt-4-turbo
      google_search_enabled: true
    gemini:
      enabled: true
      api_key: ${GEMINI_API_KEY}
      endpoint: https://generativelanguage.googleapis.com/v1beta
      default_model: gemini-2.5-flash
      google_search_enabled: true

  mcp:
    auto_discover: true
    connect_timeout_ms: 5000
    execution_timeout_ms: 30000
    retry_attempts: 3

  history:
    retention_days: 30
    log_tool_executions: true
```
Key fields:
- `default_provider` -- The provider used when no override is specified. Valid values: `anthropic`, `openai`, `gemini`.
- `default_max_output_tokens` -- Global cap on response length. Individual requests can override this (e.g., `4096` for cross-provider compatibility).
- `sampling.fallback_enabled` -- When `true`, if the primary provider fails, the service tries other enabled providers.
- `google_search_enabled` -- Enables web search grounding for providers that support it.
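A config loader typically fails fast when `default_provider` names a provider that is missing or disabled. The check can be sketched as follows; `validate_default` is a hypothetical helper for illustration, not part of the systemprompt.io codebase:

```rust
use std::collections::HashMap;

/// Illustrative startup check: the default provider must exist in the
/// providers map and be enabled, otherwise configuration loading should fail.
fn validate_default(default: &str, providers: &HashMap<&str, bool>) -> Result<(), String> {
    match providers.get(default).copied() {
        Some(true) => Ok(()),
        Some(false) => Err(format!("default provider '{default}' is disabled")),
        None => Err(format!("unknown default provider '{default}'")),
    }
}

fn main() {
    // Mirrors the sample config: gemini is the default and is enabled.
    let providers = HashMap::from([("anthropic", true), ("openai", true), ("gemini", true)]);
    assert!(validate_default("gemini", &providers).is_ok());
    assert!(validate_default("mistral", &providers).is_err());
}
```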
Supported Providers
Anthropic
Claude models from Anthropic, known for strong reasoning and safety.
```yaml
anthropic:
  enabled: true
  api_key: ${ANTHROPIC_API_KEY}
  default_model: claude-sonnet-4-20250514
```
Available models:
| Model | Characteristics |
|---|---|
| `claude-opus-4-20250514` | Most capable, best for complex multi-step tasks |
| `claude-sonnet-4-20250514` | Balanced performance and cost |
| `claude-haiku-3-20240307` | Fast and economical for simple tasks |
OpenAI
GPT models from OpenAI with broad ecosystem compatibility.
```yaml
openai:
  enabled: true
  api_key: ${OPENAI_API_KEY}
  default_model: gpt-4-turbo
```
Available models:
| Model | Characteristics |
|---|---|
| `gpt-4-turbo` | Latest GPT-4 with large context window |
| `gpt-4o` | Optimized for speed |
| `gpt-3.5-turbo` | Fast and economical |
Google Gemini
Google's Gemini models with native Google Search grounding and image generation.
```yaml
gemini:
  enabled: true
  api_key: ${GEMINI_API_KEY}
  endpoint: https://generativelanguage.googleapis.com/v1beta
  default_model: gemini-2.5-flash
```
Available models:
| Model | Characteristics |
|---|---|
| `gemini-2.5-flash` | Fast multimodal processing |
| `gemini-2.5-pro` | Advanced reasoning |
| `gemini-2.5-flash-image` | Image generation |
| `gemini-3-pro-image-preview` | Higher quality image generation |
Features
Streaming
Agents declare streaming support in their card configuration:

```yaml
capabilities:
  streaming: true
```
When streaming is enabled, responses are delivered incrementally via Server-Sent Events (SSE) rather than waiting for the full response. The admin dashboard uses SSE for real-time updates, and agents can stream partial results to clients as they are generated.
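On the wire, an SSE stream is a sequence of `data:` lines separated by blank lines. As an illustration of the framing only (the generic SSE format, not a documented systemprompt.io event payload), a minimal data-line parser looks like this:

```rust
/// Extract the payload from one Server-Sent Events line.
/// Returns None for comments (": ..."), blank separators, and other fields.
fn sse_data(line: &str) -> Option<&str> {
    // Per the SSE spec, a data line is "data:" optionally followed by one space.
    let rest = line.strip_prefix("data:")?;
    Some(rest.strip_prefix(' ').unwrap_or(rest))
}

fn main() {
    let stream = "data: {\"delta\": \"Hel\"}\n\ndata: {\"delta\": \"lo\"}\n: keep-alive\n";
    // Collect only the data payloads, skipping comments and blank lines.
    let chunks: Vec<&str> = stream.lines().filter_map(sse_data).collect();
    assert_eq!(chunks, vec!["{\"delta\": \"Hel\"}", "{\"delta\": \"lo\"}"]);
}
```

A real client would additionally handle `event:` and `id:` fields and reassemble multi-line data, but the incremental-delivery idea is the same.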
Web Search Grounding
Some providers support web search for grounded responses with real-time information. Enable it per-provider with `google_search_enabled: true`.

| Provider | Implementation | Notes |
|---|---|---|
| Gemini | Google Search grounding | Native integration, returns sources with citations |
| OpenAI | `/responses` API with `web_search` tool | Uses web search tool type |
| Anthropic | `web_search_20250305` tool | API support available |
Web search is used through `AiService::generate_with_google_search()`, which accepts a `GoogleSearchParams` struct:

```rust
let search_params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None, // uses provider default
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(search_params).await?;
// response.content -- the generated text
// response.sources -- list of SourceCitation with title, uri, relevance
// response.web_search_queries -- queries the provider executed
```
Tools like `research_blog` in the content-manager MCP server use this to produce research artifacts backed by real search results.
Image Generation
The AI service supports image generation through `ImageService`, which shares the same provider credentials.
| Provider | Models | Resolutions | Aspect Ratios | Batch |
|---|---|---|---|---|
| Gemini | `gemini-2.5-flash-image`, `gemini-3-pro-image-preview` | 1K, 2K, 4K | Square, 16:9, 9:16, 4:3, 3:4, UltraWide | Yes |
| OpenAI | `dall-e-3`, `dall-e-2` | 1K | Square, 16:9, 9:16 | No |
The image service automatically selects the best resolution the provider supports (4K > 2K > 1K). Provider selection follows the same logic as text generation: use the default provider if it supports images, otherwise fall back to the first available image-capable provider.
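The "best resolution" rule can be sketched as a simple preference search; `best_resolution` below is a hypothetical helper illustrating the 4K > 2K > 1K ordering, not the actual `ImageService` API:

```rust
/// Pick the highest resolution the provider supports, preferring 4K > 2K > 1K,
/// mirroring the selection rule described above (illustrative only).
fn best_resolution(supported: &[&str]) -> Option<&'static str> {
    ["4K", "2K", "1K"]
        .into_iter()
        .find(|r| supported.contains(r))
}

fn main() {
    // Gemini offers all three tiers, so 4K wins.
    assert_eq!(best_resolution(&["1K", "2K", "4K"]), Some("4K"));
    // OpenAI only offers 1K.
    assert_eq!(best_resolution(&["1K"]), Some("1K"));
    // No image-capable provider at all.
    assert_eq!(best_resolution(&[]), None);
}
```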
```bash
# Generate an image through the content-manager MCP server
systemprompt plugins mcp call content-manager generate_featured_image -a '{
  "skill_id": "blog_image_generation",
  "topic": "AI Development",
  "title": "Building with AI",
  "summary": "A guide to AI development"
}' --timeout 120
```
Generated images are stored in `/files/images/generated/` and served through the configured URL prefix.
Smart Routing and Fallback
When `enable_smart_routing` is true, the service selects the best provider for each request based on task characteristics and provider availability. When `fallback_enabled` is true, if the primary provider returns an error, the service retries with other enabled providers.

```yaml
sampling:
  enable_smart_routing: true
  fallback_enabled: true
```
The retry logic uses exponential backoff. For example, the `research_blog` tool retries up to 3 times with delays of 1s, 2s, and 4s between attempts.
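The fallback ordering and the backoff schedule described above can be sketched like this; `Provider`, `try_order`, and `backoff_delay` are illustrative names, not the actual service types:

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Anthropic,
    OpenAi,
    Gemini,
}

/// The default provider is tried first; with fallback enabled, the remaining
/// enabled providers follow in configuration order.
fn try_order(default: Provider, enabled: &[Provider], fallback: bool) -> Vec<Provider> {
    let mut order = vec![default];
    if fallback {
        order.extend(enabled.iter().copied().filter(|p| *p != default));
    }
    order
}

/// Exponential backoff schedule: 1s, 2s, 4s for attempts 0, 1, 2.
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt)
}

fn main() {
    let enabled = [Provider::Anthropic, Provider::OpenAi, Provider::Gemini];
    // With fallback on, gemini (the default) is tried first, then the rest.
    assert_eq!(
        try_order(Provider::Gemini, &enabled, true),
        vec![Provider::Gemini, Provider::Anthropic, Provider::OpenAi]
    );
    // With fallback off, only the default is attempted.
    assert_eq!(try_order(Provider::Gemini, &enabled, false), vec![Provider::Gemini]);

    let delays: Vec<u64> = (0..3).map(|a| backoff_delay(a).as_secs()).collect();
    assert_eq!(delays, vec![1, 2, 4]);
}
```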
MCP Tool Integration
The AI service auto-discovers MCP servers and registers their tools so language models can call them during conversations.
```yaml
mcp:
  auto_discover: true
  connect_timeout_ms: 5000
  execution_timeout_ms: 30000
  retry_attempts: 3
```
When `auto_discover` is enabled, the AI service finds all configured MCP servers at startup and makes their tools available. During a conversation, the language model can invoke any registered tool. The service handles serialization, timeout enforcement, and retry logic.
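Timeout enforcement around a tool call can be sketched with a worker thread and a bounded wait. The real service is async, so `call_with_timeout` below only illustrates the `execution_timeout_ms` behavior under stated assumptions:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run a (simulated) tool call on a worker thread and give up if no result
/// arrives within the timeout window. Illustrative sketch only.
fn call_with_timeout<F>(tool: F, timeout: Duration) -> Result<String, &'static str>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the caller already timed out, the send fails harmlessly.
        let _ = tx.send(tool());
    });
    rx.recv_timeout(timeout).map_err(|_| "tool execution timed out")
}

fn main() {
    // A fast tool completes within the window.
    let ok = call_with_timeout(|| "result".to_string(), Duration::from_millis(200));
    assert_eq!(ok.as_deref(), Ok("result"));

    // A slow tool exceeds the window and is reported as a timeout.
    let slow = call_with_timeout(
        || {
            thread::sleep(Duration::from_millis(100));
            "late".to_string()
        },
        Duration::from_millis(10),
    );
    assert!(slow.is_err());
}
```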
Conversation History
The AI service logs conversations and tool executions for debugging and analytics.
```yaml
history:
  retention_days: 30
  log_tool_executions: true
```
History data feeds the admin dashboard (timeline, popular skills, tool success rates) and is available through the analytics CLI commands.
Using AiService in Code
MCP tool handlers receive `AiService` as a dependency. There are two primary methods.

Standard generation with `ai_service.generate()`:

```rust
let request = AiRequest::builder(
    messages,
    ai_service.default_provider(),
    ai_service.default_model(),
    4096, // max output tokens
    ctx,
)
.build();

let response = ai_service.generate(&request).await?;
// response.content contains the generated text
```
Web search grounding with `ai_service.generate_with_google_search()`:

```rust
let params = GoogleSearchParams {
    messages,
    sampling: None,
    max_output_tokens: 8192,
    model: None,
    urls: None,
    response_schema: None,
};

let response = ai_service.generate_with_google_search(params).await?;
```
Both methods respect the configured provider, model, and token limits from `services/ai/config.yaml`.
Environment Variables
Store API keys securely using environment variables. Never commit keys to configuration files.
```bash
# Set via CLI
systemprompt cloud secrets set ANTHROPIC_API_KEY "sk-ant-..."
systemprompt cloud secrets set OPENAI_API_KEY "sk-..."
systemprompt cloud secrets set GEMINI_API_KEY "AIza..."

# List configured secrets
systemprompt cloud secrets list
```
Use the `${VAR_NAME}` syntax in YAML to reference environment variables at runtime.
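The substitution step can be approximated by a small expander. `expand` below is a hypothetical helper illustrating the `${VAR_NAME}` pattern, not the actual config loader; it reads from an explicit map rather than the process environment to keep the example deterministic:

```rust
use std::collections::HashMap;

/// Expand a `${VAR_NAME}` placeholder against a map of variables,
/// approximating the YAML substitution described above (illustrative only).
/// Missing variables expand to an empty string here; a real loader
/// would more likely report an error.
fn expand(value: &str, vars: &HashMap<&str, &str>) -> String {
    if let Some(name) = value.strip_prefix("${").and_then(|v| v.strip_suffix('}')) {
        vars.get(name).copied().unwrap_or_default().to_string()
    } else {
        value.to_string()
    }
}

fn main() {
    let vars = HashMap::from([("GEMINI_API_KEY", "AIza-example")]);
    // A placeholder is replaced by the variable's value.
    assert_eq!(expand("${GEMINI_API_KEY}", &vars), "AIza-example");
    // Plain values pass through unchanged.
    assert_eq!(expand("plain-value", &vars), "plain-value");
    // An unknown variable expands to empty in this sketch.
    assert_eq!(expand("${MISSING}", &vars), "");
}
```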
Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
| `default_provider` | string | -- | Primary provider: `anthropic`, `openai`, or `gemini` |
| `default_max_output_tokens` | number | `8192` | Global maximum tokens for responses |
| `sampling.enable_smart_routing` | boolean | `false` | Enable intelligent provider selection per request |
| `sampling.fallback_enabled` | boolean | `true` | Try alternative providers on failure |
| `providers.<name>.enabled` | boolean | -- | Whether this provider is active |
| `providers.<name>.api_key` | string | -- | API key (use `${VAR}` syntax) |
| `providers.<name>.default_model` | string | -- | Default model for this provider |
| `providers.<name>.endpoint` | string | -- | Custom API endpoint (Gemini only) |
| `providers.<name>.google_search_enabled` | boolean | `false` | Enable web search grounding |
| `mcp.auto_discover` | boolean | `true` | Auto-discover MCP servers and register tools |
| `mcp.connect_timeout_ms` | number | `5000` | MCP server connection timeout |
| `mcp.execution_timeout_ms` | number | `30000` | MCP tool execution timeout |
| `mcp.retry_attempts` | number | `3` | Retries for failed MCP tool calls |
| `history.retention_days` | number | `30` | Days to retain conversation history |
| `history.log_tool_executions` | boolean | `true` | Log tool calls to history |
CLI Reference
Provider Management
| Command | Description |
|---|---|
| `systemprompt admin config provider list` | View all providers with status |
| `systemprompt admin config provider set <PROVIDER>` | Set the default provider |
| `systemprompt admin config provider enable <PROVIDER>` | Enable a provider |
| `systemprompt admin config provider disable <PROVIDER>` | Disable a provider |
Secrets Management
| Command | Description |
|---|---|
| `systemprompt cloud secrets set ANTHROPIC_API_KEY <key>` | Set Anthropic API key |
| `systemprompt cloud secrets set OPENAI_API_KEY <key>` | Set OpenAI API key |
| `systemprompt cloud secrets set GEMINI_API_KEY <key>` | Set Gemini API key |
| `systemprompt cloud secrets list` | List configured secrets |
Other AI Commands
| Command | Description |
|---|---|
| `systemprompt admin config show` | Show current configuration including AI settings |
| `systemprompt plugins mcp list` | List MCP servers integrated with AI |
| `systemprompt analytics costs` | View AI usage costs |
See `systemprompt admin config provider --help` for detailed options.
Service Relationships
The AI service connects to several other services in systemprompt.io:
- Agents -- Provides LLM capabilities for agent reasoning and generation.
- MCP Servers -- Auto-discovers tools that language models can call during conversations.
- Skills -- Skills load system prompts that shape AI behavior for specific tasks.
- Config -- AI configuration is included through the service aggregation pattern in `services/config/config.yaml`.
- Analytics -- Conversation history and tool execution logs feed the analytics dashboard.
- Scheduler -- Scheduled jobs can trigger AI-powered workflows.
Troubleshooting
**Provider authentication failed** -- Verify the API key is set correctly. Run `systemprompt cloud secrets list` to confirm the variable exists, then check that the provider's `api_key` field references it with `${VAR_NAME}` syntax.

**Tool execution timeout** -- The MCP tool took longer than `execution_timeout_ms`. Increase the timeout in the `mcp` section or optimize the tool handler. Check tool-specific logs with `systemprompt plugins mcp logs <server-name>`.

**No providers available** -- At least one provider must be enabled with valid credentials. Run `systemprompt admin config provider list` to see which providers are configured and their status.

**Empty response from web search** -- Some queries trigger safety or recitation filters. The service retries automatically (up to `retry_attempts` times with exponential backoff). If the issue persists, try rephrasing the query or switching providers.

**Image generation failed** -- Confirm the provider supports image generation (Gemini or OpenAI). Check that `ImageProviderFactory` initialized successfully in the MCP server logs. Run `systemprompt infra logs view --level error --since 1h` to find specific errors.