Disclosure: I built systemprompt.io. Secret detection is one of its governance features. I will explain the problem, the detection patterns, and the architectural solution. Where systemprompt.io has gaps, I will say so.
How Secrets Actually Leak
Most discussions about AI security focus on prompt injection, hallucination, and model safety. These are real problems. They are also well-studied problems with growing mitigation strategies. The problem that gets almost no attention is credential leakage through tool calls.
Here is how it happens in practice.
Scenario 1: The .env in the Context Window
A developer is using an AI coding assistant to debug a database connection issue. The agent reads the project's .env file to understand the configuration. The .env file contains DATABASE_URL=postgres://admin:s3cr3t_p@ssw0rd@db.internal:5432/production. The credential is now in the AI provider's context window.
If the agent subsequently calls a tool — any tool — that includes a database connection as a parameter, the credential may be sent as part of the tool call. It is now in the tool's logs, the provider's request logs, and potentially in transit to an external API.
The developer did not do anything wrong. They asked the agent to help with a database issue. The agent read the relevant configuration file. The credential was in the file. This is the normal, expected workflow, and it leaks secrets.
Scenario 2: The Tool Call Parameter
An AI agent is configured to query a third-party API. The agent's system prompt includes the instruction: "Use the API key stored in the environment variable STRIPE_API_KEY." The agent constructs a tool call to the HTTP client tool with the API key as a parameter in the request header.
The tool call parameters, including the API key, are visible in three places: the AI provider's context window (because the model generated the tool call), the governance layer's audit log (because the tool call was logged), and the tool's own execution logs.
Even if the tool call succeeds and the API key is never displayed to the user, it has been transmitted through multiple systems and logged in multiple locations. Any of those locations could be compromised, audited, or accessed by someone who should not have the key.
Scenario 3: The MCP Request
An agent using MCP (Model Context Protocol) to interact with external services constructs a tool call that includes a service account credential in the parameters. The credential was in the agent's context because a previous tool call returned it as part of a configuration response.
This is the chain-of-contamination problem. A credential enters the context window through any vector — reading a file, receiving a tool response, parsing a user message — and then propagates through subsequent tool calls. The agent does not know the string is a credential. It is just text.
Why This Is Different from Traditional Secret Leaks
Traditional secret leaks happen when credentials are committed to version control, logged in application output, or exposed through misconfigured services. These are well-understood risks with mature tooling. GitHub's secret scanning, GitGuardian, TruffleHog, and similar tools catch secrets in code repositories.
AI agent credential leaks are different in three ways.
First, the leak vector is the AI model itself. The model generates tool calls that include credentials as parameters. The model does not understand that a string is a credential. It processes text. A database URL, an API key, and a regular string are all the same to the model.
Second, the leak is transient. Unlike a committed secret that persists in version control history, a leaked credential in a tool call exists in the context window for the duration of the session and in logs for the duration of the retention period. It may never appear in a git repository, which means traditional secret scanning tools will never find it.
Third, the leak crosses trust boundaries. When an agent sends a tool call to an MCP server, the parameters cross from the agent's trust boundary to the server's trust boundary. If the MCP server is operated by a third party, the credential has left your infrastructure.
OWASP ASI03: Identity and Privilege Abuse
The OWASP Top 10 for Agentic Applications 2026 categorises this risk under ASI03: Identity and Privilege Abuse. The specific concern is that leaked credentials allow agents — or attackers who gain access to agent sessions — to operate far beyond the intended scope.
Consider the chain of events. An agent leaks a database credential through a tool call. An attacker who has access to the tool's logs (or the AI provider's logs, or the governance layer's logs) now has a production database credential. The original agent had limited permissions — it could query one table. The leaked credential has admin access because the .env file contained the admin connection string, not a restricted one.
This is privilege escalation through credential leakage. The agent's configured scope was narrow. The leaked credential's scope was broad. The attacker now operates with the credential's scope, not the agent's scope.
OWASP's mitigation guidance for ASI03 includes: avoid embedding credentials in agent contexts, use short-lived tokens, implement credential rotation, and scan for credential patterns in agent communications. These are the right recommendations. The question is how to implement them in practice.
Detection Patterns
Secret detection works by scanning text for patterns that match known credential formats. This is fundamentally a pattern matching problem with two competing requirements: catch as many real credentials as possible (recall) and generate as few false positives as possible (precision).
Here are the categories of secrets to detect, with example patterns for each.
Cloud Provider Keys
AWS Access Keys follow a distinctive pattern: 20 uppercase alphanumeric characters starting with AKIA. The corresponding secret access key is 40 characters drawn from upper- and lower-case letters, digits, /, and +.
AKIA[0-9A-Z]{16}
AWS secret access keys are harder to detect because their format is a generic base64-like string. Detection typically relies on proximity — a 40-character high-entropy string near an AKIA pattern is likely a secret access key.
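The proximity heuristic can be sketched in a few lines of Python. The 200-character window and the candidate-secret regex below are illustrative choices, not a canonical implementation; real scanners tune both against a labelled corpus.

```python
import re

# AWS access key IDs have a fixed, recognisable prefix and length.
AKIA_RE = re.compile(r"AKIA[0-9A-Z]{16}")
# Candidate secret access keys: 40 chars from the base64-like alphabet.
SECRET_RE = re.compile(r"[A-Za-z0-9/+=]{40}")

def find_aws_credentials(text: str, window: int = 200) -> list[str]:
    """Return AKIA key IDs, plus any 40-char candidate secret found
    within `window` characters of an AKIA match (proximity heuristic)."""
    findings = []
    for m in AKIA_RE.finditer(text):
        findings.append(m.group())
        nearby = text[max(0, m.start() - window): m.end() + window]
        for s in SECRET_RE.finditer(nearby):
            if s.group() != m.group():
                findings.append(s.group())
    return findings
```

Running this over a typical credentials file finds both halves of the key pair, while a 40-character string with no AKIA pattern nearby is ignored, which is what keeps the false positive rate manageable.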
Google Cloud service account keys are JSON objects containing a private_key field with an RSA private key. Detecting the "type": "service_account" pattern in tool call parameters catches these reliably.
Azure subscription keys are 32-character hexadecimal strings. Their format is generic enough that detection relies on context — the string appears near Azure-related URLs or parameter names like Ocp-Apim-Subscription-Key.
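Context-based detection for generic formats like Azure keys can be sketched the same way as the proximity heuristic: the 32-hex string alone is too generic to flag, so only flag it when an Azure-specific marker appears nearby. The marker list and window size below are illustrative assumptions.

```python
import re

# A bare 32-character hex string matches checksums, UUIDs-without-dashes,
# and hashes. Only flag it when Azure-specific context appears nearby.
HEX32_RE = re.compile(r"\b[0-9a-f]{32}\b")
AZURE_MARKERS = ("Ocp-Apim-Subscription-Key", "azure-api.net")

def find_azure_keys(text: str, window: int = 100) -> list[str]:
    """Return 32-hex strings that sit within `window` chars of an Azure marker."""
    hits = []
    for m in HEX32_RE.finditer(text):
        nearby = text[max(0, m.start() - window): m.end() + window]
        if any(marker in nearby for marker in AZURE_MARKERS):
            hits.append(m.group())
    return hits
```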
Version Control Tokens
GitHub tokens follow predictable prefixes. Personal access tokens start with ghp_, OAuth tokens with gho_, app tokens with ghs_, and fine-grained tokens with github_pat_. The prefix makes detection highly reliable.
(ghp|gho|ghs|github_pat)_[A-Za-z0-9_]{36,255}
GitLab tokens use the glpat- prefix for personal access tokens. Bitbucket app passwords are 20-character alphanumeric strings without a distinctive prefix, making them harder to detect without context.
Database Credentials
Connection strings follow well-known formats:
(postgres|mysql|mongodb|redis)://[^:]+:[^@]+@[^/]+
This pattern catches postgres://user:password@host/db and similar formats across common database engines. The user:password pair sits between the :// and the @; the password is the portion after the colon that separates it from the username.
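Because the pattern captures the password's position, a scanner can redact rather than merely flag. A minimal sketch, using a slight variant of the pattern above with capture groups added:

```python
import re

# Connection-string pattern with capture groups: scheme://user:password@host
CONN_RE = re.compile(r"(postgres|mysql|mongodb|redis)://([^:/\s]+):([^@\s]+)@[^\s/]+")

def redact_connection_strings(text: str) -> str:
    """Replace the password portion of any matched connection string
    so logs never carry the raw credential. (If the password string also
    appears elsewhere in the match, it is redacted there too.)"""
    return CONN_RE.sub(lambda m: m.group().replace(m.group(3), "[REDACTED]"), text)
```

Redaction is what you want in the logging path; blocking is what you want in the pre-execution path discussed later.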
API Keys
Stripe keys start with sk_live_ or sk_test_ for secret keys and pk_live_ or pk_test_ for publishable keys. The sk_live_ prefix indicates a production secret key and should always trigger a high-severity alert.
Twilio account SIDs start with AC followed by 32 hexadecimal characters. Auth tokens are 32-character hexadecimal strings.
SendGrid API keys start with SG. followed by two base64url-encoded segments separated by a dot.
Authentication Tokens
JWT tokens are three base64url-encoded segments separated by dots. The first segment decodes to a JSON object containing "alg" and optionally "typ": "JWT".
eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}
SSH private keys start with -----BEGIN and a key type identifier (RSA PRIVATE KEY, OPENSSH PRIVATE KEY, EC PRIVATE KEY). These are unmistakable and should always trigger immediate alerts.
Slack tokens use the xoxb-, xoxp-, and xoxs- prefixes. Bot tokens (xoxb-) and user tokens (xoxp-) should both trigger alerts.
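The prefix-based formats above lend themselves to a pattern library: a named set of compiled regexes applied in one pass. The set below is a miniature, illustrative subset; production scanners (GitGuardian, TruffleHog, and similar) maintain far larger curated libraries.

```python
import re

# Miniature pattern library covering the prefix-based formats above.
PATTERNS = {
    "github_token": re.compile(r"\b(ghp|gho|ghs)_[A-Za-z0-9]{36,255}"),
    "stripe_secret": re.compile(r"\bsk_(live|test)_[A-Za-z0-9]{16,}"),
    "slack_token": re.compile(r"\bxox[bps]-[A-Za-z0-9-]+"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}"),
    "ssh_private_key": re.compile(r"-----BEGIN (RSA|OPENSSH|EC) PRIVATE KEY-----"),
}

def scan(text: str) -> list[str]:
    """Return the names of every pattern that matches the text."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]
```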
Generic High-Entropy Strings
Not all credentials follow recognisable patterns. Some are just random strings assigned by a service. For these, entropy-based detection is the fallback.
A high-entropy string is one with a high degree of randomness relative to its length. The word "password" has low entropy. The string a8f2k9d3m7n1p4q6 has high entropy. Strings above a certain entropy threshold that appear in parameter positions typically occupied by credentials (headers, connection strings, authentication fields) are flagged for review.
Entropy-based detection has a higher false positive rate than pattern-based detection. It should be used as a supplement, not a replacement. Flag high-entropy strings for review rather than blocking them outright, unless they appear in explicitly credential-typed parameters.
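Entropy here means Shannon entropy estimated from character frequencies. A sketch, with the caveat that the 3.5 bits-per-character threshold and 16-character minimum length are starting-point assumptions to tune against your own corpus, not universal constants:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, estimated from character frequencies."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_random(s: str, threshold: float = 3.5) -> bool:
    """Flag long strings whose per-character entropy exceeds the threshold.
    Both cutoffs are illustrative; tune them against real traffic."""
    return len(s) >= 16 and shannon_entropy(s) > threshold
```

On the examples above: "password" scores 2.75 bits per character and is too short to flag, while a8f2k9d3m7n1p4q6 (16 distinct characters) scores a full 4.0 and is flagged.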
The Architecture That Solves It
Detection is necessary but not sufficient. Catching a secret in a tool call parameter after the agent has already constructed the tool call means the secret was in the agent's context window. Blocking the tool call prevents the secret from reaching the tool, but it does not remove the secret from the context.
The architectural solution is to prevent credentials from entering the agent's context in the first place. This is server-side credential injection.
How Server-Side Credential Injection Works
The principle is simple: the AI agent never holds real credentials. Instead, it holds resolution tokens — short-lived references that the MCP server resolves to actual credentials at execution time.
Here is the flow.
Step 1: Agent requests a tool call. The agent wants to query a database. Instead of constructing a connection string with real credentials, it references a resolution token. The tool call parameters contain something like {"connection": "resolve:db_production", "query": "SELECT ..."} rather than {"connection": "postgres://admin:password@host/db", "query": "SELECT ..."}.
Step 2: Governance pipeline scans parameters. The governance layer inspects the tool call parameters for secrets before forwarding to the MCP server. The resolution token resolve:db_production is not a credential. It is a reference. No secrets detected.
Step 3: MCP server resolves the token. The MCP server receives the tool call, recognises the resolution token, and resolves it to the actual credential. The credential is retrieved from encrypted storage — ChaCha20-Poly1305 encrypted, per-user keys — decrypted in memory, and used to execute the operation.
Step 4: MCP server executes the operation. The database query runs with real credentials. The MCP server holds the credential in memory only for the duration of the operation. It is not logged. It is not returned to the agent.
Step 5: Result returned to agent. The agent receives the query results. It never received the credential. It never had the credential in its context. No credential leak is possible because the credential was never in the agent's trust boundary.
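The five-step flow can be sketched as a toy token service. Everything here is a hypothetical illustration of the pattern, not systemprompt.io's API: the in-memory VAULT dict stands in for encrypted storage, and the resolve: prefix and five-minute TTL follow the flow described above.

```python
import secrets
import time

# Stand-in for encrypted credential storage (ChaCha20-Poly1305 at rest
# in a real system). Keyed by credential id, never exposed to the agent.
VAULT = {"db_production": "postgres://admin:real-password@db.internal/prod"}
TOKENS: dict[str, tuple[str, float]] = {}

def issue_token(credential_id: str, ttl: float = 300.0) -> str:
    """Issue a short-lived resolution token the agent can safely hold."""
    token = f"resolve:{secrets.token_urlsafe(16)}"
    TOKENS[token] = (credential_id, time.monotonic() + ttl)
    return token

def resolve(token: str) -> str:
    """Server-side only: swap a valid, unexpired token for the credential.
    Popping the entry makes the token single-use."""
    credential_id, expires = TOKENS.pop(token)
    if time.monotonic() > expires:
        raise PermissionError("resolution token expired")
    return VAULT[credential_id]
```

The agent only ever sees the resolve: string; the credential surfaces solely inside resolve(), which runs on the MCP server.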
Why Resolution Tokens Work
Resolution tokens work because they separate two concerns that are conflated in traditional agent architectures: knowing what resource to access and having the credentials to access it.
The agent needs to know "I want to query the production database." It does not need to know the database password. The resolution token carries the intent ("access this resource") without the credential ("using this password").
Resolution tokens are also short-lived. A token issued for a specific tool call expires after five minutes. Even if the token leaks (through logs, context window exposure, or tool call forwarding), it is useless after expiration. The actual credential remains in encrypted storage, never exposed.
The Encryption Layer
Credentials in storage are encrypted with ChaCha20-Poly1305, which is an authenticated encryption algorithm that provides both confidentiality and integrity. Each user's credentials are encrypted with a per-user key, which means a compromise of one user's key does not expose other users' credentials.
The encryption happens at rest and during resolution. The credential exists in plaintext only in the MCP server's memory, only for the duration of the tool execution, and is zeroed after use. It is never written to disk in plaintext, never logged, and never returned in API responses.
ChaCha20-Poly1305 was chosen over AES-GCM for two reasons. First, it does not require hardware AES support, which matters for deployment on diverse infrastructure including ARM-based servers and air-gapped environments with older hardware. Second, its nonce management is simpler, reducing the risk of implementation errors that lead to nonce reuse (which catastrophically breaks AES-GCM security).
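For readers who want to see the AEAD primitive in use, here is a minimal sketch using the widely available Python cryptography package. The nonce-prepended wire format is an illustrative convention, not systemprompt.io's storage layout.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

def encrypt_credential(user_key: bytes, credential: str) -> bytes:
    """Encrypt with ChaCha20-Poly1305 (AEAD: confidentiality + integrity).
    A fresh random 96-bit nonce is generated per message and prepended."""
    nonce = os.urandom(12)
    return nonce + ChaCha20Poly1305(user_key).encrypt(nonce, credential.encode(), None)

def decrypt_credential(user_key: bytes, box: bytes) -> str:
    """Split off the nonce and decrypt; raises InvalidTag on tampering."""
    nonce, ciphertext = box[:12], box[12:]
    return ChaCha20Poly1305(user_key).decrypt(nonce, ciphertext, None).decode()
```

Because the nonce is random per message and the key is per-user, one leaked ciphertext reveals nothing about other users' credentials, which is the property the per-user key design is buying.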
Pre-Execution vs Post-Execution Detection
There is a critical timing distinction in secret detection that determines whether you prevent leaks or merely detect them.
Pre-Execution Detection
Pre-execution detection scans tool call parameters before the tool receives the request. If a secret is detected, the tool call is blocked. The secret never reaches the tool, never appears in the tool's logs, and never crosses the trust boundary.
This is the only approach that actually prevents credential leakage. Everything else is damage assessment.
Pre-execution detection operates in the governance pipeline between the agent's tool call request and the tool's execution. When an agent generates a tool call, the governance pipeline:
- Receives the tool call request
- Evaluates governance policies (permissions, scope, rate limits)
- Scans parameters for secret patterns
- If a secret is detected: blocks the call, logs the detection event, returns a structured error to the agent
- If no secret is detected: forwards the call to the tool for execution
The agent receives a clear error: "Tool call blocked: potential credential detected in parameter 'connection'. Use a resolution token instead of raw credentials." This gives the agent enough information to retry with the correct approach.
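The pipeline steps above reduce to a guard function that sits between the agent and the tool. A minimal sketch, with a deliberately tiny pattern set and a hypothetical return shape; a real governance layer would also run policy evaluation and emit an audit event on block:

```python
import re

# Deliberately tiny pattern set for illustration.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"(postgres|mysql|mongodb|redis)://[^:@\s]+:[^@\s]+@"),
]

def guard_tool_call(tool: str, params: dict) -> dict:
    """Scan every string parameter; block the call if a secret matches,
    otherwise mark it for forwarding to the tool."""
    for name, value in params.items():
        if isinstance(value, str) and any(p.search(value) for p in SECRET_PATTERNS):
            return {
                "status": "blocked",
                "error": f"Tool call blocked: potential credential detected in "
                         f"parameter '{name}'. Use a resolution token instead.",
            }
    return {"status": "forwarded", "tool": tool}
```

Note that a resolution token like resolve:db_production passes the guard cleanly, which is how pre-execution scanning and server-side injection reinforce each other.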
Post-Execution Detection
Post-execution detection scans logs and event streams after tool calls have completed. It detects secrets that were sent to tools, logged in audit trails, and potentially transmitted to external services.
Post-execution detection is valuable for several reasons. It catches secrets that pre-execution detection missed (no pattern set is perfect). It identifies credential exposure in historical logs. It provides forensic evidence for incident response. But it does not prevent the leak. By the time post-execution detection finds a credential in a tool call log, the credential has already been sent to the tool.
Many governance solutions only provide post-execution detection. They monitor logs, flag anomalies, and alert security teams. This is observability, not prevention. The distinction matters.
The Practical Difference
Pre-execution: "We blocked a tool call that contained an AWS access key before it reached the HTTP client tool. The key never left the governance boundary. No rotation required."
Post-execution: "We detected an AWS access key in yesterday's tool call logs. The key was sent to an external API 14 hours ago. Rotate the key immediately and investigate what the API did with it."
Both are useful. Only one prevents the incident.
What Most Governance Tools Miss
The AI governance market in 2026 is growing rapidly. Most solutions focus on model-level safety: preventing harmful outputs, enforcing usage policies, managing access controls for AI models. These are important capabilities. They are also insufficient for the credential leakage problem.
Claude Enterprise
Claude Enterprise provides strong access controls for who can use Claude and what they can do. Usage policies govern model behaviour. But there is no secret detection layer for tool call parameters. If an agent using Claude Enterprise constructs a tool call that includes a credential, Claude Enterprise does not scan for it. The credential passes through to the tool.
This is not a criticism of Claude Enterprise's design. It is a model governance platform, not a tool governance platform. The gap exists because the two categories solve different problems.
Microsoft Agent Governance Toolkit
Microsoft's AGT provides a comprehensive set of governance components, including a policy engine that can enforce rules on tool calls. Secret detection is possible through AGT's extensibility — you can write custom policy rules that match credential patterns. But secret detection is not a built-in, maintained feature with a curated pattern library. You are building it yourself.
AGT's focus is policy enforcement, identity management, and sandboxing. These are the right priorities for a governance toolkit. Secret detection is a specialised concern that benefits from a maintained pattern library that evolves as new credential formats emerge.
MCP Gateways
MCP gateways (MintMCP, TrueFoundry MCP Gateway, Lunar.dev) sit between agents and MCP servers, providing authentication and audit. Some offer basic secret scanning at the protocol layer. The scanning is typically limited to a small set of well-known patterns (AWS keys, GitHub tokens) and does not include server-side credential injection as an alternative architecture.
MCP gateways are a lightweight, fast-to-deploy option for adding a basic security layer. For organisations that need comprehensive secret detection with server-side credential injection, a gateway is a partial solution.
The Common Gap
The common gap across most governance tools is that they treat secret detection as a logging problem rather than a prevention problem. They will tell you that a credential appeared in a tool call. They will not block the tool call before it executes. And none of them address the root cause: credentials should not be in the agent's context in the first place.
Prevention requires two capabilities working together. First, pre-execution scanning that blocks tool calls containing credentials before they reach the tool. Second, an architectural alternative (server-side credential injection) that eliminates the need for credentials in the agent's context. Without both, you are playing defence against a problem that has an architectural solution.
How systemprompt.io Addresses This
I will be specific about what is implemented today.
35+ detection patterns. systemprompt.io's secret scanner maintains a curated library of over 35 credential patterns covering AWS access keys, AWS secret keys, Google Cloud service account keys, Azure subscription keys, GitHub tokens (all prefixes), GitLab tokens, Bitbucket passwords, database connection strings (PostgreSQL, MySQL, MongoDB, Redis), Stripe keys, Twilio credentials, SendGrid keys, Slack tokens, JWT tokens, SSH private keys, PGP private keys, generic API key patterns, and high-entropy string detection for unknown credential formats.
The pattern library is updated with each release. When new credential formats emerge (new cloud providers, new SaaS APIs, new token prefix schemes), patterns are added to the library. This is the maintenance burden that comes with curated detection — and the reason it works better than roll-your-own regex.
Pre-execution scanning. Secret detection runs in the governance pipeline before tool calls are forwarded to MCP servers. A tool call containing a detected credential is blocked. The agent receives a structured error explaining what was detected and why the call was blocked. The detection event is logged with the secret type, the tool that would have received it, and the user whose session contained the secret.
Server-side credential injection. MCP servers managed by systemprompt.io support resolution tokens. Credentials are stored encrypted with ChaCha20-Poly1305 using per-user keys. Resolution tokens are issued per-request with a five-minute expiry. The MCP server resolves tokens to credentials at execution time. The agent never holds real credentials.
Governance pipeline integration. Secret detection is not a standalone feature. It is integrated into the same governance pipeline that handles policy evaluation, RBAC, cost tracking, and audit logging. A secret detection event generates a structured audit event that flows to your SIEM through the same paths described in our SIEM integration guide. Alert on it, correlate it, and investigate it with the same tools your security team uses for everything else.
What is not yet implemented. I will be honest. systemprompt.io does not currently perform semantic analysis of tool call parameters to detect credentials that do not match any known pattern and do not trigger entropy-based detection. If a credential is a normal-looking word (a dictionary word used as a password, for example), pattern-based and entropy-based detection will miss it. This is a known limitation shared by every pattern-based scanner. Semantic detection is on the roadmap but not shipped.
For the full list of detection patterns, encryption specifications, and integration documentation, see the secrets management feature page.
Building Your Defence
If you are starting from zero on AI agent secret detection, here is the practical implementation sequence.
Step 1: Audit your current exposure. Before building detection, understand what you are dealing with. Review your AI agent configurations. Which agents have access to files containing credentials? Which agents construct tool calls with authentication parameters? Which MCP servers receive credentials from agents? This audit tells you where to focus.
Step 2: Implement pre-execution scanning. Add credential pattern matching to your governance pipeline. Start with the highest-risk patterns: AWS keys, database connection strings, and your organisation's internal API keys. Expand the pattern library over time. Pre-execution scanning is the single most impactful mitigation.
Step 3: Adopt server-side credential injection. Migrate your MCP servers from direct credential handling to resolution tokens. This is the architectural change that eliminates the root cause. It requires changes to how MCP servers handle authentication, but it permanently removes credentials from the agent's context.
Step 4: Monitor and iterate. Track detection events. Review false positives and adjust patterns. Review false negatives by periodically auditing tool call logs for credentials that your scanner missed. No pattern library is perfect. Continuous improvement is part of the operational model.
The organisations that take AI agent security seriously are the ones that treat credentials in AI agent contexts with the same urgency as credentials committed to public GitHub repositories. The blast radius is different — a leaked agent credential may not be immediately public — but the risk is real, the attack surface is growing, and the mitigation is available today.
Do not wait for the incident to build the defence.