Ask for hidden context, prior conversation state, system messages, or internal data.
LLM02:2025
Sensitive Information Disclosure
Sensitive information disclosure occurs when an AI system exposes secrets, regulated data, customer records, proprietary material, or internal context through outputs, logs, tools, or retrieval.
Step 01: Input
Step 02: Model
Step 03: Tool / Data
Step 04: Impact
What it is
The system allows sensitive material to enter model context or downstream traces without strong minimization, authorization, redaction, and output controls.
Why it matters
A single leakage path can damage customer trust, breach contractual commitments, create regulatory exposure, fail an enterprise security review, and weaken internal operational security.
Failure path
How it usually fails.
A useful review breaks this chain before the system reaches production data, tools, or customer-facing decisions.
Use prompt injection or RAG confusion to pull restricted material into the answer.
Extract secrets from logs, tool output, traces, embeddings, or generated files.
Defenses
Controls worth checking.
The strongest controls are enforced outside the model and can be retested after a prompt, model, or workflow change.
Minimize context
Only send the model the fields required for the task, and avoid placing secrets, keys, tokens, or broad customer records in prompts.
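One way to make minimization enforceable is an explicit field allowlist applied before any record reaches the prompt. The sketch below is illustrative only: the field names, the `ALLOWED_FIELDS` set, and the `minimize_context` helper are assumptions for the example, not part of any particular framework.

```python
from typing import Any, Mapping

# Hypothetical allowlist: the only fields this example task is assumed to need.
ALLOWED_FIELDS = frozenset({"ticket_id", "product", "issue_summary"})


def minimize_context(
    record: Mapping[str, Any], allowed: frozenset = ALLOWED_FIELDS
) -> dict:
    """Return a copy of `record` that keeps only allowlisted fields."""
    return {key: value for key, value in record.items() if key in allowed}


# Example record with made-up values; the email and key must never reach the prompt.
record = {
    "ticket_id": "T-1001",
    "product": "billing",
    "issue_summary": "Invoice total looks wrong",
    "email": "user@example.com",
    "api_key": "sk-example0000notreal",
}
prompt_fields = minimize_context(record)
```

An allowlist fails closed: a field added to the record later stays out of the prompt until someone deliberately adds it to the list, which a denylist would not guarantee.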
Authorize before retrieval
Enforce tenant, role, document, and classification filters before content enters the retrieval result set.
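A minimal sketch of server-side authorization applied to candidate chunks before they become the retrieval result set. The `Chunk` and `Caller` types, the `LEVELS` ordering, and the helper names are hypothetical; per-document access lists are omitted for brevity, and a real retrieval store would more likely take the same predicate as a metadata filter inside the query itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant_id: str
    classification: str  # one of the keys in LEVELS
    allowed_roles: frozenset


@dataclass(frozen=True)
class Caller:
    tenant_id: str
    role: str
    max_classification: str


# Hypothetical classification ordering, lowest to highest sensitivity.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}


def is_authorized(caller: Caller, chunk: Chunk) -> bool:
    """Tenant, role, and classification must all pass; any failure excludes the chunk."""
    return (
        chunk.tenant_id == caller.tenant_id
        and caller.role in chunk.allowed_roles
        and LEVELS[chunk.classification] <= LEVELS[caller.max_classification]
    )


def authorized_results(caller: Caller, candidates: list) -> list:
    """Filter candidates so unauthorized chunks never enter the result set."""
    return [chunk for chunk in candidates if is_authorized(caller, chunk)]


caller = Caller(tenant_id="acme", role="support", max_classification="internal")
candidates = [
    Chunk("acme faq", "acme", "public", frozenset({"support", "admin"})),
    Chunk("acme payroll", "acme", "restricted", frozenset({"admin"})),
    Chunk("globex faq", "globex", "public", frozenset({"support"})),
]
results = authorized_results(caller, candidates)
```

Because the check runs in application code rather than in the prompt, it can be unit tested and will not change when the prompt or model does.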
Redact traces and outputs
Apply redaction to prompts, logs, tool responses, and final outputs, then verify that redaction survives retries and error paths.
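The sketch below shows one possible shape for a shared redaction helper that is applied on both the normal path and the error path. The patterns, the `sk-` key format, and the `redact` and `safe_log_line` names are assumptions for illustration; a real deployment would need a broader, tested pattern set.

```python
import re

# Illustrative patterns only, not a complete catalogue of sensitive formats.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("BEARER_TOKEN", re.compile(r"bearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE)),
    ("API_KEY", re.compile(r"\bsk-[A-Za-z0-9_-]{8,}\b")),  # hypothetical key format
]


def redact(text: str) -> str:
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


def safe_log_line(event: str, payload: str) -> str:
    """Build a log line whose payload has been redacted first."""
    return f"{event}: {redact(payload)}"


# Normal path: a tool response containing made-up sensitive values.
ok_line = safe_log_line("tool_result", "Contact jane@example.com, key sk-abcdefgh12345678")

# Error path: exception text passes through the same helper before logging.
try:
    raise RuntimeError("upstream rejected Authorization: Bearer abc123.def456")
except RuntimeError as exc:
    error_line = safe_log_line("tool_error", str(exc))
```

Routing exception text through the same helper is what lets a test confirm that redaction holds when a call fails or is retried, not only when it succeeds.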
Signals to review
- Unexpected secrets, tokens, customer identifiers, or private document names in responses.
- Logs containing raw prompts, tool results, or retrieved chunks with sensitive fields.
- Cross-tenant chunks appearing in citations or answer context.
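Some of these signals can be checked automatically on sampled responses. The following sketch assumes that each cited chunk carries a `tenant_id` field and that secrets follow a hypothetical `sk-` prefix format; both are assumptions for the example, and the helper names are invented.

```python
import re

# Hypothetical secret format used only for this illustration.
SECRET_LIKE = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}\b")


def secret_like_strings(text: str) -> list:
    """Return substrings of `text` that look like the example key format."""
    return SECRET_LIKE.findall(text)


def cross_tenant_chunks(request_tenant: str, cited_chunks: list) -> list:
    """Return cited chunks whose tenant differs from the requesting tenant."""
    return [
        chunk for chunk in cited_chunks
        if chunk.get("tenant_id") != request_tenant
    ]


# Made-up response and citations for a request from tenant "acme".
response_text = "Use key sk-example0000notreal to retry."
cited = [
    {"chunk_id": "c1", "tenant_id": "acme"},
    {"chunk_id": "c2", "tenant_id": "globex"},
]
findings = {
    "secrets": secret_like_strings(response_text),
    "cross_tenant": cross_tenant_chunks("acme", cited),
}
```

A chunk with no `tenant_id` at all is also reported, since missing ownership metadata is itself worth reviewing.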
Questions for your team
- What sensitive data can reach the model at runtime?
- Are retrieval filters enforced server-side or only described in prompts?
- Do observability tools receive sensitive prompt or tool data?
