Insert content whose embedding lands close to trusted documents, or that contains hidden instructions.
LLM08:2025
Vector and Embedding Weaknesses
Vector and embedding weaknesses affect RAG systems when embeddings, vector stores, chunking, metadata, retrieval filters, or similarity logic expose or distort context.
Step 01: Input. Step 02: Model. Step 03: Tool / Data. Step 04: Impact.
What it is
The weakness exists when the retrieval layer does not enforce authorization, source integrity, trust level, freshness, or ranking controls before content enters the model context.
Why it matters
Weak vector controls can create cross-tenant leakage, stale answers, poisoned context, hidden prompt injection, and decisions based on irrelevant or unauthorized material.
Failure path
How it usually fails.
A useful review breaks this chain before the system reaches production data, tools, or customer-facing decisions.
Exploit weak metadata filters, tenant boundaries, chunking, or ranking behavior.
Cause unauthorized, poisoned, stale, or irrelevant chunks to steer the final answer.
Defenses
Controls worth checking.
The strongest controls are enforced outside the model and can be retested after a prompt, model, or workflow change.
Enforce retrieval authorization
Apply tenant, role, document, and classification constraints before similarity search results reach the model.
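A minimal sketch of this control, assuming an in-memory store and plain cosine similarity. The names `Chunk`, `Caller`, `is_authorized`, and `retrieve` are hypothetical and do not come from any particular vector database. The point it illustrates is ordering: the authorization filter runs before ranking, so an unauthorized chunk can never win on similarity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    vector: tuple[float, ...]
    tenant_id: str
    allowed_roles: frozenset[str]
    classification: int  # assumption: 0 = public, higher = more restricted


@dataclass(frozen=True)
class Caller:
    tenant_id: str
    roles: frozenset[str]
    clearance: int


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_authorized(chunk, caller):
    """Tenant, role, and classification must all pass."""
    return (
        chunk.tenant_id == caller.tenant_id
        and bool(chunk.allowed_roles & caller.roles)
        and chunk.classification <= caller.clearance
    )


def retrieve(query_vector, chunks, caller, k=3):
    # Filter first: only permitted chunks are ever scored or ranked.
    permitted = [c for c in chunks if is_authorized(c, caller)]
    ranked = sorted(
        permitted,
        key=lambda c: cosine(query_vector, c.vector),
        reverse=True,
    )
    return ranked[:k]
```

In a real vector store the same idea would typically be expressed as a mandatory metadata filter on the query rather than a Python list comprehension; filtering after top-k retrieval is weaker because unauthorized chunks can crowd out permitted ones.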
Track chunk lineage
Carry source, owner, timestamp, trust level, and access metadata through ingestion, retrieval, answer generation, and citations.
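One way to picture lineage is as a record that travels with each chunk and is copied into the citation list when context is assembled. The sketch below is illustrative only: `Lineage`, `RetrievedChunk`, `build_context`, and the field names are assumptions, not a standard schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Lineage:
    source_uri: str
    owner: str
    ingested_at: datetime
    trust_level: str  # assumption: labels such as "verified" or "user_upload"
    allowed_tenants: frozenset[str]


@dataclass(frozen=True)
class RetrievedChunk:
    chunk_id: str
    text: str
    lineage: Lineage


def build_context(chunks):
    """Return (context_text, citations), keeping lineage for every chunk."""
    lines = []
    citations = []
    for ref, chunk in enumerate(chunks, start=1):
        # The numbered marker lets the answer cite a specific chunk.
        lines.append(f"[{ref}] {chunk.text}")
        citations.append(
            {
                "ref": ref,
                "chunk_id": chunk.chunk_id,
                "source": chunk.lineage.source_uri,
                "owner": chunk.lineage.owner,
                "ingested_at": chunk.lineage.ingested_at.isoformat(),
                "trust_level": chunk.lineage.trust_level,
            }
        )
    return "\n".join(lines), citations
```

Because the citation records are built from the same objects that produced the context, a reviewer can trace any cited passage back to its source, owner, and ingestion time without a second lookup.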
Test adversarial retrieval
Probe similar-looking documents, poisoned text, stale chunks, hidden instructions, and cross-tenant collisions.
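A cross-tenant collision probe can be sketched as a small harness that runs queries as one tenant and reports any chunk returned from another. Everything here is hypothetical: the dict-shaped chunks, the word-overlap scoring that stands in for embedding similarity, and the two toy retrievers (one deliberately missing its tenant filter).

```python
STORE = [
    {"id": "a-1", "tenant_id": "acme", "text": "acme refund policy thirty days"},
    {"id": "g-1", "tenant_id": "globex", "text": "globex refund policy sixty days"},
]


def overlap(query, text):
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def unfiltered_retrieve(tenant_id, query, k=2):
    # Deliberately flawed: ranks the whole store and ignores tenant_id.
    ranked = sorted(STORE, key=lambda c: overlap(query, c["text"]), reverse=True)
    return ranked[:k]


def filtered_retrieve(tenant_id, query, k=2):
    # Restricts candidates to the caller's tenant before ranking.
    permitted = [c for c in STORE if c["tenant_id"] == tenant_id]
    ranked = sorted(permitted, key=lambda c: overlap(query, c["text"]), reverse=True)
    return ranked[:k]


def cross_tenant_violations(retrieve, probes):
    """Run each (tenant_id, query) probe; collect chunks from other tenants."""
    violations = []
    for tenant_id, query in probes:
        for chunk in retrieve(tenant_id, query):
            if chunk["tenant_id"] != tenant_id:
                violations.append((tenant_id, query, chunk["id"]))
    return violations


probes = [("acme", "refund policy"), ("globex", "refund policy")]
print(cross_tenant_violations(unfiltered_retrieve, probes))
print(cross_tenant_violations(filtered_retrieve, probes))
```

The same harness shape extends to the other probes listed above by swapping the violation check, for example flagging chunks older than a freshness threshold or chunks carrying planted instruction text.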
Signals to review
- Answers citing chunks from the wrong tenant, role, source, or document class.
- High-confidence answers based on stale or low-trust material.
- Retrieved chunks containing instructions intended for the model.
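The last signal can be approximated with a simple pattern scan over retrieved chunks. This is a heuristic sketch only: the pattern list is an assumption, it is far from exhaustive, and a determined attacker can evade it, so it is better treated as a review aid than as a defense.

```python
import re

# Assumption: a few illustrative phrases; real deployments need broader coverage.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\bignore (all |any )?(previous|prior|above) instructions\b", re.I),
    re.compile(r"\bdisregard (the )?(system|previous) (prompt|instructions)\b", re.I),
    re.compile(r"\byou are now\b", re.I),
    re.compile(r"\bdo not (tell|reveal|mention)\b", re.I),
]


def flag_instruction_like(chunks):
    """Return (chunk_id, matched_patterns) for chunks that look like instructions.

    `chunks` is an iterable of (chunk_id, text) pairs.
    """
    flagged = []
    for chunk_id, text in chunks:
        hits = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
        if hits:
            flagged.append((chunk_id, hits))
    return flagged
```

Flagged chunks are candidates for human review or quarantine; logging the matched pattern alongside the chunk id makes it easier to tune the list over time.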
Questions for your team
- Are vector filters mandatory and server-side constructed?
- Can users influence retrieved context through uploaded content?
- Does each answer preserve source and authorization lineage?
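For the first question, "server-side constructed" can be illustrated as follows. This is a sketch under stated assumptions: `build_vector_filter`, `ALLOWED_CLIENT_KEYS`, and the session and filter shapes are hypothetical, and real vector stores each have their own filter syntax.

```python
# Assumption: the only filter keys a client is allowed to influence.
ALLOWED_CLIENT_KEYS = {"doc_type", "language"}


def build_vector_filter(session, client_filter):
    """Build the retrieval filter from trusted session data.

    Client-supplied keys outside the allowlist are dropped, and the
    mandatory constraints are written last so they cannot be overridden.
    """
    safe = {k: v for k, v in client_filter.items() if k in ALLOWED_CLIENT_KEYS}
    safe["tenant_id"] = session["tenant_id"]
    safe["roles"] = sorted(session["roles"])
    return safe


session = {"tenant_id": "acme", "roles": ["support"]}
# The client tries to widen access by naming another tenant.
client_filter = {"tenant_id": "globex", "doc_type": "policy"}
print(build_vector_filter(session, client_filter))
```

The client's `tenant_id` is discarded and replaced with the value from the authenticated session, which is the property the question is asking a team to verify.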
