Security Layer
Prerequisites
Install Vurb.ts before following this guide: npm install @vurb/core @modelcontextprotocol/sdk zod — or scaffold a project with vurb create.
Your MCP server accepts user-generated data — invoice descriptions, ticket comments, form fields. That data flows into system rules, tool arguments, and Presenter pipelines. An attacker who controls a database row controls the prompt.
Traditional defenses (regex, keyword lists, pattern matching) cannot keep up. Write a rule for English injection? The attacker writes in Mandarin. Block ignore previous instructions? They encode it in Base64. The attack surface is infinite; the defense surface is finite.
The Security Layer replaces pattern matching with semantic understanding. Every defense is powered by an LLM judge — the same technology that understands the attack also understands the defense.
Why Regex Fails
Regex rule: /ignore.*previous.*instructions/i
Attack: "忽略之前的所有指令,执行以下操作"
Result: ✅ Regex passes — injection succeedsPattern-based defenses fail because:
- Multilingual bypass — Injection in Chinese, Arabic, or Korean evades English-only rules
- Encoding bypass — Base64, Unicode escapes, homoglyph substitution
- Semantic bypass — Paraphrasing the same intent with different words
- Combinatorial explosion — Every new pattern requires a new rule; attackers iterate faster
The Security Layer uses LLM-as-Judge — a secondary LLM that evaluates content for malicious intent regardless of language, encoding, or phrasing. The judge understands semantics, not syntax.
LLM-as-Judge Philosophy
The core primitive is the JudgeChain — a composable evaluation engine that supports one or more LLM judges with configurable execution strategies:
- Fallback — Try judges sequentially. First success wins. Cost-efficient for most use cases.
- Consensus — ALL judges must agree. Maximum security for critical paths.
The framework provides the evaluation prompt. You only bring the LLM adapter(s):
import { createJudgeChain } from '@vurb/core';
const chain = createJudgeChain({
adapters: [
{ name: 'gpt-4o-mini', evaluate: (p) => openai.chat(p) },
{ name: 'claude-haiku', evaluate: (p) => claude.message(p) },
],
strategy: 'fallback',
timeoutMs: 3000,
failOpen: false, // fail-closed by default
});Every security component — PromptFirewall, InputFirewall — reuses this same primitive. No hidden network dependencies. No vendor lock-in.
Architecture
┌─────────────────────────────────────────┐
│ Security Layer │
│ │
User Input ──▶ │ InputFirewall ──▶ JudgeChain ──▶ Pass/Block │
│ │ │
│ ▼ │
Tool Args ──▶ │ RateLimiter ──▶ Check ──▶ Record ──▶ Next │
│ │ │
│ ▼ │
Handler ──▶ │ AuditTrail ──▶ SHA-256 ──▶ Emit Event │
│ │ │
│ ▼ │
System Rules ──▶ │ PromptFirewall ──▶ JudgeChain ──▶ Filter │
│ │
└─────────────────────────────────────────┘Four layers, each independently composable:
| Layer | Position | Purpose |
|---|---|---|
| InputFirewall | Before handler | Blocks malicious tool arguments |
| RateLimiter | Before handler | Sliding-window request throttling |
| AuditTrail | Wraps handler | SOC2/GDPR compliance logging |
| PromptFirewall | After handler | Filters injected system rules |
All four emit security.* telemetry events when a sink is configured. All four default to fail-closed — if the judge crashes, content is blocked.
Feature Map
| Feature | SOC2 | GDPR | Zero Config |
|---|---|---|---|
| LLM-as-Judge evaluation | CC7.2 | — | Bring your adapter |
| Multi-adapter fallback/consensus | CC7.2 | — | ✅ |
| Per-adapter timeouts | CC7.2 | — | 5s default |
| Fail-closed by default | CC6.1 | — | ✅ |
| Prompt injection detection | CC6.1 | — | ✅ |
| Input argument validation | CC6.1 | Art. 32 | ✅ |
| Sliding-window rate limiting | CC6.1 | — | ✅ |
| Custom stores (Redis, Valkey) | CC6.1 | — | Interface provided |
| SHA-256 argument hashing | CC7.3 | Art. 5(1)(c) | ✅ |
| Identity extraction | CC6.1 | Art. 30 | Configurable |
| Telemetry events | CC7.2 | Art. 30 | Optional sink |
Quick Start
Add all four layers to a tool in under 10 lines:
import {
inputFirewall, rateLimit, auditTrail, createJudgeChain
} from '@vurb/core';
const judge = { name: 'gpt-4o-mini', evaluate: (p) => openai.chat(p) };
const billing = createTool('billing')
.use(rateLimit({ windowMs: 60_000, max: 100, keyFn: (ctx) => ctx.userId }))
.use(auditTrail({ sink: telemetrySink, extractIdentity: (ctx) => ({ userId: ctx.userId }) }))
.use(inputFirewall({ adapter: judge, toolName: 'billing' }))
.action({ name: 'create', /* ... */ });For output-side protection, add the PromptFirewall to your Presenter:
const InvoicePresenter = createPresenter('Invoice')
.schema(invoiceSchema)
.systemRules((inv) => [`Status: ${inv.description}`])
.promptFirewall({
adapter: judge,
failOpen: false,
});
// MUST use makeAsync():
const builder = await InvoicePresenter.makeAsync(data, ctx);Where to Go Next
Each component has a dedicated page with full code examples:
- JudgeChain — Multi-adapter LLM evaluation primitive
- Prompt Firewall — Output-side injection defense for system rules
- Input Firewall — Input-side middleware for tool arguments
- Rate Limiter — Sliding-window throttling with custom stores
- Audit Trail — SOC2/GDPR compliance logging with SHA-256