Runtime Guards

Prerequisites

Install Vurb.ts before following this guide: npm install @vurb/core @modelcontextprotocol/sdk zod — or scaffold a project with vurb create.

Introduction

AI agents can fire dozens of tool calls in rapid succession — burst invocations during chain-of-thought reasoning, oversized responses from unbounded queries, duplicate destructive calls when self-correcting. Without protection, a single LLM session can exhaust your database pool, crash Node.js with a 50MB response, or double-delete a user.

Vurb.ts provides three built-in runtime guards. Each has zero overhead when not configured — no conditionals in the hot path.
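The zero-overhead claim follows from composing guards at definition time rather than checking flags per call. A minimal sketch of that pattern (not the actual Vurb.ts internals — `Handler` and `compose` are illustrative names):

```typescript
type Handler = (input: unknown) => Promise<unknown>;

// Each configured guard wraps the handler once, when the tool is built.
// An unconfigured guard simply isn't in the list, so the hot path never
// branches on "is this guard enabled?".
function compose(handler: Handler, guards: Array<(next: Handler) => Handler>): Handler {
  return guards.reduceRight((next, wrap) => wrap(next), handler);
}

const bare: Handler = async (x) => x;
// With no guards configured, compose returns the original handler unchanged.
const built = compose(bare, []);
// built === bare → zero overhead on the hot path
```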

Concurrency Guard

Limits simultaneous executions per tool with a semaphore, backpressure queue, and load shedding:

typescript
import { initVurb } from '@vurb/core';

const f = initVurb<AppContext>();

const heavyReport = f.query('analytics.heavy_report')
  .describe('Generate a comprehensive analytics report')
  .concurrency({ max: 5, queueSize: 20 })
  .withString('range', 'Date range')
  .handle(async (input, ctx) => {
    return ctx.db.analytics.generateReport(input.range);
  });

When all 5 slots are occupied and the queue has space, the call waits in FIFO order. When the queue is also full, the call is rejected immediately:

xml
<tool_error>
  <error_code>SERVER_BUSY</error_code>
  <message>Tool "analytics.heavy_report" is at capacity (5 active, 20 queued).</message>
  <suggestion>Reduce concurrent calls. Send requests sequentially.</suggestion>
</tool_error>

Slots are freed in a try/finally — even if the handler throws or the abort signal fires. Queued waiters abort immediately on signal cancellation. The internal queue is deque-based for O(1) acquire/release.
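The mechanism can be sketched as a counting semaphore with a bounded FIFO wait queue (an illustrative model, not the Vurb.ts source; abort-signal handling is omitted for brevity). Note the slot handoff in `finally`: the releaser passes its slot directly to the oldest waiter, so the count never oversubscribes.

```typescript
class BoundedSemaphore {
  private active = 0;
  private queue: Array<() => void> = [];

  constructor(private readonly max: number, private readonly queueSize: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active < this.max) {
      this.active++; // free slot: take it immediately
    } else if (this.queue.length < this.queueSize) {
      // Wait in FIFO order; the releaser hands its slot straight to us,
      // so `active` already counts us when we wake up.
      await new Promise<void>((resolve) => this.queue.push(resolve));
    } else {
      throw new Error('SERVER_BUSY'); // load shedding: reject instead of piling up
    }
    try {
      return await task();
    } finally {
      // Runs even if the task throws, so a failing handler can't leak a slot.
      const next = this.queue.shift();
      if (next) next(); // hand the slot to the oldest waiter (FIFO)
      else this.active--; // no waiters: the slot becomes free
    }
  }
}
```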

Egress Guard

Prevents oversized responses from crashing Node.js or overflowing the LLM context window:

typescript
const logSearch = f.query('logs.search')
  .describe('Search application logs')
  .egress(2 * 1024 * 1024)   // 2 MB max
  .withString('query', 'Search query')
  .handle(async (input, ctx) => {
    return ctx.db.logs.search(input.query);
  });

When exceeded, it truncates at a safe UTF-8 character boundary and injects:

[SYSTEM INTERVENTION: Payload truncated at 2.0MB to prevent memory crash.
You MUST use pagination (limit/offset) or filters to retrieve smaller result sets.]
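Truncating at a "safe UTF-8 character boundary" means never cutting a multi-byte sequence in half. A minimal sketch of how that can be done (assumed behavior, not the Vurb.ts source):

```typescript
import { Buffer } from 'node:buffer';

function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = Buffer.from(text, 'utf8');
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // UTF-8 continuation bytes look like 0b10xxxxxx. If the cut point lands on
  // one, back up to the lead byte of the straddling character and drop it.
  while (end > 0 && (bytes[end] & 0b1100_0000) === 0b1000_0000) end--;
  return bytes.subarray(0, end).toString('utf8');
}
```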

Egress Guard vs Presenter .limit() / .agentLimit()

Both truncate output, but at different layers; use them together for defense in depth:

typescript
import { createPresenter, t } from '@vurb/core';

// Domain guard — intelligent truncation with custom message
const UserPresenter = createPresenter('User')
  .schema({ id: t.string, name: t.string, email: t.string })
  .limit(50);

// Infrastructure guard — brute-force byte limit
const listUsers = f.query('users.list')
  .describe('List all users in the workspace')
  .egress(2 * 1024 * 1024)
  .returns(UserPresenter)
  .handle(async (input, ctx) => ctx.db.users.findMany());

.limit() / .agentLimit() operates on item count at the domain layer with custom guidance. .egress() operates on raw bytes at the infrastructure layer as a safety net.

TIP

Presenter .limit() is the first line of defense — it truncates intelligently with domain context. Egress guard is the last-resort safety net for edge cases where the Presenter can't predict payload size (e.g. text blobs, nested data).

Intent Mutex

Serializes destructive actions automatically — no configuration needed. When an LLM fires two users.delete calls for the same ID in the same millisecond, both would normally execute before either returns. The intent mutex prevents this:

typescript
const deleteUser = f.mutation('users.delete')
  .describe('Permanently delete a user and all their data')
  .withString('id', 'User ID to delete')
  .handle(async (input, ctx) => {
    await ctx.db.users.delete({ where: { id: input.id } });
    return { deleted: input.id };
  });

const listUsers = f.query('users.list')
  .describe('List all users')
  .handle(async (input, ctx) => ctx.db.users.findMany());

users.delete calls (from f.mutation()) execute in strict FIFO order. users.list (from f.query()) runs in parallel — zero overhead from the mutex. Serialization uses the action key as the lock key, so concurrent calls to different destructive actions run independently. The underlying async mutex uses promise chaining — no external locks, no Redis.
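A keyed mutex via promise chaining can be sketched as follows (assumed mechanism, not the Vurb.ts source; `withIntentLock` is an illustrative name). Calls sharing a key run strictly FIFO; different keys never contend:

```typescript
const tails = new Map<string, Promise<unknown>>();

function withIntentLock<T>(key: string, task: () => Promise<T>): Promise<T> {
  const prev = tails.get(key) ?? Promise.resolve();
  // Chain after the previous call for this key, swallowing its outcome so a
  // failed call doesn't poison the lock for everyone behind it.
  const next = prev.catch(() => {}).then(task);
  // Store a settled-safe tail for the next caller to chain onto.
  tails.set(key, next.catch(() => {}));
  return next;
}
```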

Combined Configuration

All three guards compose naturally:

typescript
const analyticsQuery = f.query('analytics.query')
  .describe('Run a custom analytics query')
  .concurrency({ max: 3, queueSize: 10 })
  .egress(2 * 1024 * 1024)
  .withString('sql', 'SQL query')
  .withOptionalNumber('limit', 'Max rows (default 100)')
  .handle(async (input, ctx) => {
    return ctx.db.$queryRaw(input.sql);
  });

At most 3 concurrent queries, 10 more queued, and a 2 MB response cap. The intent mutex is automatic for any f.mutation() tool.

Testing

typescript
import { describe, it, expect } from 'vitest';
import { initVurb, success } from '@vurb/core';

const f = initVurb<void>();

describe('Runtime Guards', () => {
  it('load-sheds when at capacity', async () => {
    const tool = f.query('billing.charge')
      .describe('Charge billing')
      .concurrency({ max: 1, queueSize: 0 })
      .handle(async () => {
        await new Promise(r => setTimeout(r, 100));
        return success('charged');
      });

    const first = tool.execute(undefined, { action: 'default' });
    const second = await tool.execute(undefined, { action: 'default' });

    expect(second.isError).toBe(true);
    expect(second.content[0].text).toContain('SERVER_BUSY');
    expect((await first).isError).toBeUndefined();
  });

  it('truncates oversized responses', async () => {
    const tool = f.query('logs.search')
      .describe('Search logs')
      .egress(2048)
      .handle(async () => success('x'.repeat(10_000)));

    const result = await tool.execute(undefined, { action: 'default' });
    expect(result.content[0].text).toContain('[SYSTEM INTERVENTION');
  });
});