# Scaling

## Prerequisites

Install Vurb.ts before following this guide: `npm install @vurb/core @modelcontextprotocol/sdk zod` — or scaffold a project with `vurb create`.
- Introduction
- Grouping Reduces Tool Count
- Tag Filtering
- TOON Token Compression
- Strict Validation
- Error Recovery
- Scale Beyond a Single Process
## Introduction
Every tool definition in `tools/list` includes a name, description, and full JSON Schema. The LLM receives this entire payload as system context. As tool count grows, three failures cascade: context saturation (fewer tokens for reasoning), semantic collision (similar tool names confuse routing), and parameter confusion (overlapping field names like `id` or `status` cause cross-contamination).
Vurb.ts provides four mechanisms to keep tool payloads manageable as your server scales — especially critical when using generators like `@vurb/prisma-gen`, `@vurb/openapi-gen`, or `@vurb/n8n` that can produce dozens of tools from a single schema.
## Grouping Reduces Tool Count
Use the grouped exposition strategy to consolidate multiple operations behind a single discriminator enum. Instead of a separate `tools/list` entry per operation:
```json
[
  { "name": "projects_list", "inputSchema": { /* ... */ } },
  { "name": "projects_get", "inputSchema": { /* ... */ } },
  { "name": "projects_create", "inputSchema": { /* ... */ } }
]
```

One entry with all operations nested:
```json
[
  {
    "name": "projects",
    "inputSchema": {
      "properties": {
        "action": { "type": "string", "enum": ["list", "get", "create"] },
        "id": { "description": "Project ID. Required for: get" },
        "name": { "description": "Project name. Required for: create" }
      },
      "required": ["action"]
    }
  }
]
```

The discriminator enum anchors the LLM to valid operations. If it sends an invalid action, Vurb.ts returns a structured error with the valid options.
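To make the bounce behavior concrete, here is a minimal standalone sketch of discriminator routing. The names (`routeProjects`, `VALID_ACTIONS`) are illustrative and not part of the Vurb.ts API:

```ts
// Illustrative sketch (not Vurb.ts internals): route a grouped tool
// call by its `action` discriminator, bouncing unknown actions with a
// structured error that lists the valid options.
const VALID_ACTIONS = ['list', 'get', 'create'] as const;
type Action = (typeof VALID_ACTIONS)[number];

function routeProjects(call: { action: string; [key: string]: unknown }) {
  if (!(VALID_ACTIONS as readonly string[]).includes(call.action)) {
    return {
      ok: false as const,
      error: `Invalid action "${call.action}". Valid actions: ${VALID_ACTIONS.join(', ')}`,
    };
  }
  // Safe: membership was checked above.
  return { ok: true as const, action: call.action as Action };
}

console.log(routeProjects({ action: 'archive' }));
console.log(routeProjects({ action: 'get', id: 'p_1' }));
```

Because the error names every valid action, the model can repair its call on the next attempt instead of guessing.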
## Tag Filtering
`.tags()` on the Fluent API lets you classify tools, then filter which ones appear in `tools/list`:
```ts
import { initVurb } from '@vurb/core';

const f = initVurb<AppContext>();

const usersTool = f.query('users.list')
  .describe('List users')
  .tags('core', 'user-management')
  .handle(async (input, ctx) => { /* ... */ });
```

```ts
registry.attachToServer(server, {
  contextFactory: createAppContext,
  filter: {
    tags: ['core'],
    exclude: ['internal'],
  },
});
```

Filtered tools consume zero tokens. If the LLM attempts to call a hidden tool, `routeCall()` returns "Unknown tool".
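One plausible reading of that filter, sketched as a standalone predicate. The semantics assumed here (a tool is exposed when it has at least one included tag and no excluded tag) and all names are illustrative, not the documented Vurb.ts behavior:

```ts
// Assumed filter semantics (sketch, not the Vurb.ts implementation):
// a tool is exposed when it carries at least one included tag and
// none of the excluded tags.
interface ToolMeta {
  name: string;
  tags: string[];
}

interface TagFilter {
  tags?: string[];
  exclude?: string[];
}

function isExposed(tool: ToolMeta, filter: TagFilter): boolean {
  const included =
    !filter.tags || tool.tags.some((t) => filter.tags!.includes(t));
  const excluded =
    filter.exclude?.some((t) => tool.tags.includes(t)) ?? false;
  return included && !excluded;
}

const usersList: ToolMeta = { name: 'users.list', tags: ['core', 'user-management'] };
const debugDump: ToolMeta = { name: 'debug.dump', tags: ['core', 'internal'] };
const filter: TagFilter = { tags: ['core'], exclude: ['internal'] };

console.log(isExposed(usersList, filter)); // true
console.log(isExposed(debugDump, filter)); // false: 'internal' wins over 'core'
```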
## TOON Token Compression
`.toonDescription()` encodes action metadata using pipe-delimited formatting, reducing description tokens by 30-50%:
```
Manage projects
action|desc|required|destructive
list|List all projects||
get|Get project details|id|
create|Create a new project|name|
update|Update project|id,data|
delete|Delete project permanently|id|true
```

Column names appear once as a header. No JSON key repetition per row.
> **TIP**
> Use TOON for servers with 20+ actions sharing the same tool. Below that threshold, standard Markdown descriptions are more readable for humans.
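The layout is simple enough to sketch. This is an illustrative encoder for the format shown above, not the library's `.toonDescription()` implementation; `ActionRow` and `toTOON` are made-up names:

```ts
// Sketch of the pipe-delimited TOON layout (illustrative encoder,
// not the library's `.toonDescription()` implementation).
interface ActionRow {
  action: string;
  desc: string;
  required?: string; // comma-separated required fields
  destructive?: boolean;
}

function toTOON(title: string, rows: ActionRow[]): string {
  const header = 'action|desc|required|destructive';
  const body = rows.map((r) =>
    [r.action, r.desc, r.required ?? '', r.destructive ? 'true' : ''].join('|'),
  );
  return [title, header, ...body].join('\n');
}

const toon = toTOON('Manage projects', [
  { action: 'list', desc: 'List all projects' },
  { action: 'get', desc: 'Get project details', required: 'id' },
  { action: 'delete', desc: 'Delete project permanently', required: 'id', destructive: true },
]);
console.log(toon);
```

The savings come from the join: every row shares the single header, so no per-row JSON keys are ever emitted.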
## Strict Validation
Every action schema is compiled with `.strict()`. When the LLM sends undeclared fields, Zod rejects them with an actionable error naming the invalid fields:
```xml
<validation_error action="users/create">
  <field name="(root)">Unrecognized key(s) in object: 'hallucinated_param'. Remove or correct unrecognized fields.</field>
  <recovery>Fix the fields above and call the tool again.</recovery>
</validation_error>
```

The LLM sees exactly which fields are invalid and self-corrects on retry.
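The core of a strict check is small. Here is a minimal sketch of the behavior (the real validation is Zod's `.strict()`; `unrecognizedKeys` is a made-up helper for illustration):

```ts
// Minimal sketch of the strict-schema check (assumed behavior; the
// real validation is Zod's `.strict()`): surface any keys the schema
// did not declare so the model can remove them and retry.
function unrecognizedKeys(
  declared: string[],
  input: Record<string, unknown>,
): string[] {
  return Object.keys(input).filter((k) => !declared.includes(k));
}

const extras = unrecognizedKeys(
  ['name', 'email'],
  { name: 'Ada', hallucinated_param: 42 },
);
console.log(extras.join(',')); // hallucinated_param
```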
## Error Recovery
Structured error responses let the LLM self-correct without retry loops. Every validation bounce includes valid options or the specific field that failed. See Error Handling for the full reference.
## Scale Beyond a Single Process
Token compression and tool grouping reduce cognitive load — but your MCP server still runs as a single Node.js process. To scale horizontally without managing infrastructure, deploy to serverless runtimes where each invocation runs in its own isolate.
The Vurb.ts adapters cache registry compilation at module scope — Zod reflection, Presenter compilation, schema generation — and execute warm requests as stateless JSON-RPC calls. No shared memory, no session affinity, no connection pooling.
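The module-scope caching pattern itself can be sketched in a few lines (illustrative only; the adapters' actual internals are not shown in this guide):

```ts
// Module-scope caching pattern (illustrative sketch): expensive
// registry compilation runs once per isolate at cold start, and warm
// invocations reuse the cached result.
interface CompiledRegistry {
  tools: string[];
}

let compiled: CompiledRegistry | null = null;
let compileCount = 0;

function getRegistry(): CompiledRegistry {
  if (!compiled) {
    compileCount += 1; // stands in for Zod reflection, schema generation, etc.
    compiled = { tools: ['projects', 'users'] };
  }
  return compiled;
}

// Cold start compiles once; warm requests hit the cache.
const first = getRegistry();
const second = getRegistry();
console.log(first === second, compileCount); // true 1
```

Because the cache lives at module scope rather than per request, it survives exactly as long as the isolate does, which is what makes the pattern safe in serverless runtimes.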
### Vercel — Auto-Scaling MCP Functions
Each invocation compiles tools once at cold start and reuses the cached registry for subsequent calls. Edge Runtime distributes your MCP server globally with ~0ms cold starts:
```ts
import { vercelAdapter } from '@vurb/vercel';

export const POST = vercelAdapter({ registry, contextFactory });
export const runtime = 'edge';
```

### Cloudflare Workers — Isolate-per-Request Architecture
Workers spawn a V8 isolate per request — true horizontal scaling with zero coordination. Your tools access D1 and KV at the edge without cross-isolate state:
```ts
import { cloudflareWorkersAdapter } from '@vurb/cloudflare';

export default cloudflareWorkersAdapter({ registry, contextFactory });
```

Full deployment guides: Vercel Adapter · Cloudflare Adapter · Production Server