# Scaling

## Prerequisites

Install Vurb.ts before following this guide: `npm install @vurb/core @modelcontextprotocol/sdk zod` — or scaffold a project with `vurb create`.
- Introduction
- Grouping Reduces Tool Count
- Tag Filtering
- TOON Token Compression
- Strict Validation
- Error Recovery
- Scale Beyond a Single Process
## Introduction
Every tool definition in `tools/list` includes a name, description, and full JSON Schema. The LLM receives this entire payload as system context. As tool count grows, three failures cascade: context saturation (fewer tokens for reasoning), semantic collision (similar tool names confuse routing), and parameter confusion (overlapping field names like `id` or `status` cause cross-contamination).
Vurb.ts provides four mechanisms to keep tool payloads manageable as your server scales — especially critical when using generators like `@vurb/prisma-gen`, `@vurb/openapi-gen`, or `@vurb/n8n` that can produce dozens of tools from a single schema.
## Grouping Reduces Tool Count
Use the grouped exposition strategy to consolidate multiple operations behind a single discriminator enum. Instead of a separate `tools/list` entry per operation:
```json
[
  { "name": "projects_list", "inputSchema": { /* ... */ } },
  { "name": "projects_get", "inputSchema": { /* ... */ } },
  { "name": "projects_create", "inputSchema": { /* ... */ } }
]
```

One entry with all operations nested:
```json
[
  {
    "name": "projects",
    "inputSchema": {
      "properties": {
        "action": { "type": "string", "enum": ["list", "get", "create"] },
        "id": { "description": "Project ID. Required for: get" },
        "name": { "description": "Project name. Required for: create" }
      },
      "required": ["action"]
    }
  }
]
```

The discriminator enum anchors the LLM to valid operations. If it sends an invalid action, Vurb.ts returns a structured error with the valid options.
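To make the bounce behavior concrete, here is a minimal standalone sketch of discriminator routing. The names (`routeProjects`, `VALID_ACTIONS`) are illustrative and not part of the Vurb.ts API:

```ts
// Illustrative sketch (not Vurb.ts internals): route a grouped tool
// call by its `action` discriminator, bouncing unknown actions with a
// structured error that lists the valid options.
const VALID_ACTIONS = ['list', 'get', 'create'] as const;
type Action = (typeof VALID_ACTIONS)[number];

function routeProjects(call: { action: string; [key: string]: unknown }) {
  if (!(VALID_ACTIONS as readonly string[]).includes(call.action)) {
    return {
      ok: false as const,
      error: `Invalid action "${call.action}". Valid actions: ${VALID_ACTIONS.join(', ')}`,
    };
  }
  // Safe: membership was checked above.
  return { ok: true as const, action: call.action as Action };
}

console.log(routeProjects({ action: 'archive' }));
console.log(routeProjects({ action: 'get', id: 'p_1' }));
```

Because the error names every valid action, the model can repair its call on the next attempt instead of guessing.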
## Tag Filtering
`.tags()` on the Fluent API lets you classify tools, then filter which ones appear in `tools/list`:
```ts
import { initVurb } from '@vurb/core';

const f = initVurb<AppContext>();

const usersTool = f.query('users.list')
  .describe('List users')
  .tags('core', 'user-management')
  .handle(async (input, ctx) => { /* ... */ });
```

```ts
registry.attachToServer(server, {
  contextFactory: createAppContext,
  filter: {
    tags: ['core'],
    exclude: ['internal'],
  },
});
```

Filtered tools consume zero tokens. If the LLM attempts to call a hidden tool, `routeCall()` returns "Unknown tool".
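One plausible reading of that filter, sketched as a standalone predicate. The semantics assumed here (a tool is exposed when it has at least one included tag and no excluded tag) and all names are illustrative, not the documented Vurb.ts behavior:

```ts
// Assumed filter semantics (sketch, not the Vurb.ts implementation):
// a tool is exposed when it carries at least one included tag and
// none of the excluded tags.
interface ToolMeta {
  name: string;
  tags: string[];
}

interface TagFilter {
  tags?: string[];
  exclude?: string[];
}

function isExposed(tool: ToolMeta, filter: TagFilter): boolean {
  const included =
    !filter.tags || tool.tags.some((t) => filter.tags!.includes(t));
  const excluded =
    filter.exclude?.some((t) => tool.tags.includes(t)) ?? false;
  return included && !excluded;
}

const usersList: ToolMeta = { name: 'users.list', tags: ['core', 'user-management'] };
const debugDump: ToolMeta = { name: 'debug.dump', tags: ['core', 'internal'] };
const filter: TagFilter = { tags: ['core'], exclude: ['internal'] };

console.log(isExposed(usersList, filter)); // true
console.log(isExposed(debugDump, filter)); // false: 'internal' wins over 'core'
```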
## TOON Token Compression
`.toonDescription()` encodes action metadata using pipe-delimited formatting, reducing description tokens by 30-50%:
```
Manage projects
action|desc|required|destructive
list|List all projects||
get|Get project details|id|
create|Create a new project|name|
update|Update project|id,data|
delete|Delete project permanently|id|true
```

Column names appear once as a header. No JSON key repetition per row.
> **TIP**
> Use TOON for servers with 20+ actions sharing the same tool. Below that threshold, standard Markdown descriptions are more readable for humans.
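The layout is simple enough to sketch. This is an illustrative encoder for the format shown above, not the library's `.toonDescription()` implementation; `ActionRow` and `toTOON` are made-up names:

```ts
// Sketch of the pipe-delimited TOON layout (illustrative encoder,
// not the library's `.toonDescription()` implementation).
interface ActionRow {
  action: string;
  desc: string;
  required?: string; // comma-separated required fields
  destructive?: boolean;
}

function toTOON(title: string, rows: ActionRow[]): string {
  const header = 'action|desc|required|destructive';
  const body = rows.map((r) =>
    [r.action, r.desc, r.required ?? '', r.destructive ? 'true' : ''].join('|'),
  );
  return [title, header, ...body].join('\n');
}

const toon = toTOON('Manage projects', [
  { action: 'list', desc: 'List all projects' },
  { action: 'get', desc: 'Get project details', required: 'id' },
  { action: 'delete', desc: 'Delete project permanently', required: 'id', destructive: true },
]);
console.log(toon);
```

The savings come from the join: every row shares the single header, so no per-row JSON keys are ever emitted.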
## Strict Validation
Every action schema is compiled with `.strict()`. When the LLM sends undeclared fields, Zod rejects them with an actionable error naming the invalid fields:
```xml
<validation_error action="users/create">
  <field name="(root)">Unrecognized key(s) in object: 'hallucinated_param'. Remove or correct unrecognized fields.</field>
  <recovery>Fix the fields above and call the tool again.</recovery>
</validation_error>
```

The LLM sees exactly which fields are invalid and self-corrects on retry.
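The core of a strict check is small. Here is a minimal sketch of the behavior (the real validation is Zod's `.strict()`; `unrecognizedKeys` is a made-up helper for illustration):

```ts
// Minimal sketch of the strict-schema check (assumed behavior; the
// real validation is Zod's `.strict()`): surface any keys the schema
// did not declare so the model can remove them and retry.
function unrecognizedKeys(
  declared: string[],
  input: Record<string, unknown>,
): string[] {
  return Object.keys(input).filter((k) => !declared.includes(k));
}

const extras = unrecognizedKeys(
  ['name', 'email'],
  { name: 'Ada', hallucinated_param: 42 },
);
console.log(extras.join(',')); // hallucinated_param
```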
## Error Recovery
Structured error responses let the LLM self-correct without retry loops. Every validation bounce includes valid options or the specific field that failed. See Error Handling for the full reference.
## Scale Beyond a Single Process
Token compression and tool grouping reduce cognitive load — but your MCP server still runs as a single Node.js process. To scale horizontally without managing infrastructure, deploy to serverless runtimes where each invocation runs in its own isolate.
The Vurb.ts adapters cache registry compilation at module scope — Zod reflection, Presenter compilation, schema generation — and execute warm requests as stateless JSON-RPC calls. No shared memory, no session affinity, no connection pooling.
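The module-scope caching pattern itself can be sketched in a few lines (illustrative only; the adapters' actual internals are not shown in this guide):

```ts
// Module-scope caching pattern (illustrative sketch): expensive
// registry compilation runs once per isolate at cold start, and warm
// invocations reuse the cached result.
interface CompiledRegistry {
  tools: string[];
}

let compiled: CompiledRegistry | null = null;
let compileCount = 0;

function getRegistry(): CompiledRegistry {
  if (!compiled) {
    compileCount += 1; // stands in for Zod reflection, schema generation, etc.
    compiled = { tools: ['projects', 'users'] };
  }
  return compiled;
}

// Cold start compiles once; warm requests hit the cache.
const first = getRegistry();
const second = getRegistry();
console.log(first === second, compileCount); // true 1
```

Because the cache lives at module scope rather than per request, it survives exactly as long as the isolate does, which is what makes the pattern safe in serverless runtimes.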
### Vercel — Auto-Scaling MCP Functions
Each invocation compiles tools once at cold start and reuses the cached registry for subsequent calls. Edge Runtime distributes your MCP server globally with ~0ms cold starts:
```ts
import { vercelAdapter } from '@vurb/vercel';

export const POST = vercelAdapter({ registry, contextFactory });
export const runtime = 'edge';
```

### Cloudflare Workers — Isolate-per-Request Architecture
Workers spawn a V8 isolate per request — true horizontal scaling with zero coordination. Your tools access D1 and KV at the edge without cross-isolate state:
```ts
import { cloudflareWorkersAdapter } from '@vurb/cloudflare';

export default cloudflareWorkersAdapter({ registry, contextFactory });
```

Full deployment guides: Vercel Adapter · Cloudflare Adapter · Production Server