All posts

Tool bloat doesn't just cost money. It makes your agent dumber.

Every MCP server you connect dumps its full tool list into context on every turn. That isn't only a bigger bill, it's your model spending its sharpest attention on a list of tools before it reads your request. Here is why that degrades the whole system, with the graded proof that fixing it costs you nothing.

Here is the thing nobody tells you when you wire up your tenth MCP server: your agent just got a little dumber.

Not because the tools are bad. Because of where they live.

The tax you can’t see

Every MCP server you connect dumps its entire tool list into your agent’s context on every single request. Three servers can be around 24,000 tokens of definitions before you have typed a word. Connect ten and you are carrying a small novel of tool schemas into every turn.

The obvious cost is money: those are input tokens, and you pay them on every call. Our calculator will show you the monthly bill, and it is bigger than people expect.

But the bill is not the real problem.

Context is where your model thinks

A context window is not storage. It is the space the model actually reasons in, and it has a quality budget, not just a size limit. As you fill it, accuracy drops, well before you hit the big number printed on the box. Researchers have a name for it now: context rot. “Lost in the Middle” showed models reliably miss information buried in a long context, and more recent work shows the degradation grows steadily as the input does.

So picture what tool bloat actually does. Before your agent reads your request, it spends a slice of its sharpest attention reading a list of every tool it might use. On a hard task, that can be the difference between getting it right and getting it almost right.

You have felt this and blamed the model. Sometimes it was the tool list.

The part that is hard to copy

The move is simple to say and harder to prove: cut the bloat without losing the tools.

Toolport puts one gateway in front of every server and advertises three meta-tools instead of hundreds. Your agent searches for the tool it needs and calls it on demand. No tool is hidden; the full set is one search away. Context stays flat no matter how many servers you connect.

The fair question is whether the agent gets worse when it can’t see every tool up front. So we measured it, end to end, on a frontier model, graded for correct answers and not just task completion: up to 91% fewer tokens at the same task success. Same answers, a fraction of the context.

That last part is the one that matters. Anyone can save you tokens by hiding things. The proof that you did not lose anything is the receipt.

What you get back

  • A sharper agent, because its context is not full of tools before you have asked anything.
  • A flatter bill, steady as you add servers instead of climbing with each one.
  • Often a cheaper model. A lot of the reason people reach for the biggest model is to brute-force through a cluttered context. Clean it up and the smaller one frequently keeps pace.

Set up your tools once. Let your agent reach for them when it needs them. Give the context budget back to the work.

Try Toolport, free, local, and your keys never leave your machine.

Download Toolport Star on GitHub