How did MCPs evolve?

The Problem

The model needs a way to discover which MCP tools are available. In a typical setup, the agent loads every MCP tool definition into the model’s context upfront, often before the user has even written the first prompt, so the model can pick the right tool through direct tool calls.

The agent now has the tool definitions and can decide which one to use for the next task. Often a single task needs several tools, and this is where the agent loop comes in: after one MCP tool runs, its result is injected back into the context, the model reads it, and decides on the next tool call. This can duplicate large payloads across steps.

Anthropic illustrates this with a common workflow: the agent pulls a long meeting transcript from Google Drive, the full transcript lands in the model context, and then the model has to reuse that content to update something like a Salesforce record. In other words, the same big document ends up being carried through multiple steps as tokens, which is expensive and brittle.
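
To make the duplication concrete, here is a minimal sketch of that workflow with direct tool calls. Everything in it is an assumption for illustration: get_transcript and update_record are hypothetical stand-ins for the Google Drive and Salesforce tools, and the four-characters-per-token estimate is a rough rule of thumb, not a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate, assuming roughly four characters per token."""
    return max(1, len(text) // 4)


def get_transcript(doc_id: str) -> str:
    """Hypothetical stand-in for a Google Drive MCP tool."""
    return "Speaker A: status update. " * 2000


def update_record(record_id: str, notes: str) -> str:
    """Hypothetical stand-in for a Salesforce MCP tool."""
    return f"updated {record_id} with {len(notes)} characters"


def run_direct_tool_calls() -> int:
    """Return the approximate number of tokens that pass through the context."""
    context_tokens = 0

    # Step 1: the tool result is injected into the model's context.
    transcript = get_transcript("doc-123")
    context_tokens += estimate_tokens(transcript)

    # Step 2: the model writes the transcript out again as a tool argument.
    context_tokens += estimate_tokens(transcript)
    result = update_record("rec-456", transcript)

    # Step 3: the (small) confirmation is injected as well.
    context_tokens += estimate_tokens(result)
    return context_tokens


print(run_direct_tool_calls())
```

The transcript is counted twice: once when it comes back as a tool result, and once more when the model has to emit it as the argument of the next call.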

The improvement

Changing the architectural design. Instead of the model calling tools directly (tool call → JSON back into context → model decides the next call), the model writes a small script and runs it in a separate code execution environment. This way, the heavy lifting is done by the execution environment rather than inside the agent’s context window. The execution environment makes the MCP tool calls, transforms and filters the outputs, and returns only the final result to the model, which is the part that matters.
No longer treating MCP like one giant menu. Instead of loading every MCP tool definition up front, the agent treats the MCP servers as a browsable structure (like a file tree). The model can then navigate it and read only the tool definitions needed for the current task, on demand. This idea aligns with the well-known Cloudflare blog post.
Implementing the idea with ‘tool discovery’. Later, Anthropic released the Tool Search Tool (public beta on Nov 24, 2025, since promoted to GA), which gives the model a tool search capability. Instead of preloading every tool definition, the model searches for what is relevant and pulls only the necessary tool descriptions into the context, which is what lets agents work with many tools.
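
The discovery side of these ideas can be sketched in a few lines. This is only an illustration under stated assumptions: the catalog, the tool names, and the functions list_servers, list_tools, read_definition and search_tools are all hypothetical, and the keyword match stands in for whatever a real tool search does. It is not Anthropic's Tool Search Tool API.

```python
import json

# Hypothetical catalog laid out like a file tree: server -> tool -> definition.
# These names are illustrative and do not come from a real MCP server.
CATALOG = {
    "google_drive": {
        "get_document": {"description": "Fetch the text of a document by id"},
        "list_files": {"description": "List the files in a folder"},
    },
    "salesforce": {
        "update_record": {"description": "Update fields on a record"},
        "run_query": {"description": "Run a query against stored records"},
    },
}


def list_servers() -> list[str]:
    """Top level of the tree: server names only, no definitions."""
    return sorted(CATALOG)


def list_tools(server: str) -> list[str]:
    """Second level: tool names for one server."""
    return sorted(CATALOG[server])


def read_definition(path: str) -> dict:
    """Load a single definition on demand, addressed as 'server/tool'."""
    server, tool = path.split("/")
    return CATALOG[server][tool]


def search_tools(query: str) -> list[str]:
    """Naive keyword match, standing in for a real tool search."""
    words = query.lower().split()
    hits = []
    for server, tools in CATALOG.items():
        for name, definition in tools.items():
            haystack = f"{server} {name} {definition['description']}".lower()
            if any(word in haystack for word in words):
                hits.append(f"{server}/{name}")
    return sorted(hits)


def context_cost(paths: list[str]) -> int:
    """Characters of JSON that would enter the context for these definitions."""
    return sum(len(json.dumps(read_definition(p))) for p in paths)


all_paths = [f"{s}/{t}" for s in list_servers() for t in list_tools(s)]
relevant = search_tools("document")
print(relevant, context_cost(relevant), "of", context_cost(all_paths))
```

With four tools the saving is small, but the shape is the point: the model sees names first and pays for a full definition only when it decides to read one.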

The evolution

Anthropic then made programmatic tool calling generally available on February 17, 2026.

With programmatic tool calling, all tools start moving in the same direction as the newer MCP usage pattern. Given your defined tools (whether they come from MCP servers or are local, user-defined tools), when the agent decides it needs a tool, it does not issue a direct function call. Instead, it generates the code needed to call the tool inside a controlled execution environment. That code is responsible for calling the tools, handling loops and retries, joining results, and so on, and it returns only the final output to the model.

Sources