MCP Server Anti-Patterns to Avoid

See why wrapping every API endpoint as its own MCP tool is one of the most common mistakes in Model Context Protocol design, and how token overload undermines even well-intentioned servers.

The easiest way to ship a broken MCP server is to expose every available endpoint as a separate tool and hope the model figures out how to sequence them.

  • Endpoint-for-endpoint tool mapping overloads the context window with repeated descriptions and duplicate IDs.
  • Raw API responses can easily consume well over 180,000 tokens in a single call, effectively filling an entire context window.
  • Without explicit sequencing logic, the model is left guessing the right order, which leads to unreliable answers and failed tool chains.

This lesson is a preview from our Building Your First MCP Server and Client Course Online. Enroll in the course for detailed lessons, live instructor support, and project-based training.

Most of the bad MCP servers in the wild share the same origin story. Someone points at an existing API, wraps each endpoint in a tool, attaches a short description, and ships it. On paper, that seems efficient. In practice, it produces a server that drowns the model in data, provides no reliable path through that data, and forces users to babysit every interaction.

Why This Anti-Pattern Keeps Happening

The temptation is understandable. If there is already a running API with documented endpoints, wrapping those endpoints feels like the fastest way to get something functional. The problem is that an API designed for a traditional backend is almost never shaped the way a language model wants to consume it. Endpoints are granular, responses are verbose, and nothing in the raw design tells the model which call should come first.

When every call is a separate tool, you end up with a long list of similar-looking options. Each tool carries its own description, and that description is loaded into context every turn. Before the user has even asked a question, the working memory is already crowded with metadata.
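A quick back-of-the-envelope calculation makes the overhead concrete. The numbers below are illustrative, and the four-characters-per-token ratio is only a rough heuristic:

```python
# Rough cost of loading tool metadata into context on every turn.
# Figures are illustrative; ~4 characters per token is a common heuristic.

n_tools = 30       # one tool per wrapped endpoint
desc_chars = 400   # name + description + parameter schema per tool

tokens_per_turn = n_tools * desc_chars // 4
print(tokens_per_turn)  # tokens spent on metadata before the user asks anything
```

Thirty modest tool definitions already cost thousands of tokens per turn, and real endpoint schemas are usually much larger than 400 characters.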

What Overload Looks Like in Practice

Consider a SpaceX-style API as a concrete example. A single call to list launches can return hundreds of flights, each with duplicate IDs, images, Wikipedia links, and redundant unit conversions. Passed straight to a model, that one response measures roughly 184,000 tokens. On a model with a 200,000 token window, a single call swallows almost the entire budget.

Most of those tokens are pure noise from the model's perspective. It does not need image URLs, it does not need both meters and feet, and it rarely needs every legacy ID. Yet every one of those fields is billed on input, and every one of them steals attention from the content that actually matters.
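The simplest defense is to trim the response on the server before it ever reaches the model. The sketch below uses illustrative field names modeled on a SpaceX-style launch record, and a rough four-characters-per-token estimate:

```python
import json

# Fields the model actually needs; everything else is dropped server-side.
# Field names are illustrative, modeled on a SpaceX-style launches endpoint.
KEEP_FIELDS = {"name", "date_utc", "success", "details", "crew"}

def trim_launch(raw: dict) -> dict:
    """Strip image URLs, legacy IDs, and other noise from a launch record."""
    return {k: v for k, v in raw.items() if k in KEEP_FIELDS}

def rough_tokens(obj) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return len(json.dumps(obj)) // 4

# A single illustrative record; real responses carry far more of this noise.
raw = {
    "name": "Crew-5",
    "date_utc": "2022-10-05T16:00:00.000Z",
    "success": True,
    "details": "Fifth operational crew flight.",
    "crew": ["62dd7196202306255024d13c"],
    "links": {
        "patch": {"small": "https://example.com/patch-small.png",
                  "large": "https://example.com/patch-large.png"},
        "wikipedia": "https://en.wikipedia.org/wiki/SpaceX_Crew-5",
    },
    "fairings": None,
    "legacy_id": "5eb87d46ffd86e000604b388",
}

trimmed = trim_launch(raw)
print(rough_tokens(raw), "->", rough_tokens(trimmed))
```

Multiplied across hundreds of launches per response, this kind of server-side filtering is the difference between a usable context window and an exhausted one.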

The Sequencing Problem

Once you have a pile of narrow tools, you also have a coordination problem. Answering a question like "Who was on the crew of the last crewed mission?" requires finding the most recent crewed launch, picking up its ID, pulling crew references, and resolving each of those to a person. No single tool expresses that chain. The server is silently asking the model to reconstruct the logic on its own, and the results vary from run to run.

Patching the system prompt to describe the order usually fails. There is no guarantee the model will follow it, and every new workflow introduces another brittle instruction.
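Written out as code, the chain the model is being asked to improvise looks like this. The `get_launches` and `get_crew_member` helpers are hypothetical stand-ins for separate endpoint-wrapping tools:

```python
# The call chain the model must reconstruct on its own when each endpoint
# is a separate tool. Helper names here are hypothetical stand-ins.

def last_crewed_mission_crew(get_launches, get_crew_member):
    launches = get_launches()                          # 1. list all launches
    crewed = [l for l in launches if l.get("crew")]    # 2. filter to crewed
    latest = max(crewed, key=lambda l: l["date_utc"])  # 3. pick most recent
    # 4. resolve each crew reference to a person
    return [get_crew_member(cid)["name"] for cid in latest["crew"]]

# Illustrative stand-in data for the upstream API.
launches = [
    {"name": "Starlink-4", "date_utc": "2022-05-06", "crew": []},
    {"name": "Crew-5", "date_utc": "2022-10-05", "crew": ["c1", "c2"]},
]
people = {"c1": {"name": "A. Astronaut"}, "c2": {"name": "B. Astronaut"}}

names = last_crewed_mission_crew(lambda: launches, lambda cid: people[cid])
print(names)
```

Four deterministic steps are trivial in code, but a model improvising them from tool descriptions will get the order or the filtering wrong some fraction of the time.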

What to Do Instead

The fix is to design outcome-based tools that package the work on the server side. A single mission briefing call can pull the launch, rocket, launchpad, and crew data behind the scenes and return a clean markdown summary. The context window stays small, the model does not have to orchestrate anything, and the answers become predictable.
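A minimal sketch of that idea, assuming a SpaceX-style API: the `fetch_*` callables are hypothetical stand-ins for upstream API calls, and in a real server this one function would be registered as a single MCP tool rather than exposing each fetch separately.

```python
# Sketch of an outcome-based tool: one server-side function that aggregates
# several upstream lookups into a single compact markdown briefing.
# The fetch_* callables are hypothetical stand-ins for API calls.

def mission_briefing(launch_id, fetch_launch, fetch_rocket, fetch_crew_member):
    """Aggregate launch, rocket, and crew data into one markdown summary."""
    launch = fetch_launch(launch_id)
    rocket = fetch_rocket(launch["rocket"])
    crew = [fetch_crew_member(c)["name"] for c in launch.get("crew", [])]
    return "\n".join([
        f"# Mission Briefing: {launch['name']}",
        f"- Date: {launch['date_utc']}",
        f"- Rocket: {rocket['name']}",
        f"- Crew: {', '.join(crew) or 'uncrewed'}",
    ])

# Illustrative stand-in data for the upstream API.
launch_db = {"l1": {"name": "Crew Mission", "date_utc": "2022-10-05",
                    "rocket": "r1", "crew": ["c1"]}}
rocket_db = {"r1": {"name": "Falcon 9"}}
people_db = {"c1": {"name": "A. Astronaut"}}

briefing = mission_briefing("l1", launch_db.get, rocket_db.get, people_db.get)
print(briefing)
```

The model sees one tool with one clear purpose, and the sequencing, filtering, and field selection all happen where they belong: on the server.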

Think of the bad version as a warning. It shows what happens when a server respects the upstream API more than the downstream model. The good version starts from the opposite angle: figure out what users need, then shape the tools to deliver exactly that.

Wrapping every endpoint as its own tool is the single biggest anti-pattern in MCP server design. Strip away what the model does not need, bundle related calls into purposeful outcomes, and keep your tool surface small and intentional. Respect the context window, and the server becomes a tool the model can actually use.
