Skip to main content

Command Palette

Search for a command to run...

The Model Context Protocol (MCP): Building the USB for the Agentic Era

Published
12 min read

The Model Context Protocol (MCP): Building the USB for the Agentic Era

Introduction: The Fragmented Reality of AI Infrastructure

In the last two years, we have moved from "chatting with a model" to "building agents that operate on our behalf." However, we have hit an invisible wall: data and tool fragmentation. Every time we want an LLM (Large Language Model) to access a database, read a local file, or interact with a third-party API, we end up writing ad-hoc "glue code."

We have created context silos. My Slack agent doesn't know what my GitHub agent is doing, and neither has a standard way to ask for my production database schema without me implementing a specific endpoint for it.

The Evolution of Integration: From SOAP to MCP

To understand why MCP is revolutionary, we must look at the history of how we've connected systems. In the early 2000s, we had SOAP—heavy, XML-based, and rigid. Then came REST, which simplified the web with JSON and HTTP verbs. While REST was a leap forward for human-driven frontend-to-backend communication, it wasn't designed for non-deterministic agents.

When we give an LLM a REST API, we are essentially asking it to read a manual (the API docs) and then write the code to call it. This is inefficient. The LLM has to parse the documentation, figure out the parameters, and then generate a call that we, the developers, have to catch, execute, and return.

MCP short-circuits this. Instead of the LLM guessing how to call an API, the MCP Server describes itself to the LLM in a way that the model natively understands. It's not just an API; it's a contextual interface.

The Contextual Technical Debt

Every custom integration we build for an AI agent is a piece of technical debt. If tomorrow we switch from GPT-4o to Claude 3.5 Sonnet, or if we decide to use a local model like Llama 3, many of our custom integrations (which depend on how a specific model "understands" tool system prompts) could fail or behave erratically.

MCP standardizes the "contract" between the model and the tool. By adopting MCP, we are decoupling our business capabilities from the whims of model updates. As Staff Engineers, our priority is long-term stability and interoperability. MCP gives us an abstraction layer that will survive the next generation of LLMs.


1. What is MCP? The "USB" Analogy

Imagine a world before USB. If you had a printer, you needed a parallel port. If you had a mouse, a serial port. If you had a joystick, a game port. MCP eliminates this friction for AI agents.

MCP is an open protocol that allows developers to build tool and data servers that can be consumed by any MCP-compatible client (like Claude Desktop, IDEs, or your own agent orchestrators).

The Core Specification

At its heart, MCP is a JSON-RPC 2.0 based protocol. It defines a clear separation between:

  1. MCP Hosts: The applications (like an IDE or a CLI) that want to provide context to a model.
  2. MCP Clients: The interface within the host that initiates the connection.
  3. MCP Servers: The specialized services that expose specific resources (data), prompts (templates), and tools (executable functions).
+-----------------+      +-----------------+      +-------------------+
|    MCP Host     |      |   MCP Client    |      |    MCP Server     |
| (Claude, IDE)   |<---->|  (Internal)     |<---->| (GitHub, SQL, FS) |
+-----------------+      +-----------------+      +-------------------+
        ^                         |                        |
        |                         |                        |
        +-------------------------+------------------------+
                  Standardized JSON-RPC Communication

2. Deep Dive: The Architecture

For a Staff Engineer, the value of MCP is not just in convenience, but in infrastructure abstraction. MCP defines three main primitives:

  1. Resources: Read-only data. Think of this as the "GET" of REST. It can be file content, a SQL table row, or system logs.
  2. Prompts: Pre-configured templates that help the model understand how to interact with the data.
  3. Tools: Executable functions. This is where the magic happens. The model decides which tool to call, and the MCP server executes the logic in the environment where the data resides.

JSON-RPC: The Wire Protocol

MCP isn't magic; it's engineering. It uses JSON-RPC 2.0, a stateless, light-weight remote procedure call protocol. Let's look at what's actually happening on the wire.

When a host (like an IDE) wants to know what a server can do, it sends a list_tools request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list",
  "params": {}
}

The server responds with a structured schema:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "query_inventory",
        "description": "Checks the stock levels for a specific SKU",
        "inputSchema": {
          "type": "object",
          "properties": {
            "sku": { "type": "string" }
          },
          "required": ["sku"]
        }
      }
    ]
  }
}

This strict schema adherence is what makes MCP robust. The model isn't just "guessing" parameters; it is constrained by the JSON Schema provided by the server.

Transport Layers: Stdio vs. SSE

One of the most brilliant design decisions of MCP is transport flexibility.

  • Stdio (Standard Input/Output): Ideal for local agents. The host (like Claude Desktop or VS Code) starts the server as a child process. Communication occurs via stdin and stdout. It's incredibly fast, secure because it happens locally, and requires no network configuration, firewalls, or complex API tokens. It's the equivalent of plugging a keyboard directly into a USB port.
  • SSE (Server-Sent Events): Designed for remote servers. It allows a persistent connection over HTTP. The client sends requests via POST and receives server updates through an SSE stream. This is ideal for cloud-based tools or when you want multiple agents to share the same centralized context server.

3. Why This Matters for Infrastructure Design

In traditional agent architecture, the flow was: User -> Orchestrator -> Custom Tooling -> LLM -> Custom Output Parser -> Execution.

With MCP, infrastructure is drastically simplified. The orchestrator no longer needs to know the implementation details of every tool. It only needs to be an MCP Host.

Decoupling Context from Logic

MCP allows us to decouple the reasoning (LLM) from the environment (Data/Tools). This means:

  1. Security by Isolation: Your MCP Server can run in a restricted Docker container or a specific VPC, exposing only the necessary tools via the protocol.
  2. Hot-Swapping Tools: You can update an MCP Server (adding new tools or updating logic) without changing a single line of code in your agent's core or changing the system prompt. The model "discovers" the new capabilities upon connection.

4. Implementation: Building an "Epic" MCP Server

Technical Example: Kubernetes Telemetry Server

## mcp_k8s_server.py
import asyncio
from mcp.server.models import InitializationOptions
from mcp.server import Notification, Server
from mcp.server.stdio import stdio_server
import mcp.types as types

## Initialize the MCP Server
server = Server("k8s-telemetry-pro")

@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    """List available Kubernetes diagnostic tools."""
    return [
        types.Tool(
            name="get_pod_logs",
            description="Fetches logs for a specific pod in a namespace",
            inputSchema={
                "type": "object",
                "properties": {
                    "pod_name": {"type": "string"},
                    "namespace": {"type": "string", "default": "default"},
                    "tail_lines": {"type": "integer", "default": 100}
                },
                "required": ["pod_name"]
            }
        ),
        types.Tool(
            name="analyze_cluster_health",
            description="Runs a comprehensive health check on the cluster nodes",
            inputSchema={
                "type": "object",
                "properties": {}
            }
        )
    ]

@server.call_tool()
async def handle_call_tool(
    name: str, 
    arguments: dict | None
) -> list[types.TextContent | types.ImageContent | types.EmbeddedResource]:
    if name == "get_pod_logs":
        pod = arguments.get("pod_name")
        ns = arguments.get("namespace")
        # Imagine actual k8s API logic here
        return [types.TextContent(type="text", text=f"Logs from {pod} in {ns}: [STDOUT] Service started...")]

    if name == "analyze_cluster_health":
        return [types.TextContent(type="text", text="Cluster Health: 98%. 2 nodes showing high memory pressure.")]

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="k8s-telemetry-pro",
                server_version="1.0.0",
                capabilities=server.get_capabilities(
                    notification_options=Notification.options(),
                    experimental_capabilities={},
                ),
            ),
        )

if __name__ == "__main__":
    asyncio.run(main())

TypeScript Implementation Example

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

const server = new Server(
  {
    name: "internal-api-gateway",
    version: "2.1.0",
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: "get_user_metrics",
        description: "Retrieve engagement metrics for a user by ID",
        inputSchema: {
          type: "object",
          properties: {
            userId: { type: "string" },
            period: { type: "string", enum: ["7d", "30d", "90d"] },
          },
          required: ["userId"],
        },
      },
    ],
  };
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "get_user_metrics") {
    const { userId, period } = request.params.arguments as { userId: string, period: string };

    // Logic to call your real internal microservice
    console.error(`Fetching metrics for ${userId} for ${period}`);

    return {
      content: [
        {
          type: "text",
          text: JSON.stringify({
            userId,
            engagementScore: 88,
            status: "active",
            lastSeen: new Date().toISOString()
          }),
        },
      ],
    };
  }

  throw new Error("Tool not found");
});

const transport = new StdioServerTransport();
await server.connect(transport);

This Node.js server can be packaged as an executable or a Docker container and connected to any MCP host in seconds.


5. Standardizing Tools: The Death of Ad-hoc API Wrappers

The End of Custom Integration Hell

Before MCP, if you wanted your agent to use Google Drive, Slack, and your internal PostgreSQL, you had to write three different authentication flows and three different tool-calling schemas.

With the MCP ecosystem, we are seeing a "Marketplace of Context".

  • Need GitHub integration? Use the official mcp-server-github.
  • Need Google Maps? Use mcp-server-google-maps.
  • Need local file access? Use mcp-server-filesystem.

6. Security and Governance at Scale

One of the biggest fears in companies is: "What if the agent deletes the database?".

MCP offers several layers of protection:

  1. Tool Granularity: The MCP server only exposes what you define. You can have a "Read-Only" server for production.
  2. Execution Isolation: The server runs where the data is, not where the LLM runs. The LLM never sees your database credentials; it only sees the JSON-RPC results.
  3. Audit Logs: By centralizing access through MCP servers, you can audit every call the model makes to your internal systems.

Zero-Trust Context

In an enterprise environment, MCP servers can be treated as microservices. You can apply standard AuthZ/AuthN patterns to the SSE transport layer, ensuring that only authorized agents can access sensitive corporate data.


7. The Conceptual Diagram: The Agentic OS

Think of the LLM as the CPU, and MCP as the motherboard's bus.

+-----------------------------------------------------------+
|                      USER INTERFACE                       |
|           (Chat, CLI, Automation Workflow)                 |
+---------------------------+-------------------------------+
                            |
           +----------------v----------------+
           |           AGENT HOST            |
           |   (Reasoning & Orchestration)   |
           +----------------+----------------+
                            |
        +-------------------v-------------------+
        |        MODEL CONTEXT PROTOCOL         |
        |  (The Standardized Interface / USB)   |
        +---------+---------+---------+---------+
                  |         |         |
      +-----------v---+ +---v-------+ +---v-----------+
      |  DATA SERVER  | | TOOL SERVER | | PROMPT SERVER |
      | (Postgres, S3)| | (CI/CD, API)| | (Knowledge B) |
      +---------------+ +-------------+ +---------------+

8. Deployment Strategies: From Local Dev to Production

As platform engineers, we need to think about how this scales. We aren't going to ask every developer to manually configure their claude_desktop_config.json files. We need a way to distribute capabilities.

  1. The Sidecar Pattern: In Kubernetes environments, you can deploy MCP servers as sidecars next to your application pods. This allows the agent (which might be running in a separate pod) to access local resources without exposing them to the public network.
  2. MCP Gateways: We can build a "Context Gateway." A single entry point (via SSE) that routes requests to multiple internal MCP servers based on the model's needs.
  3. Local Stdio for CLI Tools: For DevOps engineers, Stdio-based MCP servers are transformative. You can have an MCP server that wraps your Terraform scripts or kubectl commands, allowing a terminal agent to perform complex operations with human supervision.

Operationalizing MCP

Monitoring is key. Since MCP uses JSON-RPC over Stdio or HTTP, we can intercept the streams for logging and observability. We can track:

  • Token Usage per Tool: Which tools are consuming the most tokens in prompts/responses?
  • Latency: How long is the local database query taking compared to the LLM's reasoning time?
  • Error Rates: Are models consistently providing invalid arguments for a specific tool? This is a signal to improve the tool's description or schema.

9. Advanced Patterns: Multi-Server Routing and Resource Templates

MCP is not limited to "tools." Resources are just as powerful. A resource can be dynamic. For example, you can expose a URI schema like logs://pod-name/container-name. When the model sees a reference to a pod, it can "open" that resource.

@server.list_resources()
async def handle_list_resources() -> list[types.Resource]:
    return [
        types.Resource(
            uri="db://inventory/schema",
            name="Current Database Schema",
            description="The live schema of the inventory database",
            mimeType="application/json"
        )
    ]

This allows the model to have a "map" of the world before it starts executing tools. It's the difference between entering a dark room and having a floor plan of the building.

The Power of Contextual Prompts

MCP also allows servers to expose Prompts. These aren't just strings; they are templates that can take arguments. Example: A "code-review" prompt that pulls in the latest PR diff and the company's style guide as context. Instead of the user writing a long prompt, they simply select the analyze-pr prompt from the MCP server.


10. Why MCP is the "USB of AI"

  1. Plug and Play: You connect a server and the model "learns" new skills instantly.
  2. Universality: It doesn't matter if you use Claude, GPT-4, or Llama 3 (via a compatible host); the protocol is the same.
  3. Simplicity: It's based on proven technologies: JSON-RPC, Stdio, and HTTP/SSE.
  4. Extensibility: Anyone can write a server in any language that supports JSON-RPC.

11. Security, Privacy, and the "Human-in-the-loop"

In the era of autonomous agents, control is the most valuable currency. MCP facilitates the implementation of Human-in-the-loop (HITL) policies.

Because the Host (the client application) is what ultimately executes the tool call suggested by the model, the Host can intercept sensitive actions. For example, if a model suggests using the delete_production_db tool, the MCP Host can display a confirmation dialog to the user before proceeding.

Data Sovereignty

With MCP, your data stays in your infrastructure. You don't have to upload your entire database schema or sensitive logs to a model provider's cloud. You only send the specific, minimal context required for a single turn of reasoning. This is a game-changer for industries like FinTech or HealthTech, where data residency and privacy are non-negotiable.


12. Conclusion: Building the "Agentic OS"

We are witnessing the birth of a new tech stack. If the LLM is the processor and MCP is the data bus, the "Agentic OS" is the software that orchestrates these pieces to solve real-world problems.

As Staff Engineers, we have the opportunity to define the standards of this new era. It's not just about how smart the model is, but how well it can interact with the ecosystem we've built over decades.

MCP gives us the common language. The "USB of AI" is already here. It's time to start connecting our systems.

Final Thoughts

The fragmentation of the AI landscape was a necessary phase of rapid innovation. But for AI to become a truly integrated part of our engineering workflows, we need stabilization. We need protocols. We need MCP.

Stop building silos. Start building servers. The future of software is agentic, and it’s connected via MCP.


Technical Appendix: Troubleshooting your MCP Server

If your server isn't appearing in your host, check the following:

  1. Pathing: Ensure the command to start your server is absolute in your config file.
  2. Environment Variables: Stdio servers inherit the environment of the host. Make sure your PATH and API keys are accessible.
  3. JSON-RPC Validity: Use a tool like mcp-inspector to verify your server's responses.

More from this blog

Antony Giomar

21 posts