MCP vs CLI vs Python Tools
When building a remote Python agent, when should you reach for MCP servers, CLI tools, or native Python functions? A practical comparison of trade-offs.
When do we use MCP vs CLI vs native Python tooling?
I am building a remote agent in Python (Pydantic AI) for a small team of 5 users. Users currently communicate with this service via email, so the code runs on a remote Linux VM, which means I can theoretically mix and match any 'tooling' solution: CLI, MCP, and standard Python functions as tools.
I am also hyper-aware of context rot and of how a large number of tools can overwhelm models, so when I read some HN posts discussing these topics, I had to write another summary of how I understand things.
MCP
You use MCP when:
- You need dynamic tool discovery — your agent connects to systems that expose capabilities at runtime rather than build time.
- You're reusing existing MCP servers from an ecosystem
- You require auth or persistent connectors (shared connection pooling to a database is a particular strength of MCP)
- You need progressive tool discovery to avoid the input token tax. Loading dozens of tools is harmful to the model's response quality; with MCP you can provide a single tool called `search_tools`, which conditionally finds the relevant tools based on the context.
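To make the `search_tools` idea concrete, here is a minimal sketch of the progressive-discovery pattern: the model is shown only one tool, and full definitions are surfaced on demand. The registry contents and the keyword matching are illustrative assumptions, not part of the MCP spec.

```python
# Hypothetical tool registry: name -> short description + match keywords.
# In a real MCP setup these would come from the server's tool listing.
TOOL_REGISTRY = {
    "send_email": {
        "desc": "Send an email to a user.",
        "keywords": {"email", "notify", "message"},
    },
    "query_orders": {
        "desc": "Query the orders database.",
        "keywords": {"order", "purchase", "sql"},
    },
    "render_chart": {
        "desc": "Render a chart from tabular data.",
        "keywords": {"plot", "chart", "graph"},
    },
}

def search_tools(query: str, limit: int = 3) -> dict[str, str]:
    """Return descriptions of tools whose keywords overlap the query."""
    words = set(query.lower().split())
    hits = {
        name: spec["desc"]
        for name, spec in TOOL_REGISTRY.items()
        if words & spec["keywords"]
    }
    return dict(list(hits.items())[:limit])
```

The point is that only the matched subset of definitions ever enters the context window; the other tool descriptions cost zero tokens until they are needed.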
CLI
- I have noticed people in early 2026 reverting to CLI instead of MCP because Unix primitives already solve composability elegantly. Community benchmarks showed CLI achieving 28% higher task completion scores with roughly the same token count, and a 33% better token efficiency score compared to MCP.
- This makes sense because the CLI is self-documenting (via `--help` and `man` pages) and forces the LLM to use targeted queries (like piping output through `grep` or `jq`) rather than relying on an MCP server to dump an entire pre-formatted JSON structure.
- As with anything, there is nuance here. Each CLI call spins up its own subprocess, which is memory- and CPU-intensive when serving more than a few users at once; and if a tool is called very frequently, spawning a new process over and over is wasteful.
- Definitely should be used when the CLI is clearly better at its job than any Python wrapper (think `gh`, `ffmpeg`, `git`, `curl`, `docker`)
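When a CLI really is the right tool, a thin async wrapper keeps the subprocess cost visible and the output bounded before it reaches the model. A minimal sketch, assuming a Linux host; the `MAX_OUTPUT` limit is an arbitrary illustrative value:

```python
import asyncio

# Cap how much tool output is fed back into the model's context.
MAX_OUTPUT = 2000

async def run_cli(*argv: str) -> str:
    """Run a CLI tool in a subprocess and return truncated output.

    Each call pays the fork/exec cost of a fresh OS process, which is
    exactly the overhead discussed above.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    out, _ = await proc.communicate()
    return out.decode(errors="replace")[:MAX_OUTPUT]

# e.g. asyncio.run(run_cli("git", "--version"))
```

Encouraging the model to pass targeted flags (or pipe through `grep`/`jq` in a shell wrapper) keeps the returned text small even before truncation kicks in.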
Python tools
- Parallel tool execution is a strength for Python tools. Because Pydantic AI uses `asyncio`, executing 5 native tools in parallel simply schedules 5 microscopic coroutines on the existing event loop - essentially free. Conversely, if an agent calls 5 CLI tools at once, the Linux VM has to spin up 5 completely separate OS-level subprocesses (fork/exec), which creates severe memory and CPU overhead.
- MCP adds serialization and IPC overhead on every call even when the server handles requests concurrently - it's faster than spawning subprocesses, but slower than native async Python, where you skip the pipe entirely.
- Less distributed system complexity
- Lower Execution Overhead: MCP requires serializing requests into JSON-RPC and passing them over `stdio`, while CLI requires the OS to spin up a new subprocess. Native Python avoids this entirely by executing in shared memory.
- Less Debugging Friction: native Python tools give you continuous stack traces and easy IDE debugging. With custom MCP or CLI tools, you are forced to debug broken `stdio` pipes, monitor divided log streams, and deal with serialization errors between processes.
- Connection pooling is also a strength of native Python (concurrent users can share a single database connection pool)
- This isn't a perfect solution by any means. Tool context bloat is a real concern with this approach, because a native Python tool definition costs just as many tokens in the system prompt as an MCP tool definition does, and there is no good Python-native `search_tools` equivalent.
- Also, if you are not using Python as a backend, this is completely moot.
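The parallel-execution point above can be sketched directly: five native async tools run concurrently on one event loop with `asyncio.gather`, with no fork/exec and no IPC. The tools here are hypothetical stand-ins for I/O-bound calls:

```python
import asyncio

async def get_weather(city: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for an I/O-bound API call
    return f"{city}: sunny"

async def get_stock(ticker: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for an I/O-bound API call
    return f"{ticker}: 100.0"

async def main() -> list[str]:
    # Five coroutines scheduled on the existing event loop: they all
    # overlap their waits, so total wall time is ~0.1s, not ~0.5s.
    return await asyncio.gather(
        get_weather("Berlin"),
        get_weather("Tokyo"),
        get_stock("AAPL"),
        get_stock("MSFT"),
        get_stock("NVDA"),
    )

results = asyncio.run(main())
```

If each of these were a CLI call instead, the same fan-out would mean five simultaneous fork/exec cycles per agent turn.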
These days, I find myself authoring my own tools, so I control output context directly. Blog posts like "Stop Burning Your Context Window" are interesting because their solution compresses MCP tool output by >90% - this is relevant when we are not in control of the CLI or MCP tools (i.e. using a 3rd-party MCP server or CLI tool).
To answer the initial question of which approach to use, the answer is unfortunately "it depends" (just like life, the answer is messy and nuanced). If I am building a remote Python/Pydantic agent, the setup might be something like:
- Start with only Python tools
- Partially incorporate MCP if we need `search_tools` (too many tools) or need persistent state or auth
- Only reach for CLI when the tool genuinely has no Python equivalent (`ffmpeg`, `imagemagick`, system-level binaries), and even then, allowlist permitted commands and truncate output.
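The "allowlist and truncate" step might look something like the sketch below. The allowlist contents, the `echo` entry (included purely so the happy path is demonstrable), and the limits are illustrative assumptions:

```python
import shlex
import subprocess

# Only these binaries may be invoked by the agent ("echo" is here
# just for demonstration purposes).
ALLOWED = {"ffmpeg", "git", "curl", "echo"}
MAX_CHARS = 4000  # cap output before it enters the context window

def safe_cli(command: str) -> str:
    """Run a CLI command only if its binary is allowlisted; truncate output."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        raise PermissionError(f"command not allowlisted: {argv[:1]}")
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=60
    )
    return (result.stdout + result.stderr)[:MAX_CHARS]
```

Note that `shlex.split` plus passing a list (not a string) to `subprocess.run` also avoids shell injection via the model's arguments, at the cost of losing pipes and redirection.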
For local tooling in Claude Code (i.e. where you control and oversee the machine executing tool calls), this may be reversed: you probably start with CLI and add 3rd-party or custom MCP servers when no CLI exists. No Python option really exists here.
Notice there is no mention of Skills yet. This is for another post.
Sources: