MCP, Skills, and Friends
MCP servers, skills, hooks, and subagents — the tools that extend an agent harness beyond hardcoded tool schemas.
As stated earlier, there were some issues with the approach above (injecting tool schemas directly into every request):
- Tool schema bloat: stuffing dozens of tool definitions into every request is expensive and eats context. It can also confuse the agent.
- Integration complexity: what if your tool requires complex adaptors for real systems (databases, APIs, auth, caching)? e.g. how do you pass auth credentials?
- Workflow reuse: what if you want a reusable playbook ("anytime you mention time off, check the latest policy for the user's country") instead of a tool? This info is only relevant in specific circumstances; in all other requests, these details unnecessarily bloat the context
Several new approaches were introduced to combat some of these shortcomings.
Model Context Protocol (MCP)
Tools with state and secrets access; managed by the harness, not hardcoded into the API call
A server exists that exposes a list of functions (tools) it has. You can create a server yourself, or use one provided by a 3rd-party provider. It's a server because:
- it responds to requests like "what tools do you have?"
- can execute functions independently (another process/server)
- it can be stateful (connections, caches, indexes, auth sessions)
You tell the harness about these servers. The harness will then send tools/list (before LLM call) and tools/call (following LLM response) requests to the MCP server to fetch the info it needs.
Detailed JSON-RPC
# MCP is JSON-RPC 2.0. The two key methods are:
# - tools/list (discover tools)
# - tools/call (invoke a tool)
──────────────────────────────────────────────────────────────────────────────
1) DISCOVER TOOLS: tools/list
──────────────────────────────────────────────────────────────────────────────
CLIENT → MCP SERVER (JSON-RPC request)
{
"jsonrpc": "2.0",
"id": "req_1",
"method": "tools/list",
"params": {}
}
MCP SERVER → CLIENT (JSON-RPC response)
{
"jsonrpc": "2.0",
"id": "req_1",
"result": {
"tools": [
{
"name": "search_issues",
"description": "Search issues in TrackerX",
"inputSchema": {
"type": "object",
"properties": {
"query": { "type": "string" },
"limit": { "type": "integer", "default": 10 }
},
"required": ["query"]
}
},
{
"name": "get_issue",
"description": "Fetch a single issue by id",
"inputSchema": {
"type": "object",
"properties": {
"id": { "type": "string" }
},
"required": ["id"]
}
}
]
}
}
# (Client then maps MCP inputSchema → model input_schema and includes these tools
# in the LLM request as normal tools.)
──────────────────────────────────────────────────────────────────────────────
2) CALL A TOOL: tools/call
──────────────────────────────────────────────────────────────────────────────
CLIENT → MCP SERVER (JSON-RPC request)
{
"jsonrpc": "2.0",
"id": "req_2",
"method": "tools/call",
"params": {
"name": "search_issues",
"arguments": {
"query": "auth bug",
"limit": 5
}
}
}
MCP SERVER → CLIENT (JSON-RPC response)
{
"jsonrpc": "2.0",
"id": "req_2",
"result": {
"content": [
{
"type": "text",
"text": "Found 2 issues:\n- AUTH-123: Login fails on Safari\n- AUTH-987: Token refresh loop"
}
],
"isError": false
}
}
# (Client then wraps that content into the LLM's tool_result shape and sends back
# as part of the conversation history.)
──────────────────────────────────────────────────────────────────────────────
3) (OPTIONAL) ERROR EXAMPLE
──────────────────────────────────────────────────────────────────────────────
CLIENT → MCP SERVER
{
"jsonrpc": "2.0",
"id": "req_3",
"method": "tools/call",
"params": {
"name": "get_issue",
"arguments": { "id": "" }
}
}
MCP SERVER → CLIENT
{
"jsonrpc": "2.0",
"id": "req_3",
"result": {
"content": [
{ "type": "text", "text": "id is required" }
],
"isError": true
}
}
Once the MCP client (Claude Code) has called tools/list, the tools it sends to the LLM contain the same info as tools defined 'normally': name, description, and input schema, just as before.
"tools": [
{ "name": "foo", "description": "...", "input_schema": { ... } },
{ "name": "bar", "description": "...", "input_schema": { ... } }
]
The difference is where those tool definitions live:
- Without MCP: you (the harness author) hardcode tool schemas into every API request.
- With MCP: Claude Code fetches tool schemas from the server (once / on change), and then decides how much of that to load into the model context.
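To make that glue concrete, here's a minimal sketch of the harness side, assuming a stdio server like the one later in this section. The helper names (`send_jsonrpc`, `mcp_tools_for_llm`, `handle_tool_use`) are mine, not a real client library:

```python
import itertools
import json
import subprocess

# Launch the stdio server as a subprocess (mcp_server.py is the example below)
proc = subprocess.Popen(
    ["python", "mcp_server.py"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
_ids = itertools.count(1)

def send_jsonrpc(method, params):
    """One line-delimited JSON-RPC request in, one response out (happy path only)."""
    req = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    proc.stdin.write(json.dumps(req) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())["result"]

def mcp_tools_for_llm():
    """tools/list -> the `tools` array shape the model API expects."""
    return [
        {
            "name": t["name"],
            "description": t.get("description", ""),
            "input_schema": t["inputSchema"],  # MCP inputSchema -> model input_schema
        }
        for t in send_jsonrpc("tools/list", {})["tools"]
    ]

def handle_tool_use(tool_use):
    """Forward a model tool_use block to the server, wrap the reply as a tool_result."""
    result = send_jsonrpc(
        "tools/call",
        {"name": tool_use["name"], "arguments": tool_use["input"]},
    )
    return {
        "type": "tool_result",
        "tool_use_id": tool_use["id"],
        "content": result["content"],
        "is_error": result.get("isError", False),
    }
```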
Does your server have to be 'running' in the background? No. While there is an HTTP transport for remote servers, you can also have a stdio server that the harness launches and calls as a subprocess. See below for an example.
A Python `stdio` server
# pseudocode: MCP stdio server (JSON-RPC 2.0)
# - reads one JSON object per line from stdin
# - writes one JSON object per line to stdout
# - exposes tools/list and tools/call
import sys
import json
import os
# ---------------------------
# helpers
# ---------------------------
def read_json_line():
line = sys.stdin.readline()
if not line:
return None
return json.loads(line)
def write_json(obj):
sys.stdout.write(json.dumps(obj) + "\n")
sys.stdout.flush()
def reply_ok(req_id, result_obj):
write_json({
"jsonrpc": "2.0",
"id": req_id,
"result": result_obj
})
def reply_err(req_id, code, message, data=None):
err = {"code": code, "message": message}
if data is not None:
err["data"] = data
write_json({
"jsonrpc": "2.0",
"id": req_id,
"error": err
})
# ---------------------------
# server state / secrets
# ---------------------------
DB_URL = os.getenv("DB_URL") # server-only secret
API_KEY = os.getenv("VENDOR_API_KEY") # server-only secret
db = db_connect(DB_URL) # pseudocode
http = http_client() # pseudocode
TOOLS = [
{
"name": "get_customer_by_id",
"description": "Fetch a customer record by internal id",
"inputSchema": {
"type": "object",
"properties": {"customer_id": {"type": "integer"}},
"required": ["customer_id"]
}
},
{
"name": "vendor_search",
"description": "Search vendor tickets by query",
"inputSchema": {
"type": "object",
"properties": {
"query": {"type": "string"},
"limit": {"type": "integer", "default": 5}
},
"required": ["query"]
}
}
]
# ---------------------------
# tool implementations
# ---------------------------
def tool_get_customer_by_id(args):
cid = args["customer_id"]
row = db.query_one(
"SELECT id, name, email FROM customers WHERE id = ?",
[cid]
)
return {"content": [{"type": "text", "text": json.dumps(row)}], "isError": False}
def tool_vendor_search(args):
q = args["query"]
limit = args.get("limit", 5)
resp = http.get(
"https://api.vendor.com/tickets/search",
params={"q": q, "limit": limit},
headers={"Authorization": "Bearer " + API_KEY, "Accept": "application/json"},
timeout_ms=8000
)
if resp.status != 200:
return {"content": [{"type": "text", "text": resp.body_text}], "isError": True}
return {"content": [{"type": "text", "text": json.dumps(resp.json())}], "isError": False}
TOOL_HANDLERS = {
"get_customer_by_id": tool_get_customer_by_id,
"vendor_search": tool_vendor_search,
}
# ---------------------------
# JSON-RPC loop
# ---------------------------
def main():
while True:
msg = read_json_line()
if msg is None:
break # stdin closed
req_id = msg.get("id")
method = msg.get("method")
# notifications have no id; ignore or handle separately
if req_id is None:
continue
if method == "tools/list":
reply_ok(req_id, {"tools": TOOLS})
continue
if method == "tools/call":
params = msg.get("params") or {}
name = params.get("name")
args = params.get("arguments") or {}
handler = TOOL_HANDLERS.get(name)
if not handler:
reply_ok(req_id, {"content": [{"type": "text", "text": "Unknown tool"}], "isError": True})
continue
try:
result = handler(args)
reply_ok(req_id, result)
except Exception as e:
# don't leak secrets; return a sanitized error
reply_ok(req_id, {"content": [{"type": "text", "text": "Tool execution failed"}], "isError": True})
continue
reply_err(req_id, -32601, "Method not found")
if __name__ == "__main__":
main()
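Because it's just line-delimited JSON-RPC over stdio, you can smoke-test the server by hand before wiring it into a harness (assuming the pseudocode bits like `db_connect` are filled in):

```bash
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | python mcp_server.py
# -> {"jsonrpc": "2.0", "id": 1, "result": {"tools": [...]}}
```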
You would then add the server via:
claude mcp add \
--transport stdio \
--env DB_URL=$DB_URL \
--env VENDOR_API_KEY=$VENDOR_API_KEY \
myserver -- python mcp_server.py
Notice how we set secrets and credentials when adding the server definition. These are read by the machine running the server, not the LLM.
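Under the hood, that command just persists a server entry in config; conceptually something like this (shape simplified; check your harness's actual config file):

```json
{
  "mcpServers": {
    "myserver": {
      "command": "python",
      "args": ["mcp_server.py"],
      "env": {
        "DB_URL": "postgres://...",
        "VENDOR_API_KEY": "sk-..."
      }
    }
  }
}
```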
Further examples of custom servers that use these credentials are below.
Sample DB Auth & 3rd Party API keys
# PSEUDOCODE: MCP SERVER (stdio JSON-RPC) that:
#  - authenticates to a DATABASE (server holds creds)
#  - exposes safe tools (no raw SQL from the model)
#  - calls a 3rd-party API using an API key (server holds key)
#
# Notes:
#  - In MCP, the MODEL never receives your DB password / API key.
#  - The MCP server runs with secrets (env/config) and returns only results.
──────────────────────────────────────────────────────────────────────────────
A) CUSTOM MCP SERVER WITH DATABASE AUTH
──────────────────────────────────────────────────────────────────────────────
# startup
DB_HOST = env("DB_HOST")
DB_PORT = env("DB_PORT")
DB_NAME = env("DB_NAME")
DB_USER = env("DB_USER")
DB_PASS = env("DB_PASS")
db = db_connect(
host=DB_HOST, port=DB_PORT, database=DB_NAME,
user=DB_USER, password=DB_PASS,
pool_size=10
)
TOOLS = [
{
"name": "get_customer_by_id",
"description": "Fetch a customer record by internal id",
"inputSchema": {
"type": "object",
"properties": {"customer_id": {"type": "integer"}},
"required": ["customer_id"]
}
},
{
"name": "search_orders",
"description": "Search orders by email (exact match) with limit",
"inputSchema": {
"type": "object",
"properties": {
"email": {"type": "string"},
"limit": {"type": "integer", "default": 20}
},
"required": ["email"]
}
}
]
# JSON-RPC loop (stdio)
while msg := read_json_line(stdin):
if msg.method == "tools/list":
reply(msg.id, { "tools": TOOLS })
elif msg.method == "tools/call":
tool = msg.params.name
args = msg.params.arguments
if tool == "get_customer_by_id":
# IMPORTANT: parameterized query, fixed SQL (no model SQL)
row = db.query_one(
"SELECT id, name, email, created_at FROM customers WHERE id = ?",
[args.customer_id]
)
reply(msg.id, {
"content": [{ "type": "text", "text": json(row) }],
"isError": false
})
elif tool == "search_orders":
rows = db.query_all(
"SELECT id, total, status, created_at FROM orders WHERE email = ? ORDER BY created_at DESC LIMIT ?",
[args.email, args.limit]
)
reply(msg.id, {
"content": [{ "type": "text", "text": json(rows) }],
"isError": false
})
else:
reply(msg.id, {
"content": [{ "type": "text", "text": "Unknown tool" }],
"isError": true
})
──────────────────────────────────────────────────────────────────────────────
B) MCP SERVER CALLING A 3RD-PARTY SERVICE WITH AN API KEY
──────────────────────────────────────────────────────────────────────────────
API_BASE = "https://api.vendor.com"
API_KEY = env("VENDOR_API_KEY") # stored only on server (injected via env / Claude Code config)
TOOLS += [
{
"name": "vendor_search",
"description": "Search Vendor tickets by query",
"inputSchema": {
"type": "object",
"properties": {
"query": {"type": "string"},
"limit": {"type": "integer", "default": 5}
},
"required": ["query"]
}
}
]
# inside tools/call handler
if tool == "vendor_search":
q = args.query
limit = args.limit
resp = http_get(
url = API_BASE + "/tickets/search",
params = { "q": q, "limit": limit },
headers = {
"Authorization": "Bearer " + API_KEY, # or "x-api-key": API_KEY
"Accept": "application/json"
},
timeout_ms = 8000
)
if resp.status != 200:
reply(msg.id, { "content":[{"type":"text","text": resp.body_text}], "isError": true })
else:
data = resp.json()
reply(msg.id, { "content":[{"type":"text","text": json(data)}], "isError": false })
──────────────────────────────────────────────────────────────────────────────
C) HOW SECRETS GET INTO THE SERVER (Claude Code side)
──────────────────────────────────────────────────────────────────────────────
# Local stdio server: pass secrets via env
claude mcp add --transport stdio --env DB_USER=... --env DB_PASS=... --env VENDOR_API_KEY=... myserver -- python mcp_server.py
# Remote HTTP server: pass API key via headers (client → server)
claude mcp add --transport http myremote https://mcp.vendor.com
# and configure headers (conceptually):
headers = { "Authorization": "Bearer <token>" }
# Either way:
# - Claude model sees tool schemas + names
# - Server sees secrets
# - Results returned are data-only (never echo secrets)
What the heck is stdio?
Notice above we set the transport arg to stdio. It is one of two options:
- `stdio`: server runs locally as a long-lived subprocess; JSON-RPC over stdin/stdout. Best for local/community wrappers and keeping secrets on your machine. State lives for the session (caches, open DB conns, in-memory indexes). Common pattern: `npx …` / `python …` stdio servers you run yourself.
- `http`: server runs remotely as a service; JSON-RPC over HTTP. Best for shared/team services and centralized integrations. State can be durable and shared across clients (memory/Redis/DB); auth via headers/OAuth/tokens. Vendor-hosted MCP servers are often HTTP (they run the service for you), although you could run your own local server on another port.
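For intuition, the http transport carries the same JSON-RPC bodies over POST requests. A simplified sketch (real servers may additionally require initialization/session headers; the URL and token are placeholders):

```bash
curl -s https://mcp.vendor.com \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
```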
When MCP tool lists get too big: Tool Search
If you attach a bunch of MCP servers, you're back to the original problem: "dozens/hundreds of tool descriptions sitting in context, even when idle".
Harnesses deal with this by deferring MCP tools and loading them on-demand via tool search once they'd consume "too much" of the context window. The default auto-trigger is when MCP tool descriptions exceed 10% of your context window, and it's configurable (e.g. in Claude Code, ENABLE_TOOL_SEARCH=auto:<N>).
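For example, conceptually (the exact semantics of `<N>` depend on your harness version; the threshold here is illustrative):

```bash
# defer MCP tools behind tool search once their descriptions
# would consume more than N% of the context window
ENABLE_TOOL_SEARCH=auto:5 claude
```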
Skills
A repeatable playbook read only when necessary; conditional prompt injection that may leverage tools
What if you want a repeatable playbook without paying the cost of carrying it around in every prompt?
Let's stick with the example referenced above: the user has inquired about time off eligibility. With a skill, you can inject relevant context depending on the request. So we can pass the model info about the policy, procedures, etc. without paying for it upfront, back when we didn't yet know the user's intent.
Sample Time Off `SKILL.md`
---
name: time-off
description: Determine time off / leave eligibility and next steps
---
# Skill: Time Off / Leave
When the user asks about time off, leave, vacation policy, EI, mat leave, etc:
1) Identify jurisdiction and employment type
- Country/province/state
- employee vs contractor
- union / public sector if relevant
- start date / tenure, if relevant
2) Ask only the minimum clarifying questions needed
- If jurisdiction is unknown: ask for it first
- If employer policy vs legal minimum is ambiguous: ask which they mean
3) Separate sources explicitly
- Legal minimums (statutory)
- Employer policy (contract/handbook)
- Practical advice (how people usually handle it)
4) Output format
- Short answer (1–2 sentences)
- Then bullets: "What's legally true", "What's policy-dependent", "What I'd do next"
- If high uncertainty: say exactly what detail is missing
5) Tool use guidance
- If there's a company handbook file locally, read it first
- Otherwise, use web search for statutory rules for the stated jurisdiction
- Do not speculate when dates/thresholds matter; ask or search
Notice the name and description frontmatter keys in the SKILL.md above. These are the only things the harness loads into its context when starting up.
Skills add structure and restraint — and are referenced only when that topic comes up.
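As a rough sketch of that startup scan (the folder layout and the injected wording here are illustrative, not Claude Code's actual internals):

```python
# Sketch: build the lightweight skill index a harness injects at startup.
# Only name + description are read; the skill body stays on disk until needed.

from pathlib import Path

def read_frontmatter(path):
    """Parse the `key: value` lines between the --- markers of a SKILL.md."""
    lines = path.read_text().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

def skill_index(root="skills"):
    """One line per skill; the model reads SKILL.md only when relevant."""
    entries = []
    for skill_md in sorted(Path(root).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        entries.append(f"- {meta.get('name')}: {meta.get('description')} ({skill_md})")
    return "Available skills (read the file when relevant):\n" + "\n".join(entries)
```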
Structure of a Skill
Can be a single SKILL.md file on its own, or it can have many elements.
my-skill/
├── SKILL.md # required entrypoint
├── template.md # optional: a fill-in template (can be any filename)
├── examples/ # optional: sample outputs / expected format
│ └── sample.md
├── reference.md # optional: deeper docs / API notes (any filename)
└── scripts/ # optional: executable helpers (bash/python/etc.)
└── validate.sh
Other than the SKILL.md, nothing else is required. In fact, none of the file and folder names matter either; this is just a convention for organization. The key is to mention these other files and scripts in the SKILL.md (including when and how to use them) so the model reliably uses them.
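For example, the SKILL.md body might reference its helpers like this (filenames from the tree above):

```markdown
## Resources
- Fill in `template.md` for the response layout.
- See `examples/sample.md` for the expected output format.
- After drafting, run `scripts/validate.sh <draft-file>` and fix anything it flags.
```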
Skill Nuances
Unlike tools (including MCP tools), skills are not inherited by subagents from the parent conversation by default. You must explicitly configure a subagent to have access to them.
Skills are not "stateful" either, which makes sense given that a skill just injects context based on the request; it's as stateful as any other message sent to the LLM in the conversation.
Finally, skills are suggestions. There is no way to actually enforce that a model uses a skill. Note that when using skills, CLAUDE.md becomes information about information — pointers to the tools, skills, and essential info the model needs on every request.
Hooks are enforcement. Skills are persuasion.
Skills can suggest bash. Tools can run bash. Hooks can enforce bash safety.
Skills in Practice
As I was writing this, an interesting post discussed skills in practice, where a common theme was that skills are not used unless explicitly asked for. Some notable comments:
found success in treating skills more like re-usable semi-deterministic functions and less like fingers-crossed prompts for random edge-cases
and from Soerensen:
The observation about agents not using skills without being explicitly asked resonates. In practice, I've found success treating skills as explicit "workflows" rather than background context.
The pattern that works: skills that represent complete, self-contained sequences - "do X, then Y, then Z, then verify" - with clear trigger conditions. The agent recognizes these as distinct modes of operation rather than optional reference material.
What doesn't work: skills as general guidelines or "best practices" documents. These get lost in context or ignored entirely because the agent has no clear signal for when to apply them.
The mental model shift: think of skills less like documentation and more like subroutines you'd explicitly invoke. If you wouldn't write a function for it, it probably shouldn't be a skill.
Even Vercel thinks that skills have their place - they work better for vertical, action-specific workflows that users explicitly trigger. For general guidance, Vercel and others have found that adding a skill's instructions to the general AGENTS.md works better.
You can also add a tool call (or even a pre-model pass that selects relevant skills) to load a skill if you find it's not being used.
Subagents
A 'clean' call to an LLM, where (presumably) only a subset of the conversation context is passed in order to achieve some specific goal.
Mentioned earlier - in Claude Code, the tool is called Task. When the harness sees it, it runs a subagent. You rely on the parent model to inject all of the relevant context into the subagent. This seems like it saves tokens in an agentic loop, but there are some issues I have found with them that are articulated nicely in this blog by Mario Zechner.
The first issue is that they are black boxes - you really can't see what the model passed to them or what is happening within (context transfer between agents is usually poor). Mario also states that using a sub-agent mid-session for context gathering is a sign you didn't plan ahead. Instead, he argues, you should create an artefact that the single agent can use ahead of time, since agents are bad at knowing which context is relevant to send to other agents (he likewise argues that using subagents to implement various features in parallel is an anti-pattern).[^1]
There are exceptions - something like a 'code review' subagent genuinely has its merits.
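For intuition, spawning that code-review subagent is just the parent model emitting a tool_use block for Task, roughly this shape (simplified; field values are illustrative):

```json
{
  "type": "tool_use",
  "id": "toolu_01",
  "name": "Task",
  "input": {
    "description": "Review auth changes",
    "prompt": "Review the diff in src/auth/ for security issues. Report findings; do not edit files.",
    "subagent_type": "code-reviewer"
  }
}
```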
[^1]: Note that this could all change - the blog I mentioned is ancient (Nov 2025) and there are rumblings that swarm workflows will be the big thing in 2026, which would suggest context transfer between agents will be fixed. Mario's stance: "Spawning multiple sub-agents to implement various features in parallel is an anti-pattern in my book and doesn't work."
Hooks
Note: the below does not apply to harnesses like Codex that do not support hooks. Also, tools like OpenCode use different event names, but the principle still applies.
Things to run when a harness event is triggered (and in CC, everything is an event)
The list of events is here. All we need to remember is that whenever the harness (CC) takes an action, it emits an event type onto which we can hook some action of our own.
| Event | Description | Matcher Support |
|---|---|---|
| `PreToolUse` | Runs before a tool call is executed | Yes (tool name) |
| `PostToolUse` | Runs after a tool completes successfully | Yes (tool name) |
| `PostToolUseFailure` | Runs after a tool call fails | Yes (tool name) |
| `PermissionRequest` | Runs when a permission dialog is about to be shown | Yes (tool name) |
| `UserPromptSubmit` | Runs when the user submits a prompt, before Claude processes it | No |
| `Stop` | Runs when the main Claude Code agent finishes responding | No |
| `SubagentStop` | Runs when a subagent (Task tool) finishes responding | No |
| `Notification` | Runs when Claude Code sends a notification | Yes (notification type) |
| `PreCompact` | Runs before a compact operation | Yes (manual or auto) |
| `SessionStart` | Runs when a new session starts or an existing one resumes | Yes (startup, resume, clear, compact) |
| `SessionEnd` | Runs when a session ends | No |
There are 'matchers' with which you can filter the hook even further, so that not every PreToolUse event runs the hook, only those where the tool is, say, Bash, Write, or Edit.
Some common PreToolUse / PostToolUse matchers
- `Bash` — Shell commands
- `Read` — File reading
- `Write` — File writing
- `Edit` — File editing
- `MultiEdit` — Multi-file editing
- `Glob` — File pattern matching
- `Grep` — Content search
- `Task` — Subagent tasks
- `WebFetch` / `WebSearch` — Web operations
- `Notebook.*` — Notebook operations (regex)
And some Notification matchers:
- `permission_prompt` — Permission requests
- `idle_prompt` — Waiting for user input (60+ seconds idle)
- `auth_success` — Authentication success
- `elicitation_dialog` — MCP tool elicitation input needed
Docs for CC hooks here
Honestly, other than understanding the events and that you can do 'anything' in response to an event, I just ask the LLM itself to create a hook. This seems to me to be the one differentiator for CC vs other tools - its event system.
Sample Hooks
1. UserPromptSubmit — force skill activation
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/skill-router.sh"
}
]
}
]
}
}
#!/bin/bash
INPUT=$(cat)
PROMPT=$(echo "$INPUT" | jq -r '.prompt')
if echo "$PROMPT" | grep -qiE '(test|spec|coverage)'; then
SKILL="testing-patterns"
elif echo "$PROMPT" | grep -qiE '(api|endpoint|route)'; then
SKILL="backend-guidelines"
else
exit 0
fi
echo "MANDATORY: Use Skill($SKILL) BEFORE responding. Do NOT skip this step."
exit 0
Notice how echo prints to stdout; for a UserPromptSubmit hook, that output is injected into the conversation as if someone typed it. (In the next example, exiting with code 2 blocks the action and feeds stderr back to Claude.)
2. PreToolUse — block dangerous commands
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "jq -r '.tool_input.command' | grep -qiE 'rm -rf|drop table|force push' && echo 'Blocked: dangerous command' >&2 && exit 2 || exit 0"
}
]
}
]
}
}
3. SessionStart — inject project context
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "command",
"command": "echo \"Recent changes:\n$(cd \"$CLAUDE_PROJECT_DIR\" && git log --oneline -5 2>/dev/null)\""
}
]
}
]
}
}
4. Notification — play a sound when Claude needs input
{
"hooks": {
"Notification": [
{
"matcher": "idle_prompt",
"hooks": [
{
"type": "command",
"command": "afplay /System/Library/Sounds/Glass.aiff"
}
]
}
]
}
}
Skills vs Hooks vs MCP vs Subagents
Must it run regardless of what the model wants?
│
┌─────────┴─────────┐
YES NO
│ │
▼ ▼
HOOKS Needs state, secrets,
(enforce, block, or external connections?
log, validate) │
¹ ┌───────┴───────┐
YES NO
│ │
▼ ▼
MCP SERVER ¹ Reusable across
(remote or requests?
local+env) │
┌───────┴───────┐
YES NO
│ │
▼ ▼
Reducible to a SUBAGENT
function call? (isolated task,
│ fresh context)
┌────────┴────────┐
YES NO
│ │
▼ ▼
MCP SERVER ¹ SKILL
(local stdio, (playbook /
callable recipe /
function) procedure) ¹,²
¹ These compose: a Hook can block an MCP call
(PreToolUse), a Hook can force a Skill to load
(UserPromptSubmit → skill-router.sh), and a
Skill can suggest calling an MCP tool.
They layer rather than compete.
² Skills can contain helper scripts (e.g. scripts/validate.sh)
but these aren't standalone tools — the model runs them
through bash as part of following the skill's instructions.
If the script could stand alone as input → output with its
own schema, it probably belongs as an MCP tool instead.