Skills vs CLI vs MCP: Which Do You Pick for AI Agents?
An agent is only as useful as the systems it can reach and act on. The model handles the thinking, but the connections to real systems are what turn that thinking into real outcomes. How you expose those systems is one of the most important choices you make when building an agent.
There are now three main ways to wire an agent into those systems: CLI, MCP, and Skills. Each solves a different problem, and they are often pitched as competing options. This post breaks down what each is good at, and when to layer them.
What Each One Is
CLI means giving the agent access to a shell. It runs commands like git log, gh pr create, or kubectl get pods. The agent composes commands, reads the output, and decides what to do next.
MCP (Model Context Protocol) is a standardized protocol introduced by Anthropic that connects AI models to external tools and data sources. It uses a client/server architecture over JSON-RPC. The agent discovers tools via JSON schemas at runtime. Think of it as USB-C for AI: one protocol, many tools.
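Under the hood, an MCP tool invocation is plain JSON-RPC 2.0. As a rough sketch (the tool name and arguments below are hypothetical, GitHub-flavored examples, not a real server's schema), a client's request looks like this:

```python
import json

def mcp_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 request an MCP client sends to invoke a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,                     # the server's response carries the same id
        "method": "tools/call",               # MCP's tool-invocation method
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool exposed by some server; discovered earlier via tools/list.
req = mcp_tool_call(1, "get_issue", {"owner": "octocat", "repo": "hello-world", "issue": 42})
print(req)
```

The schema the agent saw at discovery time tells it what `arguments` must contain; that discovery step is what makes MCP plug-and-play, and also where its token overhead lives.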
Skills are bundled instructions (a SKILL.md file) that tell the agent how to complete a workflow. They use progressive disclosure: metadata loads first (~100 tokens), full instructions load only when relevant. A Skill can orchestrate both CLI commands and MCP tools underneath.
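A minimal SKILL.md, assuming the format's YAML frontmatter (the workflow itself is a made-up example), looks roughly like this:

```markdown
---
name: weekly-digest
description: Compile a weekly summary of merged PRs and open incidents.
---

# Weekly digest

1. Run `gh pr list --state merged --json title,mergedAt` for the last 7 days.
2. Fetch open incidents from the tracker (via MCP if configured).
3. Write the summary using the template in templates/digest.md.
```

Only the frontmatter is resident at startup; the numbered steps and any referenced files load when the skill is actually invoked.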
The simplest way to think about it:
- CLI = the agent's hands (local execution)
- MCP = the agent's connections (external services)
- Skills = the agent's playbook (workflows and knowledge)
Where CLI Shines
CLI tools have one massive advantage: LLMs are already fluent in the shell.
Models have been trained on billions of lines of terminal interactions from Stack Overflow, GitHub repos, tutorials, and documentation. When you give your agent shell access to git, docker, curl, or jq, you are tapping into deeply learned patterns. The model does not need a schema to know that git log --oneline -10 shows the last 10 commits. And for CLIs the model has never seen before, it can still pick them up on the fly through --help, man pages, or structured JSON output.
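That on-the-fly discovery loop is easy to sketch. In this illustrative snippet the Python interpreter stands in for an arbitrary unfamiliar binary, since any CLI name here would be an assumption:

```python
import subprocess
import sys

def discover_cli(argv: list[str], max_chars: int = 2000) -> str:
    """Run `<tool> --help` and return truncated help text to feed back into context."""
    result = subprocess.run(argv + ["--help"], capture_output=True, text=True)
    return (result.stdout or result.stderr)[:max_chars]

# Stand-in for any unknown binary; a real agent would pass e.g. ["some-new-tool"].
help_text = discover_cli([sys.executable])
print(help_text.splitlines()[0])   # first line is typically a usage summary
```

The agent reads the help text like any other context, then composes a real invocation from it. No schema, no server, no registration step.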
This shows up in real benchmarks. Scalekit ran 75 benchmark runs comparing CLI and MCP for GitHub tasks. CLI achieved 100% reliability (25/25 runs). MCP managed 72% (18/25 runs). Every MCP failure was a TCP-level timeout to the remote server. The CLI agent did not have this problem because gh runs locally.
Token costs tell a similar story. For a simple "get repo info" call, CLI cost ~1,365 tokens and MCP cost ~44,026 tokens, a roughly 32x gap; across tasks the measured difference ranged from 10 to 32x.
Vercel saw this firsthand. They removed 80% of their agent's custom tools and replaced them with a bash tool and a filesystem tool. Their text-to-SQL agent went from 80% to 100% success rate, using fewer steps, fewer tokens, and less time. Their sales call summarization agent dropped from ~$1.00 to ~$0.25 per call on Claude Opus.
The Unix philosophy turns out to be perfect for agents. Small tools that do one thing well, composed through pipes. LLMs are surprisingly good at this because they have seen these patterns millions of times in training data.
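A toy version of that composition, run from Python for self-containment and assuming standard POSIX utilities are on the PATH (the log line is made up):

```python
import subprocess

# Classic small-tools pipeline an agent might compose:
# split words, sort, count duplicates, take the most frequent.
log = "error warn error info error warn"
pipeline = r"tr ' ' '\n' | sort | uniq -c | sort -rn | head -1"
top = subprocess.run(pipeline, input=log, capture_output=True,
                     text=True, shell=True).stdout.strip()
print(top)  # the most frequent token with its count
```

Each stage does one job, and the pipe is the only integration layer. That is exactly the shape of glue code LLMs have seen countless times.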
At the Ask 2026 conference, Perplexity CTO Denis Yarats announced the company is moving away from MCP toward APIs and CLIs, citing high context window consumption and clunky authentication as the core issues.
Where MCP Shines
CLI wins on efficiency, but MCP wins on reach.
Many services do not have CLIs. Slack, Notion, Salesforce, your internal CRM. MCP gives agents a standardized way to connect to these services without building custom integrations for each one.
MCP also handles authentication properly. When an agent acts on behalf of a user inside an organization, you need OAuth flows, scoped permissions, and audit logs. CLI was designed for a single developer on their own machine. MCP was designed for multi-tenant access.
Standardization is another strength. One MCP server works with Claude, GPT, Gemini, and any MCP-compatible client, so a single integration covers every agent your team uses. The standard itself is no longer Anthropic-owned either: in December 2025, MCP was donated to the Linux Foundation's Agentic AI Foundation, putting it under neutral governance.
The trade-off is real though. GitHub's Copilot MCP server exposes 43 tools. Every time the agent makes a tool call, the schemas for all 43 tools are part of the conversation context, even the ones it will never use. That is where the token overhead comes from.
Security is also a concern. OWASP published a dedicated MCP Top 10 covering risks like tool poisoning (malicious instructions hidden in tool descriptions), rug pulls (tools that change behavior silently after approval), and prompt injection via contextual payloads. In the MCPTox benchmark, o1-mini showed a 72.8% attack success rate against MCP tool poisoning. More capable models were often more vulnerable because the attack exploits their superior instruction-following abilities.
Where Skills Shine
Skills sit at a different layer than CLI or MCP. CLI and MCP are about access: how the agent reaches a system. Skills are about expertise: what the agent should do once it has access. A single Skill can call CLI commands, call MCP tools, or just contain instructions on its own, alongside the examples and helper scripts the agent needs to finish the job.
A CLI gives the agent a shell to run commands, whether they come from training data (git, docker, kubectl) or a newer tool the agent learns on the fly through --help, man pages, or structured JSON output (see the gws case study below). An MCP server exposes a list of endpoints it can call. A Skill goes further: it tells the agent the right sequence of steps to complete a task end to end. It encodes the workflow, not just the tool.
The efficiency comes from progressive disclosure. At startup, the agent only sees each skill's name and description (~100 tokens per skill). The full instructions load only when the agent determines the skill is relevant. If the skill references additional files or scripts, those load only when needed. You can ship many skills with the agent without burning through its context window.
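Progressive disclosure is simple to model: keep only name and description resident, and read the full file when a skill is selected. A minimal sketch (the file layout and crude frontmatter parsing are illustrative assumptions, not the real loader):

```python
from pathlib import Path
import tempfile

class SkillRegistry:
    """Keep only name/description resident; read full instructions on demand."""

    def __init__(self, root: Path):
        self.paths: dict[str, Path] = {}
        self.index: dict[str, str] = {}      # what the agent sees at startup
        for skill_md in sorted(root.glob("*/SKILL.md")):
            # Crude parse for the sketch: grab "name: ..." and "description: ..." lines.
            fields = dict(
                line.split(": ", 1)
                for line in skill_md.read_text().splitlines()
                if ": " in line
            )
            self.index[fields["name"]] = fields["description"]
            self.paths[fields["name"]] = skill_md

    def invoke(self, name: str) -> str:
        """Load the full SKILL.md body only when the skill is actually used."""
        return self.paths[name].read_text()

# Demo with a throwaway skill on disk:
root = Path(tempfile.mkdtemp())
(root / "digest").mkdir()
(root / "digest" / "SKILL.md").write_text(
    "name: digest\ndescription: Weekly PR summary\n\nStep 1: run gh pr list ..."
)
reg = SkillRegistry(root)
print(reg.index)                  # tiny metadata map, loaded up front
print(reg.invoke("digest")[:20])  # full body, loaded only now
```

A hundred entries in `index` cost about what a handful of MCP schemas cost, which is the whole efficiency argument.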
Compare this to MCP, where tool schemas load into every message. A GitHub MCP server costs ~44,000 tokens before the agent runs a single tool call. Skills with the same coverage would cost ~100 tokens until actually invoked.
Skills are also portable. Anthropic released the SKILL.md format as an open standard in late 2025. Since then, it has been adopted by Claude Code, Cursor, Gemini CLI, Codex CLI, GitHub Copilot, Windsurf, and over a dozen other agents. Vercel launched skills.sh, the first package manager for agent skills, which hit 20,000+ installs within weeks.
The key insight: Skills can abstract over both CLI and MCP. The agent invokes a Skill, and the Skill routes to gh commands for local Git operations or MCP for Slack notifications. The agent does not need to know the transport layer. It just follows the recipe.
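That abstraction can be as small as a per-step dispatch. A sketch with hypothetical step shapes (the returned strings stand in for real execution, i.e. shelling out or sending an MCP tools/call request):

```python
def run_step(step: dict) -> str:
    """Route one skill step to its transport; a real agent would execute it."""
    if step["via"] == "cli":
        return f"shell: {step['cmd']}"
    if step["via"] == "mcp":
        return f"mcp tools/call: {step['tool']}"
    raise ValueError(f"unknown transport {step['via']!r}")

# A recipe mixing both transports, invisible to the agent following it:
recipe = [
    {"via": "cli", "cmd": "gh pr create --fill"},
    {"via": "mcp", "tool": "slack_post_message"},
]
for step in recipe:
    print(run_step(step))
```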
Case Study: Google Workspace CLI
The Google Workspace CLI (gws), released in March 2026, is a useful real-world example of how teams are choosing between these layers, and why Skills are increasingly winning.
As a CLI, gws gives agents shell access to every Google Workspace API: Drive, Gmail, Calendar, Sheets, Docs, Chat, and Admin. Every response comes back as structured JSON. It reads Google's Discovery Service at runtime, so new API endpoints appear automatically without updating the CLI.
As a Skills library, the repo ships 90+ Agent Skills with curated recipes for Gmail triage, meeting prep, weekly digests, and more. These skills do not just expose the raw API. They encode the right sequence of steps for common workflows.
What about MCP? gws originally shipped a built-in MCP server, but it was removed in v0.8.0 because exposing 200 to 400 dynamic tools blew up the context window of every MCP client that connected to it. Skills replaced it.
That tradeoff is the whole point. MCP is great for reach, but at this scale of API surface, the schema overhead becomes the bottleneck. Skills load on demand, so a hundred of them cost roughly what a handful of MCP tools cost, and the CLI underneath handles execution.
When to Use What
If you only remember three things:
- Use CLI when the agent is doing work on a developer's own machine, especially with tools developers already use (Git, AWS, Docker).
- Use MCP when the agent needs to act for many different users, or when it has to connect to apps like Slack, Notion, or Salesforce.
- Use Skills when you want to teach the agent how your team does something specific (write a QBR, triage a ticket, format a report).
Pick the one that fits the job. The table below shows which option suits each situation.
| Your situation | Best fit | Why it's the right call |
|---|---|---|
| The agent is slow or eating up the context window | Skills or CLI | MCP loads the schema (the description) for every connected tool upfront, even ones the agent never uses. Skills only load on demand (~100 tokens until invoked). CLI carries zero schema cost. |
| You're running thousands of tasks and the bill is hurting | CLI | The same task can cost 10 to 32x fewer tokens via CLI than through MCP |
| Tool calls keep failing or timing out in production | CLI | CLI runs locally, so there's no network hop to fail. Scalekit measured 100% reliability for CLI vs 72% for MCP, with most MCP failures being TCP timeouts. |
| Your customers (or many employees) each need their own login | MCP | Built for multi-tenant access with OAuth, scoped permissions, and audit logs. CLI assumes one developer on one machine with shared credentials. |
| Connecting to apps like Slack, Notion, or Salesforce | MCP | These SaaS services don't ship CLIs, so MCP is the standard way in |
| Encoding domain knowledge that's specific to your workflow | Skills | A Skill captures the whole recipe (steps, examples, definition of done), not just the API endpoints |
| Working with tools whose APIs change every few weeks | MCP | The schema is the source of truth and updates centrally. A written-down Skill goes stale and needs manual upkeep. |
| Giving the agent access to developer tools like Git or AWS | CLI | The binary is already in PATH. The agent runs it directly, with zero integration code. |
| You want the lowest-friction setup | Skills | A Skill is a markdown file in a folder. No server to host, no authentication flow to wire up, no software to install. |
| Tasks that must execute the same way every time | MCP | Fixed schemas and deterministic API calls. Skills depend on the LLM interpreting natural-language instructions, which adds a misinterpretation failure mode. |
| Meeting enterprise authentication and audit requirements | MCP | Centralized policy, tenant isolation, and audit trails are built in. CLI's ambient credentials (whatever's saved on the machine) typically won't pass review. Note that MCP brings its own security risks on the application layer (tool poisoning, prompt injection) that need separate controls. |
The Bottom Line
There's no single winner. Each option solves a different problem, and the smart move is matching it to the constraint you actually face.
Reach for CLI when you want cheap, fast, reliable execution and the binary already exists. Reach for MCP when the agent has to act for many users, connect to SaaS apps without CLIs, or meet enterprise authentication and audit requirements. Reach for Skills when you want to encode how your team works without bloating the context window.
The lesson is not "always layer all three." It is "pick the layer the constraint demands, and do not pay for layers you do not need."
Enjoyed this post?
If this brought you value, consider buying me a coffee. It helps me keep writing.