Andrej Karpathy coined the term “vibe coding” about a year ago, and it stuck. Merriam-Webster picked it up within a month. Gartner now forecasts that 60% of new code will be AI-generated by the end of 2026. Google and Microsoft report that 30% of their new code already is.
I’m a believer. I use Claude Code every single day. I built a /implement command that enforces my TDD workflow and handles GitHub issues, PRDs, and feature requests. The time savings compound across every feature I build. After getting burned by overhyped influencer workflows that were more parlor trick than pragmatism, I went back to basics and implemented Anthropic’s documented best practices. The results have been real and measurable.
Vibe coding is a genuine productivity gift. I’m shipping faster, the code quality has improved, and I’m delivering more value in less time.
But the security story? The security story is a disaster.
The Current State of AI Agent Security
Let’s be honest about where we are. OpenClaw has been called a security “dumpster fire.” Agent frameworks are running arbitrary code with elevated privileges. MCP servers are being deployed without authentication. AI agents are being given access to production databases, source code repositories, and deployment pipelines with minimal access controls.
The industry is building incredible creative tools on a foundation of terrifying security assumptions.
The fundamental problem is that AI agents need capabilities to be useful, and capabilities create attack surface. An agent that can read your codebase can exfiltrate your source code. An agent that can execute shell commands can do anything your user account can do. An agent that can interact with external services through MCP can be tricked by prompt injection into calling tools it shouldn’t.
These aren’t theoretical risks. They’re documented vulnerabilities in production systems today.
How We Got Here
The people building AI agent frameworks optimized for capability, not security. That’s understandable. In the early days of any technology, you optimize for proving that it works. Security comes later.
But “later” has arrived, and the security work hasn’t kept pace.
I’ve watched this exact pattern play out in IoT. When I was building Xively, the IoT industry was going through the same phase: everyone racing to connect devices to the internet, nobody thinking about what happens when those devices get compromised. It took a series of spectacular botnet attacks (Mirai, anyone?) before the industry got serious about device security. We had to learn the hard way that “connected” and “secure” are very different properties.
AI agents are at the same inflection point. The “connect everything, secure it later” approach has created a landscape where agents with production access are running on frameworks with demo-grade security.
What a Security-First Agent Framework Looks Like
The fix here isn’t incremental. The industry needs agent frameworks that are designed around security from the ground up, not frameworks that bolt security on as a feature.
The core design principle should be simple: an agent should have the minimum capabilities necessary to complete its task, and every capability should be explicitly granted, monitored, and revocable.
In practice, this means several things.
Sandboxed execution. Agent code should run in lightweight, disposable execution environments (Firecracker microVMs are a promising path here) where the agent has no access to the host system, no persistent state, and no ability to affect other agents. If the agent is compromised, the blast radius is contained.
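To make the idea concrete, here is a minimal sketch of the weakest version of this property: running agent-generated commands in a throwaway working directory with a stripped environment and resource limits. Everything here is illustrative; real isolation needs a microVM or container boundary, not process rlimits.

```python
# Illustrative only: process-level containment as a stand-in for the
# microVM isolation described above. Names are hypothetical.
import resource
import subprocess
import tempfile

def run_sandboxed(argv, timeout=5):
    """Run a command with no inherited environment, no access to the
    caller's working tree, and hard CPU/memory caps."""
    def limit():
        # Cap CPU seconds and address space in the child process.
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))
        resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20, 512 * 2**20))
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            argv,
            cwd=scratch,        # throwaway working dir: no persistent state
            env={},             # no secrets leak in via the environment
            preexec_fn=limit,
            capture_output=True,
            timeout=timeout,
            text=True,
        )
```

The point of the sketch is the shape of the API: the sandbox is the default path for execution, not an opt-in wrapper.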
Explicit capability grants. Every tool an agent can use must be explicitly granted. No default access to anything. If an agent needs to read a file, it needs a specific permission to read that specific file (or class of files). More work upfront, but it eliminates the entire category of “the agent accidentally accessed something it shouldn’t have.”
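A default-deny grant check is simple to express. This is a hypothetical sketch (the Grant and CapabilityError names are illustrative, not from any real framework): every access raises unless an explicit grant for that action and that file class exists.

```python
# Hypothetical default-deny capability model. All names are illustrative.
import fnmatch
from dataclasses import dataclass

class CapabilityError(PermissionError):
    pass

@dataclass(frozen=True)
class Grant:
    action: str    # e.g. "read", "write", "exec"
    pattern: str   # a specific file or class of files, e.g. "src/*.py"

@dataclass(frozen=True)
class CapabilitySet:
    grants: frozenset

    def check(self, action, target):
        """Default deny: raise unless an explicit grant matches."""
        for g in self.grants:
            if g.action == action and fnmatch.fnmatch(target, g.pattern):
                return
        raise CapabilityError(f"no grant for {action} {target!r}")
```

Note the asymmetry: adding a grant is deliberate and visible in code review; forgetting one fails closed instead of open.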
Skill-based architecture. Agent capabilities should be defined in versioned, auditable skill packages. Skills encode what an agent can do and what it can’t do. The constraints are as important as the capabilities.
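One way to make skills auditable is to give each versioned skill definition a stable fingerprint, so an audit log can record exactly which definition an agent was running. A hypothetical sketch, with illustrative field names:

```python
# Hypothetical versioned skill package. Constraints (denied_paths) ship
# alongside capabilities (allowed_tools) in the same auditable unit.
import hashlib
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    name: str
    version: str
    allowed_tools: tuple   # what the agent can do
    denied_paths: tuple    # what it can't touch

    def fingerprint(self):
        """Stable content hash, so a reviewer can verify which skill
        definition produced a given agent action."""
        blob = json.dumps(self.__dict__, sort_keys=True)
        return hashlib.sha256(blob.encode()).hexdigest()[:16]
```

Because any change to the capabilities or the constraints changes the fingerprint, a skill can't be quietly widened after review.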
Scoped delegation. When a task requires multiple agents, a parent agent should be able to spawn sub-agents with further-restricted permissions. The sub-agent can never have more capabilities than its parent, and every delegation is logged.
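The delegation rule is a one-line invariant: a child's capability set must be a subset of its parent's. A minimal sketch, assuming capabilities are represented as simple string tokens:

```python
# Hypothetical sketch of monotonic attenuation: a child agent can only
# narrow, never widen, its parent's capabilities. Names are illustrative.
import logging

log = logging.getLogger("delegation")

class Agent:
    def __init__(self, name, caps):
        self.name = name
        self.caps = frozenset(caps)

    def spawn(self, name, caps):
        caps = frozenset(caps)
        if not caps <= self.caps:   # subset check enforces attenuation
            raise PermissionError(
                f"{name} requested {sorted(caps - self.caps)} beyond parent scope")
        log.info("delegate %s -> %s: %s", self.name, name, sorted(caps))
        return Agent(name, caps)
```

Because spawn is the only way to create a sub-agent, the subset check and the log line cannot be skipped.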
What the Industry Needs
These principles need to pervade the entire AI agent ecosystem, not live in a single framework.
MCP servers need authentication by default. It’s alarming how many MCP servers are deployed with no auth. Every MCP server should require authentication tied to the agent’s identity and permission scope.
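Tying authentication to the agent's identity and scope can be as simple as an HMAC token check in front of the tool dispatcher. This is a generic sketch, not the MCP specification's auth mechanism; every name here is illustrative:

```python
# Hypothetical auth gate in front of an MCP-style tool dispatcher.
import hashlib
import hmac

SERVER_SECRET = b"rotate-me"  # illustrative; load from a secret store in practice

def issue_token(agent_id: str) -> str:
    return hmac.new(SERVER_SECRET, agent_id.encode(), hashlib.sha256).hexdigest()

def dispatch(agent_id, token, tool, args, scopes):
    """Refuse any call that is unauthenticated or out of scope."""
    expected = issue_token(agent_id)
    if not hmac.compare_digest(expected, token):
        raise PermissionError("unauthenticated agent")
    if tool not in scopes.get(agent_id, set()):
        raise PermissionError(f"{agent_id} not scoped for {tool}")
    return ("ok", tool, args)   # placeholder for the real tool invocation
```

The key property is that the identity check and the scope check happen in the same place, before any tool code runs.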
Agent frameworks need capability restrictions. The default should be no access, with explicit grants for each capability. This is the opposite of how most frameworks work today, where the default is full access and restrictions are an afterthought.
Prompt injection needs to be treated as a first-class security concern. The content an agent processes can influence its behavior. This is a fundamental vulnerability, and we need defense-in-depth strategies that assume prompt injection will occur and limit the damage it can cause.
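One concrete defense-in-depth layer: whatever the model "decides" after reading untrusted content, its tool calls are filtered against a reduced allowlist. A hypothetical sketch with illustrative tool names:

```python
# Hypothetical sketch: assume injection WILL happen, and shrink what a
# hijacked model can reach while it is processing untrusted input.
SAFE_WHILE_UNTRUSTED = {"search", "summarize"}   # low-consequence tools
ALL_TOOLS = SAFE_WHILE_UNTRUSTED | {"send_email", "run_shell", "write_file"}

def filter_tool_call(tool: str, processing_untrusted_input: bool) -> bool:
    """Return True if the call may proceed. The filter runs outside the
    model, so injected instructions cannot talk their way past it."""
    allowed = SAFE_WHILE_UNTRUSTED if processing_untrusted_input else ALL_TOOLS
    return tool in allowed
```

This doesn't prevent injection; it bounds the damage, which is the realistic goal.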
Audit trails need to be comprehensive. Every tool call, every data access, every external interaction should be logged with enough context to reconstruct the agent’s decision chain.
The Path Forward
I love vibe coding. It has made me genuinely more productive and more creative. I believe AI-assisted development is a permanent shift in how software gets built.
But we can’t build the future of software development on a security foundation this fragile. The people who take security as seriously as capability will define the next wave of AI infrastructure. The people who don’t will be the cautionary tales.
Vibe code your heart out. Just don’t vibe your security.