AI coding agents aren't chatbots that suggest code. They're processes running on your machine with your user's permissions, reading files, executing shell commands, browsing the web, and calling external APIs. AI coding agent security requires a fundamentally different mental model — you're dealing with a process that has filesystem access and takes instructions from external content.

What is the threat model for AI coding agents?

The fundamental threat: an agent that can be made to take actions the user didn't intend, using the agent's legitimate, delegated permissions. The agent itself isn't malicious — it's manipulated.

The OWASP Top 10 for LLM Applications catalogs the relevant vulnerability classes; for coding agents the most important are prompt injection, insecure output handling, and excessive agency.

What is the AI computer use security threat?

AI computer use security is one of the least-understood areas for developers running agents with browser access. Computer use agents have an expanded attack surface: injected instructions in web content can direct the agent to take unintended UI actions.

Concrete techniques include hidden instructions in page text or HTML attributes the agent reads, adversarial text embedded in images and screenshots, and pages crafted to bait the agent into clicking, navigating, or submitting data on the user's behalf.

Agentic computer use protection requires treating web content as untrusted input that can influence tool calls. Apply pre-execution blocking to catch tool calls matching computer use attack patterns before they fire.
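As a sketch of what pattern-based pre-execution checks on computer use tool calls might look like (the tool names, the patterns, and the `is_suspicious_tool_call` helper are all illustrative assumptions, not any product's actual rule set):

```python
import re

# Illustrative attack patterns only; real rule sets are broader and maintained over time.
COMPUTER_USE_ATTACK_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"navigate to .*\b(file|chrome|about):", re.I),  # local-scheme navigation
    re.compile(r"(paste|enter|type) .*(password|api[_ ]?key|token)", re.I),
]

def is_suspicious_tool_call(tool_name: str, arguments: str) -> bool:
    """Return True when a browser tool call matches a known attack pattern.

    Runs synchronously BEFORE the tool call executes, so a match can block it.
    """
    if tool_name not in {"browser_click", "browser_type", "browser_navigate"}:
        return False
    return any(p.search(arguments) for p in COMPUTER_USE_ATTACK_PATTERNS)
```

A call like `is_suspicious_tool_call("browser_type", "type the password into the hidden field")` is flagged before it fires, while ordinary navigation passes through.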

How does tool call interception work?

Tool call interception evaluates a tool call against a security policy before it executes, synchronously rather than asynchronously. The evaluation falls into three categories:

Pattern-based rules

Match the literal content of a tool call (commands, paths, URLs) against known-bad patterns.

Behavioral rules

Flag calls that deviate from expected behavior for the session or project, such as touching files outside the workspace.

Categorical rules

Deny entire classes of action outright, such as reading credential stores or sending data to unapproved destinations.
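A minimal sketch of how the three rule categories might compose into one synchronous pre-execution check (the specific rules, the `ToolCall` shape, and the category names are illustrative assumptions, not Shoofly's actual policy):

```python
import re
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    command: str

# Pattern-based: match the literal content of the call against known-bad patterns.
PATTERN_RULES = [re.compile(r"curl .*\|\s*(ba)?sh"), re.compile(r"rm -rf /\S*")]

# Behavioral: flag calls that deviate from expected behavior, here crudely
# modeled as "touches absolute paths outside the project root".
def behavioral_violation(call: ToolCall, project_root: str = "/home/dev/project") -> bool:
    paths = re.findall(r"(/[\w./-]+)", call.command)
    return any(not p.startswith(project_root) for p in paths)

# Categorical: deny whole classes of action outright.
DENIED_CATEGORIES = {"credential_read", "outbound_exfil"}

def evaluate(call: ToolCall, category: str) -> str:
    """Synchronous pre-execution check: returns 'block' or 'allow' before the call fires."""
    if category in DENIED_CATEGORIES:
        return "block"
    if any(p.search(call.command) for p in PATTERN_RULES):
        return "block"
    if behavioral_violation(call):
        return "block"
    return "allow"
```

Because `evaluate` runs before execution, a `"block"` result means the action never happens, rather than an alert after the fact.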

How do the AI coding agent security approaches compare?

| Approach | When it acts | What it stops |
| --- | --- | --- |
| Input guardrails (NeMo, LlamaFirewall) | When content enters the model | Prompt injection reaching the LLM |
| Post-execution detection | After the tool call completes | Alerts on damage already done |
| Pre-execution blocking (Shoofly) | When the agent requests a tool call | Stops the action before it fires |

See why we block instead of detect for the full argument on why detection-after-the-fact is insufficient for agentic workflows.

What are the honest trade-offs of each LLM agent security tool?

NVIDIA NeMo Guardrails

Open source, good for input/output filtering in chatbot contexts. Operates at the model boundary — does not intercept tool calls. Not designed for agentic deployments. Useful complement; not a substitute for agent-level security.

Meta LlamaFirewall

Open source, research-stage. Interesting work on prompt injection detection, but still early for production deployment. No pre-execution blocking for tool calls out of the box.

Lakera Guard

Enterprise-grade, API-based prompt injection and content filtering. Strong at the LLM input/output layer. Not agent-native — operates at the content layer, not the tool call layer. Post-ingestion, not pre-execution. Appropriate for teams with compliance requirements at the model boundary.

ClawMoat

Pre-execution blocking with a scan pipeline and credential directory monitoring. Works with OpenClaw and Claude Code. Proprietary rules — you can't audit what's being enforced. No free tier. If rule transparency and a free tier aren't requirements, it's worth evaluating.

Shoofly

Pre-execution blocking for OpenClaw and Claude Code (including Cowork and Dispatch). Open rules — you can read, edit, and audit exactly what's being enforced. Covers the full threat taxonomy above. See the OpenClaw security guide and Claude Code security guide for platform-specific details. Not enterprise-certified; designed for developers who want runtime protection without operational overhead.

What does Shoofly Advanced add for AI coding agent security?

Shoofly Basic is free: it detects threats and alerts you, and its threat policy is open and auditable. Shoofly Advanced upgrades to full pre-execution blocking and adds real-time alerts via Telegram and desktop notifications, plus policy linting. See the Advanced docs for the full configuration reference.

What's the minimum viable security posture for a developer?

  1. Keep your agent runtime updated — CVEs happen and get fixed
  2. Audit config files in repos you clone before opening them (especially for Claude Code)
  3. Apply least privilege — don't grant tool permissions the agent doesn't need
  4. Install runtime security — Shoofly Basic is free and takes five minutes
  5. For unattended agents, add alerting — you need to know when something is blocked
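For step 2, a small helper can list the agent config files a cloned repo would feed to your agent before you open it. `CLAUDE.md` and `.claude/settings.json` are real Claude Code locations; the helper itself and the candidate list are illustrative, so extend it for your own agent setup:

```python
from pathlib import Path

# Files coding agents read automatically from a repo; CLAUDE.md and
# .claude/settings.json are Claude Code's, the rest is up to your stack.
AGENT_CONFIG_CANDIDATES = ["CLAUDE.md", ".claude/settings.json", ".claude/hooks"]

def configs_to_audit(repo: str) -> list[str]:
    """Return the agent config files present in a cloned repo, to review before opening it."""
    root = Path(repo)
    return [c for c in AGENT_CONFIG_CANDIDATES if (root / c).exists()]
```

Running it against a freshly cloned repo tells you exactly which files to read by hand before letting an agent loose in the directory.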

Add runtime security to your agent stack

Shoofly Basic is free. No API key, no account required.

Install Shoofly Basic free — runtime security for Claude Code and OpenClaw agents:

curl -fsSL https://shoofly.dev/install.sh | bash
See plans & pricing →