Perplexity has published a comprehensive security framework for AI agents in response to NIST's Request for Information 2025-0035, drawing on the company's experience operating agentic systems used by millions of users and thousands of enterprises. The paper, authored by Ninghui Li, Kaiyuan Zhang, Kyle Polley, and Jerry Ma, identifies critical vulnerabilities in agent architectures and proposes five defense layers to address them.
Agent Architectures Create New Security Challenges
The framework highlights how agent architectures fundamentally change core assumptions around code-data separation, authority boundaries, and execution predictability. These changes create new confidentiality, integrity, and availability failure modes that traditional security models don't adequately address.
The paper maps three principal attack surfaces:
- Tools and connectors that agents interact with
- Hosting boundaries where agents execute
- Multi-agent coordination vulnerabilities
Critical Vulnerabilities Identified
Perplexity's research identifies four major vulnerability categories affecting production AI agents:
- Indirect prompt injection: Attackers exploit agent tool interactions to manipulate behavior without directly accessing the agent
- Confused-deputy problems: Agents misuse granted permissions, causing unintended actions that violate expected authority boundaries
- Cascading failures: Errors in long-running workflows propagate across systems, creating compound security risks
- Compromised code-data separation: Agents blur traditional boundaries between executable code and data, making it difficult to apply conventional security controls
Five-Layer Defense Strategy Proposed
The framework recommends a multi-layered defense approach:
- Input & model defenses: Filter malicious prompts and improve model robustness against adversarial inputs
- Sandboxed execution: Isolate agent actions from critical systems to limit blast radius
- Deterministic policies: Enforce hard rules for high-consequence operations that override agent decision-making
- Standards development: Create adaptive security benchmarks and multi-agent design guidance
- Privilege control frameworks: Establish formal models for safe delegation and authority boundaries
Standards Gaps Require Industry Collaboration
The paper identifies critical gaps in current security standards for AI agents. Perplexity calls for adaptive security benchmarks, policy models for delegation and privilege control, and guidance for secure multi-agent system design aligned with NIST risk management principles. The framework is designed to inform future standards development as agent deployments scale across enterprise and government environments.
Key Takeaways
- Perplexity published a security framework for AI agents based on production experience with millions of users and thousands of enterprises
- Agent architectures create four critical vulnerability categories: indirect prompt injection, confused-deputy problems, cascading failures, and compromised code-data separation
- The framework proposes five defense layers including sandboxed execution, deterministic policies, and privilege control frameworks
- Three principal attack surfaces were mapped: tools/connectors, hosting boundaries, and multi-agent coordination
- The paper identifies gaps in current standards and calls for adaptive security benchmarks aligned with NIST risk management principles