Portkey AI Agent Gateway
Govern every agent action – not just the output
Overview:
Portkey AI Agent Gateway provides full infrastructure-layer support for agentic AI workflows. As AI agents take multi-step actions with real-world consequences, Portkey delivers the visibility, guardrails, and governance to run them safely and reliably at production scale.
At the core of the Portkey Agent Gateway is a unified AI gateway that sits between your agents and every LLM, tool, and MCP server they interact with. The gateway natively traces all traffic – including tool calls, sub-agent invocations, and LLM completions – and ties that activity to a specific agent identity, regardless of the framework or orchestration layer used.
- Full multi-step agent tracing with end-to-end timeline visibility.
- Tool call logging with inputs, outputs, latency, and error detail – no manual instrumentation required.
- Per-agent virtual keys with scoped model and tool access.
- Model Context Protocol (MCP) governance – trace, guard, and control every MCP server call.
- Input and output guardrails enforced inline, stopping PII leakage and policy violations mid-flight.
- Cost and token tracking broken down by agent identity with budget alert thresholds.
- Compatible with all major agent frameworks: LangChain, LlamaIndex, CrewAI, AutoGen, and more.
- Works with 200+ models across LLM providers including OpenAI, Anthropic, Google, and Mistral.
Multi-Step Agent Tracing
Trace every step of an agentic workflow – tool calls, LLM calls, and sub-agent invocations – in a unified timeline view linked to cost and latency. Portkey captures the full execution graph of your agent runs so you can debug failures, measure performance, and audit behaviour end-to-end.
- Full step-by-step execution timeline.
- Tool and LLM call traces with linked parent context.
- Sub-agent and nested workflow tracing.
- Cost and token usage linked to each trace step.
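
As an illustration of the execution graph described above, the following is a minimal sketch of an agent run modelled as a tree of steps, with cost rolled up to each parent. The `Span` class, its fields, and the sample run are hypothetical and are not Portkey's trace schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """One step in an agent run (hypothetical shape, not Portkey's schema)."""
    name: str
    kind: str  # "llm", "tool", or "agent"
    cost_usd: float = 0.0
    latency_ms: int = 0
    children: List["Span"] = field(default_factory=list)

    def total_cost(self) -> float:
        # Roll up this step's cost plus the cost of every nested step.
        return self.cost_usd + sum(c.total_cost() for c in self.children)

    def timeline(self, depth: int = 0) -> List[str]:
        # Flatten the tree into an indented, depth-first timeline.
        lines = [f"{'  ' * depth}{self.kind}:{self.name}"]
        for child in self.children:
            lines.extend(child.timeline(depth + 1))
        return lines


# Made-up example run: an agent that plans, calls a tool, and delegates.
run = Span("research-agent", "agent", children=[
    Span("plan", "llm", cost_usd=0.002, latency_ms=800),
    Span("web_search", "tool", latency_ms=350),
    Span("summarizer", "agent", children=[
        Span("summarize", "llm", cost_usd=0.004, latency_ms=1200),
    ]),
])

print("\n".join(run.timeline()))
print(f"total cost: ${run.total_cost():.3f}")
```

Keeping the parent link implicit in the tree structure is what allows cost and latency to be attributed both to an individual step and to the run as a whole.
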
Tool Call Logging
Log every tool invocation with server name, tool name, inputs, outputs, latency, and error detail. Portkey captures tool call data automatically at the gateway layer, requiring no changes to your agent code or manual instrumentation.
- Input and output payload logging per tool call.
- Latency measurement per tool invocation.
- Error and retry tracking with full context.
- Every tool call linked to its parent agent trace.
Agent Access Control
Issue scoped virtual keys to agents and define exactly which models, tools, and APIs each agent identity is permitted to access. Revoke or rotate access instantly without redeploying your agent code.
- Per-agent virtual key issuance.
- Model and tool scoping per identity.
- Instant revocation and key rotation.
- Full usage audit log per agent identity.
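
The scoping idea can be sketched as a simple allow-list check per agent identity. This is an illustrative model only: `AgentKey`, `is_allowed`, and the model and tool names are hypothetical and do not represent Portkey's virtual key format or API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class AgentKey:
    """Hypothetical scoped credential for one agent identity."""
    agent_id: str
    allowed_models: FrozenSet[str]
    allowed_tools: FrozenSet[str]
    revoked: bool = False


def is_allowed(key: AgentKey,
               model: Optional[str] = None,
               tool: Optional[str] = None) -> bool:
    # A revoked key is denied everything, regardless of its scopes.
    if key.revoked:
        return False
    if model is not None and model not in key.allowed_models:
        return False
    if tool is not None and tool not in key.allowed_tools:
        return False
    return True


# Hypothetical identity: may use one model and one tool only.
support_bot = AgentKey(
    agent_id="support-bot",
    allowed_models=frozenset({"model-small"}),
    allowed_tools=frozenset({"search_tickets"}),
)
```

Because the check runs outside the agent process, changing or revoking a scope does not require touching the agent's own code.
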
Model Context Protocol (MCP) Support
Portkey provides full Model Context Protocol support. Every MCP server call your agents make passes through the gateway, where tracing, guardrails, and access controls are applied automatically without changes to your MCP server or client configuration.
- All MCP-compliant servers supported (SSE and stdio transports).
- Per-call tracing and structured logging.
- Guardrails applied on MCP inputs and outputs.
- Per-server access control and policy enforcement.
Agent Guardrails
Apply input and output guardrails to agentic workflows to stop PII leakage, off-topic actions, and policy violations before they cause harm. Guardrails are enforced in real time at the gateway layer – not as a post-processing step.
- PII detection and redaction on agent inputs and outputs.
- Policy violation blocking with configurable rules.
- Off-topic and harmful content filtering.
- Real-time enforcement with no added round-trip latency.
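
To illustrate what an inline redaction guardrail does, here is a minimal regex-based sketch. The patterns and the `redact` function are simplified assumptions for demonstration; they are not Portkey's detectors, and production PII detection needs far more robust matching.

```python
import re

# Deliberately simple patterns, for illustration only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a label and report which labels fired."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


clean, hits = redact("Mail jane@example.com, SSN 123-45-6789")
print(clean)  # Mail [EMAIL], SSN [SSN]
print(hits)   # ['EMAIL', 'SSN']
```

Returning the list of triggered labels alongside the redacted text lets the caller decide whether to pass the sanitised payload on or block the request outright.
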
Cost and Latency Tracking Per Agent Identity
Track LLM spend, token usage, and latency broken down by agent ID. Identify runaway agents before they exceed budgets, and set per-agent thresholds with automated alerting.
- Cost and token usage breakdown per agent ID.
- Latency percentiles per agent run.
- Budget alert thresholds per identity.
- Historical trend analysis and exportable reports.
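
Per-agent budget thresholds can be pictured as a running spend total compared against a soft (alert) limit and a hard (block) limit, as listed under Budget Controls in Table 3. The `BudgetTracker` class below is a hypothetical sketch of that logic, not Portkey's implementation, and the dollar amounts are made up.

```python
from collections import defaultdict


class BudgetTracker:
    """Tracks cumulative spend per agent ID against two thresholds."""

    def __init__(self, soft_limit_usd: float, hard_limit_usd: float):
        if soft_limit_usd > hard_limit_usd:
            raise ValueError("soft limit must not exceed hard limit")
        self.soft = soft_limit_usd
        self.hard = hard_limit_usd
        self.spend = defaultdict(float)

    def record(self, agent_id: str, cost_usd: float) -> str:
        # Add the cost, then classify the agent's new running total.
        self.spend[agent_id] += cost_usd
        total = self.spend[agent_id]
        if total >= self.hard:
            return "block"
        if total >= self.soft:
            return "alert"
        return "ok"


tracker = BudgetTracker(soft_limit_usd=1.0, hard_limit_usd=2.0)
```

Each agent ID accumulates spend independently, so one runaway agent reaching its hard limit does not affect the others.
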
Infrastructure-Layer Governance
Prompt engineering alone cannot stop an agent from leaking PII, exceeding its budget, or taking unintended actions. Portkey enforces governance at the infrastructure layer – on every action, every time – independent of the agent framework or LLM provider in use.
Semantic Caching
Reduce redundant LLM calls across agent runs with semantic caching at the gateway. Cache responses based on semantic similarity to avoid re-querying the model for equivalent inputs, lowering cost and latency in long-running agentic workflows.
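
The mechanism can be sketched as a lookup by vector similarity rather than exact string match. In the example below, `embed` is a toy bag-of-words stand-in for a real embedding model, and `SemanticCache` with its `threshold` parameter is a hypothetical illustration rather than Portkey's cache.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: word counts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response))

    def get(self, prompt):
        # Return the most similar cached response if it clears the threshold.
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None


cache = SemanticCache()
cache.put("what is the capital of france", "Paris")
```

The threshold is the main design trade-off: set too low, the cache returns answers to questions that only look alike; set too high, it behaves like an exact-match cache.
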
Fallback and Retry Policies
Define automatic fallback routing and retry policies at the gateway level. If a primary LLM provider returns an error or exceeds latency thresholds, Portkey automatically routes to a configured fallback provider or model without interrupting the agent run.
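
The routing behaviour described here can be sketched as a loop over an ordered list of providers, retrying each a fixed number of times before moving on. The function and the two stub providers are hypothetical; the sketch covers error-triggered fallback only, not latency thresholds, and it is not Portkey's configuration format.

```python
from typing import Callable, Sequence


def call_with_fallback(providers: Sequence[Callable[[str], str]],
                       prompt: str,
                       attempts_per_provider: int = 2) -> str:
    """Try each provider in order; retry on error, then fall through."""
    last_error = None
    for provider in providers:
        for _ in range(attempts_per_provider):
            try:
                return provider(prompt)
            except Exception as exc:  # Any provider error triggers retry or fallback.
                last_error = exc
    raise RuntimeError("all providers failed") from last_error


# Stub providers standing in for real LLM clients.
attempts = {"primary": 0}


def primary(prompt: str) -> str:
    attempts["primary"] += 1
    raise TimeoutError("primary unavailable")


def backup(prompt: str) -> str:
    return f"backup: {prompt}"


result = call_with_fallback([primary, backup], "hi")
print(result)  # backup: hi
```

From the agent's point of view the call simply succeeds; the failed attempts against the primary are visible only to whatever component performs the routing.
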
Load Balancing Across Providers
Distribute agent LLM traffic across multiple providers and model instances to maximise throughput and avoid rate-limit bottlenecks. Load balancing is configured in the gateway and requires no changes to agent code.
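
One common way to distribute traffic is weighted random selection, sketched below. The target names and weights are made up, and `pick_target` is a hypothetical helper; the source does not state which balancing algorithm Portkey uses.

```python
import random
from collections import Counter


def pick_target(weights, rng):
    """Choose one target name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical split: roughly 70% of traffic to provider-a, 30% to provider-b.
weights = {"provider-a": 0.7, "provider-b": 0.3}
rng = random.Random(0)  # Seeded so the demonstration is repeatable.
counts = Counter(pick_target(weights, rng) for _ in range(10_000))
print(counts)
```

Over many requests the observed split converges on the configured weights, while any single request can land on either target.
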
Compliance and Audit Logging
All agent interactions are logged with full fidelity for compliance and audit purposes. Logs include agent identity, timestamps, request and response payloads, tool calls, guardrail decisions, and cost attribution – exportable to your SIEM or data warehouse.
Portkey Agent Gateway Specifications:

Table 1. Agent Gateway Performance and Capacities

| Specification | Cloud (Managed) | Self-Hosted (Enterprise) |
|---|---|---|
| Request throughput | Up to 10,000 req/min | Unlimited (hardware-dependent) |
| Supported LLM providers | 200+ models across OpenAI, Anthropic, Google, Mistral, Cohere, Azure, AWS Bedrock, and more | Same as Cloud (Managed) |
| Agent framework support | LangChain, LlamaIndex, CrewAI, AutoGen, Vercel AI SDK, and all OpenAI-compatible frameworks | Same as Cloud (Managed) |
| MCP server support | All MCP-compliant servers (SSE and stdio transports) | Same as Cloud (Managed) |
| Trace retention | 30 days (extendable) | Configurable (your storage) |
| Log export | Webhook, S3, BigQuery, Datadog, Grafana, SIEM integrations | Same as Cloud (Managed) |
| Guardrail rule types | PII detection, topic filtering, content moderation, custom regex, webhook-based rules | Same as Cloud (Managed) |
| High availability | Multi-region, 99.99% SLA | Active/active cluster support |


Table 2. Integration and Compatibility

| Area | Details |
|---|---|
| SDKs | Python and JavaScript/TypeScript SDKs. OpenAI SDK drop-in compatibility – change one line to route through Portkey. |
| Authentication | API key authentication with virtual key scoping. SSO and SAML support on Enterprise tier. |
| Deployment Options | Managed cloud (US, EU regions), self-hosted on Kubernetes, and private cloud (VPC). Docker images available. |
| Observability Integrations | Native integrations with Datadog, Grafana, OpenTelemetry, Langfuse, and LangSmith. |
| Compliance | SOC 2 Type II, GDPR compliant. Zero data retention (ZDR) option available. |


Table 3. Guardrail and Policy Capabilities

| Capability | Details |
|---|---|
| PII Detection | Detects and redacts names, email addresses, phone numbers, credit card numbers, SSNs, and custom entity types on both inputs and outputs. |
| Content Moderation | Inline content moderation using configurable classifiers. Block or flag responses before they reach the agent or end user. |
| Custom Rules | Regex-based rules, keyword blocklists, and webhook-based external guardrail hooks for custom policy logic. |
| Budget Controls | Per-agent and per-virtual-key spend limits with soft (alert) and hard (block) thresholds. Real-time cost tracking in the dashboard. |
| Rate Limiting | Per-agent and per-key rate limiting (requests per minute and tokens per minute) enforced at the gateway layer. |

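
As an illustration of the per-agent requests-per-minute limit listed under Rate Limiting, the following is a minimal sliding-window sketch. `RateLimiter` and its injectable `clock` parameter are hypothetical; the source does not specify which rate-limiting algorithm the gateway uses, and token-per-minute limits are not modelled here.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window requests-per-minute limiter keyed by agent ID."""

    def __init__(self, limit_per_minute, clock=time.monotonic):
        self.limit = limit_per_minute
        self.clock = clock  # Injectable so the window can be tested without sleeping.
        self.hits = defaultdict(deque)

    def allow(self, agent_id):
        now = self.clock()
        window = self.hits[agent_id]
        # Drop timestamps that have aged out of the 60-second window.
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True


# Fake clock for a deterministic demonstration.
now = [0.0]
limiter = RateLimiter(limit_per_minute=2, clock=lambda: now[0])
```

Each agent ID has its own window, so one agent exhausting its allowance leaves the others unaffected.
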
Documentation:
View the Portkey AI Agent Gateway Documentation (External Link).
View the Portkey Agent Framework Integrations Guide (External Link).
View the Portkey Guardrails Configuration Reference (External Link).
