Portkey Guardrails
Govern AI output with real-time input/output filtering – PII detection, content moderation, and custom logic enforced at the gateway layer
Overview:
Portkey Guardrails is an enterprise-grade safety and governance layer that sits inside the AI Gateway and enforces input/output policies on every LLM interaction. Rather than bolting safety logic onto individual applications, Guardrails applies consistent rules at the infrastructure level – so every team, model, and workload is protected automatically.
Guardrails runs checks on both the prompt going in and the model response coming out. When a rule fires, the request is blocked or redacted before it reaches external providers or end users, and every event is logged with full context for compliance reporting and incident review.
- Real-time PII detection and redaction across 20+ entity types on prompts and responses.
- Content moderation with configurable topic restrictions, toxicity, and hate-speech filters.
- Custom regex patterns and keyword blocklists for domain-specific risk rules.
- Third-party safety integrations: Aporia, Pillar, Patronus AI, and more via a unified interface.
- Input and output scanning – protection applied before and after every model call.
- Full blocked-event log with per-rule analytics, exportable for compliance audits.
- Real-time alerts via dashboard, email, or webhook when a guardrail fires.
- Per-team and per-virtual-key rule configuration without touching application code.
PII Detection & Redaction
Automatically detect names, email addresses, credit card numbers, and other personally identifiable information in prompts and responses, then redact or block it before it reaches external providers or end users.
- 20+ PII entity types supported.
- Scanning on both prompt input and model response output.
- Auto-redaction or hard block on detection.
- Full event log per triggered rule.
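The redact-or-block behavior described above can be illustrated with a minimal Python sketch. This is not Portkey's implementation or API: the two regex patterns, the `scan` and `enforce` function names, and the placeholder format are all assumptions chosen for illustration, and a real detector would cover far more entity types with stronger validation.

```python
import re

# Hypothetical patterns for illustration only; not Portkey's detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CREDIT_CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

def scan(text):
    """Return a list of (entity_type, matched_text) pairs found in text."""
    hits = []
    for entity, pattern in PII_PATTERNS.items():
        hits.extend((entity, m.group(0)) for m in pattern.finditer(text))
    return hits

def enforce(text, action="redact"):
    """Apply 'redact' or 'block'. Returns (allowed, resulting_text, hits)."""
    hits = scan(text)
    if not hits:
        return True, text, hits
    if action == "block":
        # Hard block: nothing is forwarded to the provider or the user.
        return False, "", hits
    # Redact: replace each detected entity with a typed placeholder.
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{entity}]", text)
    return True, text, hits

allowed, cleaned, hits = enforce("Contact jane@example.com, card 4111 1111 1111 1111.")
print(allowed, cleaned)
```

The returned `hits` list is what a per-rule event log would record alongside the request context.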
Content Moderation
Apply topic restrictions, toxicity filters, and hate-speech detection at the gateway level – no model fine-tuning required. Rules apply consistently across every model and every team using the gateway.
- Topic restriction rules per virtual key or team.
- Toxicity and hate-speech detection filters.
- Configurable sensitivity thresholds per rule.
- Works across all 1,600+ supported models.
Regex & Custom Rules
Define custom regex patterns or keyword blocklists to match proprietary data formats, internal terminology, or domain-specific risk patterns that general-purpose filters may not cover.
- Custom regex pattern matching.
- Keyword blocklists per team or environment.
- Domain-specific rule sets.
- Per-team and per-virtual-key configuration.
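As a rough sketch of how a custom rule set might combine regex patterns with a keyword blocklist, consider the Python below. The `RuleSet` class, the `PRJ-` identifier format, and the blocked keyword are hypothetical examples, not Portkey configuration syntax.

```python
import re
from dataclasses import dataclass, field

@dataclass
class RuleSet:
    """Hypothetical container for one team's custom rules."""
    patterns: dict = field(default_factory=dict)  # rule name -> regex string
    blocklist: set = field(default_factory=set)   # lowercase keywords

    def violations(self, text):
        """Return the names of all rules that the text triggers."""
        found = [name for name, rx in self.patterns.items() if re.search(rx, text)]
        # Tokenize case-insensitively so 'Project-Falcon' matches 'project-falcon'.
        words = set(re.findall(r"[a-z0-9_-]+", text.lower()))
        found += [f"keyword:{kw}" for kw in sorted(self.blocklist & words)]
        return found

# Example rule set for one team (all values are made up for illustration).
rules = RuleSet(
    patterns={"internal_project_id": r"\bPRJ-\d{6}\b"},
    blocklist={"project-falcon"},
)
print(rules.violations("Status of PRJ-123456 and Project-Falcon?"))
```

Keeping one `RuleSet` per team or environment mirrors the per-team scoping described above.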
Input & Output Checks
Run guardrails on both the prompt going in and the model response coming out – protecting every interaction end to end rather than in only one direction.
- Pre-request input scanning before the call reaches the provider.
- Post-response output scanning before the result reaches the user.
- Block or redact on rule trigger.
- Full audit trail per interaction.
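The two-direction flow can be sketched as a wrapper around a model call. Everything here is an illustrative assumption: `guarded_call`, `GuardrailBlocked`, and the check functions are hypothetical names, and `call_model` stands in for whatever provider call the gateway would make.

```python
class GuardrailBlocked(Exception):
    """Raised when a check fails; records which stage and check fired."""
    def __init__(self, stage, check_name):
        super().__init__(f"{stage} blocked by {check_name}")
        self.stage = stage
        self.check_name = check_name

def guarded_call(prompt, call_model, input_checks, output_checks):
    """Run input checks, call the model, then run output checks.

    Each check is a function taking text and returning True when it passes.
    """
    for check in input_checks:
        if not check(prompt):
            # Blocked before the prompt ever reaches the provider.
            raise GuardrailBlocked("input", check.__name__)
    response = call_model(prompt)
    for check in output_checks:
        if not check(response):
            # Blocked before the response reaches the user.
            raise GuardrailBlocked("output", check.__name__)
    return response

def no_secret(text):
    """Toy check: fail any text containing the word 'secret'."""
    return "secret" not in text.lower()

def fake_model(prompt):
    """Stand-in for a real provider call."""
    return "The answer is 42"

print(guarded_call("What is 6 x 7?", fake_model, [no_secret], [no_secret]))
```

Because the input checks run first, a blocked prompt never triggers the provider call at all.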
Real-Time Alerts & Logging
Receive instant alerts when any guardrail fires. Every blocked event is logged with full context – request, rule, entity type, and timestamp – for compliance reporting and security incident review.
- Instant trigger alerts via dashboard, email, or webhook.
- Full blocked-event log with complete request context.
- Exportable logs for compliance and audit requirements.
- Per-rule analytics and hit-rate reporting in the dashboard.
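To make the logged context concrete, here is a minimal sketch of a structured event record with a JSON Lines export and a per-rule hit count. The field names (`request_id`, `rule`, `entity_type`, `action`, `timestamp`) are assumptions based on the context listed above; they are not Portkey's actual log schema.

```python
import json
from collections import Counter
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class GuardrailEvent:
    """Hypothetical record for one triggered guardrail."""
    request_id: str
    rule: str
    entity_type: str
    action: str
    timestamp: str

def make_event(request_id, rule, entity_type, action):
    """Build an event stamped with the current UTC time in ISO 8601 format."""
    now = datetime.now(timezone.utc).isoformat()
    return GuardrailEvent(request_id, rule, entity_type, action, now)

def export_jsonl(events):
    """Serialize events as JSON Lines, one event per line, for export."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in events)

def hits_per_rule(events):
    """Count how many times each rule fired."""
    return Counter(e.rule for e in events)

events = [
    make_event("req-1", "pii.email", "EMAIL", "redact"),
    make_event("req-2", "pii.email", "EMAIL", "block"),
    make_event("req-3", "regex.project_id", "CUSTOM", "block"),
]
print(hits_per_rule(events).most_common())
```

The same serialized record could serve as a webhook payload when an alert is sent.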
Third-Party Safety Integrations
Connect Portkey Guardrails to external safety providers via a unified interface. Mix built-in guardrails with third-party checks in the same policy, applied consistently at the gateway layer regardless of which application or model is in use.
- Aporia, Pillar, and Patronus AI natively supported.
- Unified guardrail interface for built-in and partner checks.
- Pluggable architecture for additional safety vendors.
- Single policy configuration for all safety sources.
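One way to picture a unified interface is a shared protocol that both built-in checks and partner adapters implement. The sketch below is an assumption about the general pattern, not Portkey's plugin API: `Check`, `Verdict`, `MaxLengthCheck`, and `PartnerCheck` are hypothetical names, and the partner client is a plain callable standing in for a real vendor SDK.

```python
from typing import Callable, List, NamedTuple, Protocol

class Verdict(NamedTuple):
    passed: bool
    source: str  # name of the check that produced this verdict

class Check(Protocol):
    """Common shape that every check, built-in or partner, must satisfy."""
    name: str
    def evaluate(self, text: str) -> Verdict: ...

class MaxLengthCheck:
    """Example of a built-in check."""
    name = "builtin.max_length"
    def __init__(self, limit: int):
        self.limit = limit
    def evaluate(self, text: str) -> Verdict:
        return Verdict(len(text) <= self.limit, self.name)

class PartnerCheck:
    """Adapter around an external provider; 'client' is a stand-in callable."""
    def __init__(self, name: str, client: Callable[[str], bool]):
        self.name = name
        self.client = client
    def evaluate(self, text: str) -> Verdict:
        return Verdict(bool(self.client(text)), self.name)

def run_policy(checks: List[Check], text: str) -> List[Verdict]:
    """Evaluate every check in the policy and return only the failures."""
    return [v for v in (c.evaluate(text) for c in checks) if not v.passed]

policy = [MaxLengthCheck(10), PartnerCheck("partner.demo", lambda t: "bad" not in t)]
print(run_policy(policy, "this is bad text"))
```

Because the policy is just a list of objects sharing one method, adding another vendor means writing one more adapter rather than changing the policy runner.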
Protection at the Infrastructure Layer
Bolting safety onto individual applications is fragile and inconsistent. Because Guardrails sits inside the AI Gateway, policies are enforced centrally on every request – regardless of which team, framework, or model is making the call – without any changes to application code.
- Consistent enforcement across all teams and all models.
- No per-application safety code required.
- PII protection before data leaves your environment.
- Content moderation without model fine-tuning or redeployment.
Per-Team and Per-Key Configuration
Assign different guardrail policies to individual virtual keys, teams, or environments. A customer-facing application can enforce strict content rules while an internal research workload runs a lighter policy – all managed from one central dashboard.
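Layered policy assignment can be sketched as a lookup that merges defaults, team settings, and key-level overrides. The precedence order (virtual key over team over default), the policy fields, and the key names below are all assumptions made for illustration; the source does not specify how Portkey resolves overlapping policies.

```python
# All names and values below are hypothetical examples.
DEFAULT_POLICY = {"pii": "redact", "toxicity_threshold": 0.8}

TEAM_POLICIES = {
    # Customer-facing team: stricter rules.
    "support": {"pii": "block", "toxicity_threshold": 0.5},
}

KEY_POLICIES = {
    # Internal research key: lighter toxicity threshold.
    "vk-research-01": {"toxicity_threshold": 0.9},
}

KEY_TO_TEAM = {
    "vk-support-01": "support",
    "vk-research-01": "research",
}

def resolve_policy(virtual_key):
    """Merge policies with assumed precedence: key > team > default."""
    policy = dict(DEFAULT_POLICY)
    team = KEY_TO_TEAM.get(virtual_key)
    policy.update(TEAM_POLICIES.get(team, {}))
    policy.update(KEY_POLICIES.get(virtual_key, {}))
    return policy

print(resolve_policy("vk-support-01"))
print(resolve_policy("vk-research-01"))
```

Resolving the policy at request time from the virtual key is what lets rules change centrally without touching application code.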
Compliance and Audit Support
Every guardrail event is captured in a structured log with full request context. Logs are exportable for GDPR compliance reviews, SOC 2 audits, and internal security investigations. Portkey Guardrails operates within Portkey's SOC 2 Type II certified infrastructure, with a zero data retention option available.
Portkey Guardrails Specifications:
Table 1. Guardrails Performance and Coverage

| Specification | Cloud (Managed) | Self-Hosted (Enterprise) |
|---|---|---|
| PII entity types | 20+ entity types including names, emails, phone numbers, credit card numbers, and national identifiers | Same as Cloud (Managed) |
| Scanning scope | Input (prompt) and output (response) scanning on every request | Same as Cloud (Managed) |
| Model coverage | All 1,600+ models supported via the AI Gateway – no per-model configuration required | Same as Cloud (Managed) |
| Third-party integrations | Aporia, Pillar, Patronus AI – additional providers via pluggable architecture | Same as Cloud (Managed) |
| Rule types | PII detection, content moderation, toxicity/hate-speech, regex patterns, keyword blocklists, and third-party safety checks | Same as Cloud (Managed) |
| Enforcement actions | Block request, redact entity, or pass-through with log entry – configurable per rule | Same as Cloud (Managed) |
| Log retention | 30 days (extendable) | Configurable (your storage) |
| Deployment options | Managed cloud (US, EU) | Kubernetes, Docker, private VPC |

Table 2. Integration and Compatibility

| Area | Details |
|---|---|
| Gateway Integration | Built into the Portkey AI Gateway. Guardrails apply to all traffic routed through the gateway with no additional SDK changes required. |
| Authentication | Guardrail policies scoped per virtual key and per team. SSO and SAML support on Enterprise tier. |
| Alerting | Real-time trigger alerts via dashboard notification, email, and outbound webhook. |
| Observability | Full blocked-event log with 40+ metadata fields. Native integrations with Datadog, Grafana, and OpenTelemetry. |
| Compliance | SOC 2 Type II, GDPR compliant. Zero data retention (ZDR) option available. |

Table 3. Rule and Policy Capabilities

| Capability | Details |
|---|---|
| PII Detection | 20+ entity types. Prompt and response scanning. Auto-redaction or hard block. Full event log per trigger. |
| Content Moderation | Topic restriction rules, toxicity and hate-speech filters, configurable sensitivity thresholds. Works across all models without fine-tuning. |
| Custom Rules | Regex pattern matching, keyword blocklists, and domain-specific rule sets configurable per team or virtual key. |
| Third-Party Checks | Aporia, Pillar, Patronus AI and additional providers via pluggable architecture. Mix built-in and partner checks in a single policy. |
| Audit & Export | Structured blocked-event log exportable for GDPR compliance reviews, SOC 2 audits, and security investigations. |
