Portkey Prompt Engineering
Version, test, and deploy prompts from a centralised library – iterate without touching your codebase or redeploying your application
Overview:
Portkey Prompt Engineering is a centralised prompt management platform that decouples prompt iteration from the application release cycle. When prompts live in code, every change requires a deployment. Portkey's Prompt Library moves prompt development into a shared, versioned workspace so teams can experiment, test, and improve continuously – without touching the codebase.
The library integrates directly with the Portkey AI Gateway, meaning every prompt is subject to the same routing policies, caching rules, and observability as any other LLM call. Engineers, product managers, and data scientists can all collaborate on prompt development from one interface, with role-based access and full audit history.
- Full version history with one-click rollback and diff view between any two versions.
- A/B testing – split traffic between prompt versions or models and compare quality metrics in real time.
- Template variables for runtime personalisation without string interpolation in application code.
- Collaborative library with role-based access, comments, and project-level organisation.
- Performance analytics per prompt version – quality score, latency, and cost tracked side by side.
- Native integration with LangChain, LlamaIndex, and all OpenAI-compatible frameworks.
- Iterate on prompts and promote winners without a code deployment or application restart.
- Linked to the eval pipeline for structured quality measurement on every version.
Version Control
Track every change to every prompt with a full version history. Compare any two revisions in a diff view, see who made each change and when, and roll back to any prior version instantly.
- Full version history retained indefinitely.
- One-click rollback to any prior version.
- Diff view between any two versions.
- Author and timestamp tracking on every change.
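To make the diff view concrete, here is a minimal sketch of comparing two prompt revisions. It uses Python's standard `difflib` module and is only an illustration of the idea, not Portkey's implementation; the helper name `diff_prompt_versions` and the sample prompt text are hypothetical.

```python
import difflib


def diff_prompt_versions(old: str, new: str,
                         old_label: str = "v1", new_label: str = "v2") -> str:
    """Return a unified diff between two prompt revisions (empty if identical)."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines,
                             fromfile=old_label, tofile=new_label)
    )


# Hypothetical revisions of the same prompt.
v1 = "You are a support assistant.\nAnswer briefly.\n"
v2 = "You are a support assistant.\nAnswer briefly and cite the relevant help article.\n"

diff_text = diff_prompt_versions(v1, v2)
print(diff_text)
```

Lines prefixed with `-` exist only in the older revision and lines prefixed with `+` only in the newer one, which is the same information a side-by-side diff view conveys.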
A/B Testing
Split traffic between prompt versions or models and compare quality metrics in real time. Promote the winning configuration with a single click – no code changes, no redeployment required.
- Traffic splitting by configurable percentage.
- Real-time quality, latency, and cost comparison.
- Promote winner with one click.
- Test prompt versions and model swaps simultaneously.
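Percentage-based traffic splitting can be pictured with the sketch below. It assumes a deterministic, hash-based assignment so that the same user always sees the same variant; that is one common approach and not necessarily how Portkey routes traffic. The function `assign_variant` and the variant names are hypothetical.

```python
import hashlib
from collections import Counter


def assign_variant(user_id: str, splits: dict[str, int]) -> str:
    """Map a user to a variant according to percentage weights that sum to 100."""
    if sum(splits.values()) != 100:
        raise ValueError("split percentages must sum to 100")
    # Hash the user ID into a stable bucket in the range 0..99.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for variant, percentage in splits.items():
        cumulative += percentage
        if bucket < cumulative:
            return variant
    raise AssertionError("unreachable: weights sum to 100")


# Hypothetical 80/20 split between two prompt versions.
splits = {"prompt-v1": 80, "prompt-v2": 20}
counts = Counter(assign_variant(f"user-{i}", splits) for i in range(10_000))
print(counts)
```

Because the assignment depends only on the user ID and the weights, no per-user state has to be stored, and changing the weights shifts traffic without an application change.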
Template Variables
Use named dynamic variables in prompts to personalise at runtime without managing prompt string interpolation in application code. Preview prompts with test values before publishing.
- Named template variables with runtime injection.
- Preview with test values before deployment.
- Variables reusable across multiple prompts.
- No string interpolation required in application code.
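As an illustration of named variables with runtime injection, the sketch below renders a template locally. It assumes a `{{name}}` placeholder syntax, which is not stated above, and the `render_prompt` helper is hypothetical; with a hosted library this substitution would happen on the platform side rather than in application code.

```python
import re

# Matches placeholders such as {{name}} or {{ name }} (assumed syntax).
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict[str, object]) -> str:
    """Substitute named variables, failing loudly if any value is missing."""
    required = {match.group(1) for match in _PLACEHOLDER.finditer(template)}
    missing = sorted(required - set(variables))
    if missing:
        raise KeyError(f"missing template variables: {', '.join(missing)}")
    return _PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


# Preview with test values before publishing.
template = "Hello {{ name }}, your plan is {{plan}}."
preview = render_prompt(template, {"name": "Ada", "plan": "Pro"})
print(preview)
```

Checking for missing variables before substitution mirrors what a preview step is for: an incomplete set of test values is caught before the prompt reaches a model.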
Performance Analytics
Track response quality, latency, and cost per prompt version side by side. Historical trend views and integration with the eval pipeline enable data-driven decisions on what to deploy to production.
- Quality score per prompt version.
- Cost and latency tracked per prompt call.
- Historical trend view across versions.
- Linked to the Portkey eval pipeline.
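A side-by-side comparison of this kind boils down to grouping call records by prompt version and averaging each metric. The sketch below shows that aggregation; the `CallRecord` fields and every number in it are made-up illustrative values, not Portkey's log schema or real measurements.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class CallRecord:
    prompt_version: str
    quality_score: float  # hypothetical eval score between 0.0 and 1.0
    latency_ms: float
    cost_usd: float


def summarise_by_version(records: list[CallRecord]) -> dict[str, dict[str, float]]:
    """Average quality, latency, and cost for each prompt version."""
    grouped: dict[str, list[CallRecord]] = defaultdict(list)
    for record in records:
        grouped[record.prompt_version].append(record)
    return {
        version: {
            "calls": len(rows),
            "mean_quality": mean(r.quality_score for r in rows),
            "mean_latency_ms": mean(r.latency_ms for r in rows),
            "mean_cost_usd": mean(r.cost_usd for r in rows),
        }
        for version, rows in grouped.items()
    }


# Illustrative records only.
records = [
    CallRecord("v1", 0.70, 420, 0.0020),
    CallRecord("v1", 0.80, 380, 0.0024),
    CallRecord("v2", 0.85, 510, 0.0031),
    CallRecord("v2", 0.90, 490, 0.0029),
]
summary = summarise_by_version(records)
for version, stats in summary.items():
    print(version, stats)
```

Keeping the three metrics in one row per version makes trade-offs visible, for example a version that scores higher on quality but costs more per call.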
Framework Compatibility
Works natively with LangChain, LlamaIndex, and any OpenAI-compatible client. Prompts are fetched from the library at runtime via the Portkey SDK or REST API with no framework-specific integration work required.
- LangChain and LlamaIndex native support.
- OpenAI-compatible drop-in.
- REST API for custom clients.
- Python and JavaScript/TypeScript SDK support.
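For a custom client, calling a library prompt over REST might look like the sketch below, which only builds the request and does not send it. The base URL, the `/prompts/{id}/completions` path, the `x-portkey-api-key` header, and the `variables` body field are assumptions to check against the Portkey API reference; the prompt ID and variable name are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; confirm against the Portkey API reference.
PORTKEY_BASE_URL = "https://api.portkey.ai/v1"


def build_prompt_completion_request(
    prompt_id: str,
    variables: dict[str, str],
    api_key: str,
    base_url: str = PORTKEY_BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs a stored prompt."""
    body = json.dumps({"variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/prompts/{prompt_id}/completions",  # assumed path
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-portkey-api-key": api_key,  # assumed header name
        },
        method="POST",
    )


request = build_prompt_completion_request(
    prompt_id="pp-example-123",  # hypothetical prompt ID
    variables={"customer_name": "Ada"},
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

The application passes only a prompt ID and variable values; the prompt text itself stays in the library, so a new published version takes effect without a change to this code.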
Collaborative Prompt Library
A shared workspace where engineers, product managers, and data scientists can collaborate on prompt development without merge conflicts or deployment dependencies. Prompts are organised by project, with role-based access controlling who can view, edit, or publish each entry.
- Role-based access control per team and project.
- Comments and annotations on individual prompt versions.
- Shared workspace accessible across functions without engineering bottlenecks.
- Project-level organisation for multi-product environments.
Decouple Prompts from the Release Cycle
When prompts live in code, every improvement requires an engineering deployment. Portkey's Prompt Library separates prompt iteration from the release cycle entirely – teams can experiment, roll back, and promote prompts in production without waiting for a deployment window.
- Iterate prompts without a code deployment.
- Promote or roll back any version instantly.
- No application restart required on prompt change.
- Non-engineering team members can iterate independently.
Playground and Pre-Deployment Testing
Test any prompt directly in the Portkey playground before publishing. Run against multiple models, inject template variables with test values, and review the full response alongside cost and latency metrics – all before the change reaches production traffic.
Audit and Governance
Every prompt change is tracked with author attribution, timestamp, and a full diff. Combined with the Portkey AI Gateway's request logs, teams have a complete audit trail from prompt version to individual LLM call for compliance and incident review.
Portkey Prompt Engineering Specifications:
Table 1. Prompt Library Performance and Capacities

| | Cloud (Managed) | Self-Hosted (Enterprise) |
|---|---|---|
| Version history | Retained indefinitely | Retained indefinitely (your storage) |
| A/B testing | Traffic split by configurable percentage across prompt versions and models | Same as Cloud (Managed) |
| Template variables | Named runtime variables with preview, test injection, and reuse across prompts | Same as Cloud (Managed) |
| Model compatibility | All 1,600+ models supported via the Portkey AI Gateway | Same as Cloud (Managed) |
| Access control | Role-based per team and project | Role-based + SSO/SAML on Enterprise tier |
| Deployment options | Managed cloud (US, EU) | Kubernetes, Docker, private VPC |
| Log retention | 30 days (extendable) | Configurable (your storage) |
| API compatibility | OpenAI-compatible. Python and JS/TS SDKs. REST API for custom clients. | Same as Cloud (Managed) |

Table 2. Integration and Compatibility

| Area | Details |
|---|---|
| SDKs | Python and JavaScript/TypeScript SDKs. Full OpenAI SDK compatibility – prompts fetched from the library at runtime via a two-line change. |
| Agent Frameworks | LangChain, LlamaIndex, CrewAI, AutoGen, Vercel AI SDK, and all OpenAI-compatible frameworks. |
| Observability | Per-prompt analytics in the dashboard. Native integrations with Datadog, Grafana, Langfuse, and OpenTelemetry. |
| Eval Pipeline | Linked to the Portkey eval pipeline for structured quality measurement and comparison across prompt versions. |
| Compliance | SOC 2 Type II, GDPR compliant. Zero data retention (ZDR) option available. |

Table 3. Versioning and Collaboration Capabilities

| Capability | Details |
|---|---|
| Version Control | Full history with diff view, one-click rollback, author attribution, and timestamp tracking on every change. |
| A/B Testing | Configurable traffic splits across prompt versions and models. Real-time quality, latency, and cost comparison. One-click promotion. |
| Collaboration | Role-based access per team and project. Comments and annotations. Shared workspace for engineering, product, and data science. |
| Template Variables | Named variables with runtime injection, test-value preview, and reuse across multiple prompts without application code changes. |
| Analytics | Quality score, cost, and latency tracked per prompt version. Historical trend view linked to the eval pipeline. |
