chore: publish from staged

2026-04-14 04:05:58 +00:00 · 2026-04-10 04:45:41 +00:00
parent 10fda505b7
commit 8395dce14c
467 changed files with 97526 additions and 276 deletions
--- a/plugins/phoenix/skills/phoenix-evals/references/production-guardrails.md
+++ b/plugins/phoenix/skills/phoenix-evals/references/production-guardrails.md
@@ -0,0 +1,53 @@
+# Production: Guardrails vs Evaluators
+
+Guardrails run inline and block a request or response in real time. Evaluators score outputs asynchronously, after the response has been sent.
+
+## Key Distinction
+
+```
+Request → [INPUT GUARDRAIL] → LLM → [OUTPUT GUARDRAIL] → Response
+                                            │
+                                            └──→ ASYNC EVALUATOR (background)
+```
+
+## Guardrails
+
+| Aspect | Requirement |
+| ------ | ----------- |
+| Timing | Synchronous, blocking |
+| Latency | < 100 ms |
+| Purpose | Prevent harm |
+| Type | Code-based (deterministic) |
+
+**Use for:** PII detection, prompt injection, profanity, length limits, format validation.
+
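As an illustration of the code-based checks listed above, here is a minimal sketch of a deterministic input guardrail. The regexes, phrase list, character limit, and the `GuardrailResult` and `check_input` names are all hypothetical, chosen only for illustration; they are not part of Phoenix, and production PII or injection detection would need far broader coverage.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns and limits, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
INJECTION_PHRASES = ("ignore previous instructions", "disregard the system prompt")
MAX_INPUT_CHARS = 4000


@dataclass
class GuardrailResult:
    allowed: bool
    reason: str = ""


def check_input(text: str) -> GuardrailResult:
    """Run cheap, deterministic checks and block on the first failure."""
    if len(text) > MAX_INPUT_CHARS:
        return GuardrailResult(False, "length_limit")
    if EMAIL_RE.search(text) or SSN_RE.search(text):
        return GuardrailResult(False, "pii_detected")
    lowered = text.lower()
    if any(phrase in lowered for phrase in INJECTION_PHRASES):
        return GuardrailResult(False, "prompt_injection")
    return GuardrailResult(True)
```

Because each check is a length comparison, a regex search, or a substring scan, the whole function is deterministic and should fit comfortably inside a synchronous request path.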
+## Evaluators
+
+| Aspect | Characteristic |
+| ------ | -------------- |
+| Timing | Async, background |
+| Latency | Can be seconds |
+| Purpose | Measure quality |
+| Type | Can use LLMs |
+
+**Use for:** Helpfulness, faithfulness, tone, completeness, citation accuracy.
+
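The "async, background" timing above can be sketched with `asyncio`: the handler returns the response immediately and schedules the evaluator as a background task. `fake_llm`, `evaluate_helpfulness`, `handle_request`, and the in-memory `eval_log` are hypothetical stand-ins, not Phoenix APIs; a real evaluator might call an LLM judge and write to a trace store.

```python
import asyncio

eval_log: list[dict] = []  # hypothetical stand-in for a metrics or trace store


async def fake_llm(prompt: str) -> str:
    """Hypothetical stub for the model call."""
    return f"Answer to: {prompt}"


async def evaluate_helpfulness(prompt: str, response: str) -> None:
    """Hypothetical slow evaluator; the sleep simulates evaluator latency."""
    await asyncio.sleep(0.05)
    eval_log.append({"prompt": prompt, "score": 1.0 if response else 0.0})


async def handle_request(prompt: str, background: set) -> str:
    response = await fake_llm(prompt)
    # Schedule the evaluator without awaiting it, so the caller is not delayed.
    task = asyncio.create_task(evaluate_helpfulness(prompt, response))
    background.add(task)  # hold a reference so the task is not garbage collected
    task.add_done_callback(background.discard)
    return response


async def main() -> str:
    background: set = set()
    response = await handle_request("What is a guardrail?", background)
    # The response is already available here; the evaluation finishes later.
    await asyncio.gather(*background)  # drain pending evaluations before exit
    return response
```

Running `asyncio.run(main())` returns the response and leaves one scored entry in `eval_log`. In a long-running service the final `gather` would typically happen at shutdown rather than per request.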
+## Decision
+
+| Question | Answer |
+| -------- | ------ |
+| Must block harmful content? | Guardrail |
+| Measuring quality? | Evaluator |
+| Need LLM judgment? | Evaluator |
+| < 100ms required? | Guardrail |
+| Would false positives frustrate users? | Evaluator |
+
+## LLM Guardrails: Rarely
+
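+Use an LLM guardrail only if all of the following hold:
+- The latency budget exceeds 1 s
+- The cost of an error far exceeds the cost of the LLM call
+- Request volume is low
+- A fallback exists for when the LLM call fails or times out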
+
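One way to honor the latency-budget and fallback conditions above is to wrap the LLM check in a timeout and fall back to a deterministic check. This is a sketch under stated assumptions: `guarded_check`, `slow_llm_check`, and `keyword_fallback` are hypothetical names, and the stub judge only simulates latency rather than calling a model.

```python
import asyncio
from typing import Awaitable, Callable


async def guarded_check(
    text: str,
    llm_check: Callable[[str], Awaitable[bool]],
    fallback_check: Callable[[str], bool],
    budget_s: float = 1.0,
) -> bool:
    """Return True if text is allowed; use the fallback if the LLM check is slow or fails."""
    try:
        return await asyncio.wait_for(llm_check(text), timeout=budget_s)
    except Exception:
        # Covers both the timeout raised by wait_for and errors from the LLM call.
        return fallback_check(text)


async def slow_llm_check(text: str) -> bool:
    """Hypothetical LLM judge; the sleep simulates model latency."""
    await asyncio.sleep(0.5)
    return True


def keyword_fallback(text: str) -> bool:
    """Hypothetical deterministic fallback: block on a simple keyword match."""
    return "attack" not in text.lower()
```

With `budget_s=0.05` the stub judge times out and the keyword fallback decides; with a larger budget the judge's verdict is returned. Whether the fallback should lean toward blocking or allowing is a policy choice that depends on the cost of an error.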
+**Key Principle:** Guardrails prevent harm (block). Evaluators measure quality (log).