chore: publish from staged

This commit is contained in:
github-actions[bot]
2026-04-09 06:26:21 +00:00
parent 017f31f495
commit a68b190031
467 changed files with 97527 additions and 276 deletions


@@ -0,0 +1,24 @@
# Phoenix Tracing Skill
OpenInference semantic conventions and instrumentation guides for Phoenix.
## Usage
Start with `SKILL.md` for the index and quick reference.
## File Organization
All files in flat `rules/` directory with semantic prefixes:
- `span-*` - Span kinds (LLM, CHAIN, TOOL, etc.)
- `setup-*`, `instrumentation-*` - Getting started guides
- `fundamentals-*`, `attributes-*` - Reference docs
- `annotations-*`, `export-*` - Advanced features
## Reference
- [OpenInference Spec](https://github.com/Arize-ai/openinference/tree/main/spec)
- [Phoenix Documentation](https://docs.arize.com/phoenix)
- [Python OTEL API](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/)
- [Python Client API](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)
- [TypeScript API](https://arize-ai.github.io/phoenix/)


@@ -0,0 +1,139 @@
---
name: phoenix-tracing
description: OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production.
license: Apache-2.0
compatibility: Requires Phoenix server. Python skills need arize-phoenix-otel; TypeScript skills need @arizeai/phoenix-otel.
metadata:
author: oss@arize.com
version: "1.0.0"
languages: "Python, TypeScript"
---
# Phoenix Tracing
Comprehensive guide for instrumenting LLM applications with OpenInference tracing in Phoenix. Contains reference files covering setup, instrumentation, span types, and production deployment.
## When to Apply
Reference these guidelines when:
- Setting up Phoenix tracing (Python or TypeScript)
- Creating custom spans for LLM operations
- Adding attributes following OpenInference conventions
- Deploying tracing to production
- Querying and analyzing trace data
## Reference Categories
| Priority | Category | Description | Prefix |
| -------- | --------------- | ------------------------------ | -------------------------- |
| 1 | Setup | Installation and configuration | `setup-*` |
| 2 | Instrumentation | Auto and manual tracing | `instrumentation-*` |
| 3 | Span Types | 9 span kinds with attributes | `span-*` |
| 4 | Organization | Projects and sessions | `projects-*`, `sessions-*` |
| 5 | Enrichment | Custom metadata | `metadata-*` |
| 6 | Production | Batch processing, masking | `production-*` |
| 7 | Feedback | Annotations and evaluation | `annotations-*` |
## Quick Reference
### 1. Setup (START HERE)
- [setup-python](references/setup-python.md) - Install arize-phoenix-otel, configure endpoint
- [setup-typescript](references/setup-typescript.md) - Install @arizeai/phoenix-otel, configure endpoint
### 2. Instrumentation
- [instrumentation-auto-python](references/instrumentation-auto-python.md) - Auto-instrument OpenAI, LangChain, etc.
- [instrumentation-auto-typescript](references/instrumentation-auto-typescript.md) - Auto-instrument supported frameworks
- [instrumentation-manual-python](references/instrumentation-manual-python.md) - Custom spans with decorators
- [instrumentation-manual-typescript](references/instrumentation-manual-typescript.md) - Custom spans with wrappers
### 3. Span Types (with full attribute schemas)
- [span-llm](references/span-llm.md) - LLM API calls (model, tokens, messages, cost)
- [span-chain](references/span-chain.md) - Multi-step workflows and pipelines
- [span-retriever](references/span-retriever.md) - Document retrieval (documents, scores)
- [span-tool](references/span-tool.md) - Function/API calls (name, parameters)
- [span-agent](references/span-agent.md) - Multi-step reasoning agents
- [span-embedding](references/span-embedding.md) - Vector generation
- [span-reranker](references/span-reranker.md) - Document re-ranking
- [span-guardrail](references/span-guardrail.md) - Safety checks
- [span-evaluator](references/span-evaluator.md) - LLM evaluation
### 4. Organization
- [projects-python](references/projects-python.md) / [projects-typescript](references/projects-typescript.md) - Group traces by application
- [sessions-python](references/sessions-python.md) / [sessions-typescript](references/sessions-typescript.md) - Track conversations
### 5. Enrichment
- [metadata-python](references/metadata-python.md) / [metadata-typescript](references/metadata-typescript.md) - Custom attributes
### 6. Production (CRITICAL)
- [production-python](references/production-python.md) / [production-typescript](references/production-typescript.md) - Batch processing, PII masking
### 7. Feedback
- [annotations-overview](references/annotations-overview.md) - Feedback concepts
- [annotations-python](references/annotations-python.md) / [annotations-typescript](references/annotations-typescript.md) - Add feedback to spans
### Reference Files
- [fundamentals-overview](references/fundamentals-overview.md) - Traces, spans, attributes basics
- [fundamentals-required-attributes](references/fundamentals-required-attributes.md) - Required fields per span type
- [fundamentals-universal-attributes](references/fundamentals-universal-attributes.md) - Common attributes (user.id, session.id)
- [fundamentals-flattening](references/fundamentals-flattening.md) - JSON flattening rules
- [attributes-messages](references/attributes-messages.md) - Chat message format
- [attributes-metadata](references/attributes-metadata.md) - Custom metadata schema
- [attributes-graph](references/attributes-graph.md) - Agent workflow attributes
- [attributes-exceptions](references/attributes-exceptions.md) - Error tracking
## Common Workflows
- **Quick Start**: setup-{lang} → instrumentation-auto-{lang} → Check Phoenix
- **Custom Spans**: setup-{lang} → instrumentation-manual-{lang} → span-{type}
- **Session Tracking**: sessions-{lang} for conversation grouping patterns
- **Production**: production-{lang} for batching, masking, and deployment
## How to Use This Skill
**Navigation Patterns:**
```bash
# By category prefix
references/setup-* # Installation and configuration
references/instrumentation-* # Auto and manual tracing
references/span-* # Span type specifications
references/sessions-* # Session tracking
references/production-* # Production deployment
references/fundamentals-* # Core concepts
references/attributes-* # Attribute specifications
# By language
references/*-python.md # Python implementations
references/*-typescript.md # TypeScript implementations
```
**Reading Order:**
1. Start with setup-{lang} for your language
2. Choose instrumentation-auto-{lang} OR instrumentation-manual-{lang}
3. Reference span-{type} files as needed for specific operations
4. See fundamentals-* files for attribute specifications
## References
**Phoenix Documentation:**
- [Phoenix Documentation](https://docs.arize.com/phoenix)
- [OpenInference Spec](https://github.com/Arize-ai/openinference/tree/main/spec)
**Python API Documentation:**
- [Python OTEL Package](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/) - `arize-phoenix-otel` API reference
- [Python Client Package](https://arize-phoenix.readthedocs.io/projects/client/en/latest/) - `arize-phoenix-client` API reference
**TypeScript API Documentation:**
- [TypeScript Packages](https://arize-ai.github.io/phoenix/) - `@arizeai/phoenix-otel`, `@arizeai/phoenix-client`, and other TypeScript packages


@@ -0,0 +1,69 @@
# Annotations Overview
Annotations allow you to add human or automated feedback to traces, spans, documents, and sessions. Annotations are essential for evaluation, quality assessment, and building training datasets.
## Annotation Types
Phoenix supports four types of annotations:
| Type | Target | Purpose | Example Use Case |
| ----------------------- | -------------------------------- | ---------------------------------------- | -------------------------------- |
| **Span Annotation** | Individual span | Feedback on a specific operation | "This LLM response was accurate" |
| **Document Annotation** | Document within a RETRIEVER span | Feedback on retrieved document relevance | "This document was not helpful" |
| **Trace Annotation** | Entire trace | Feedback on end-to-end interaction | "User was satisfied with result" |
| **Session Annotation** | User session | Feedback on multi-turn conversation | "Session ended successfully" |
## Annotation Fields
Every annotation has these fields:
### Required Fields
| Field | Type | Description |
| --------- | ------ | ----------------------------------------------------------------------------- |
| Entity ID | String | ID of the target entity (span_id, trace_id, session_id, or document_position) |
| `name` | String | Annotation name/label (e.g., "quality", "relevance", "helpfulness") |
### Result Fields (At Least One Required)
| Field | Type | Description |
| ------------- | ----------------- | ----------------------------------------------------------------- |
| `label` | String (optional) | Categorical value (e.g., "good", "bad", "relevant", "irrelevant") |
| `score` | Float (optional) | Numeric value (typically 0-1, but can be any range) |
| `explanation` | String (optional) | Free-text explanation of the annotation |
**At least one** of `label`, `score`, or `explanation` must be provided.
### Optional Fields
| Field | Type | Description |
| ---------------- | ------ | --------------------------------------------------------------------------------------- |
| `annotator_kind` | String | Who created this annotation: "HUMAN", "LLM", or "CODE" (default: "HUMAN") |
| `identifier` | String | Unique identifier for upsert behavior (updates existing if same name+entity+identifier) |
| `metadata` | Object | Custom metadata as key-value pairs |
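Putting the fields above together, a complete span annotation might look like the following. This is an illustrative sketch only; the exact parameter names accepted by each client are shown in the language-specific files.

```python
# Illustrative annotation record combining required, result, and optional fields.
annotation = {
    "span_id": "abc123",            # entity ID (required)
    "name": "quality",              # annotation name (required)
    "label": "good",                # at least one of label/score/explanation
    "score": 0.9,
    "explanation": "Accurate and well-formatted answer",
    "annotator_kind": "LLM",        # HUMAN | LLM | CODE
    "identifier": "eval-run-42",    # same name+entity+identifier => upsert
    "metadata": {"model": "gpt-4"},
}

# The result-field rule: at least one of label, score, explanation is set.
assert any(k in annotation for k in ("label", "score", "explanation"))
```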
## Annotator Kinds
| Kind | Description | Example |
| ------- | ------------------------------ | --------------------------------- |
| `HUMAN` | Manual feedback from a person | User ratings, expert labels |
| `LLM` | Automated feedback from an LLM | GPT-4 evaluating response quality |
| `CODE` | Automated feedback from code | Rule-based checks, heuristics |
## Examples
**Quality Assessment:**
- `quality` - Overall quality (label: good/fair/poor, score: 0-1)
- `correctness` - Factual accuracy (label: correct/incorrect, score: 0-1)
- `helpfulness` - User satisfaction (label: helpful/not_helpful, score: 0-1)
**RAG-Specific:**
- `relevance` - Document relevance to query (label: relevant/irrelevant, score: 0-1)
- `faithfulness` - Answer grounded in context (label: faithful/unfaithful, score: 0-1)
**Safety:**
- `toxicity` - Contains harmful content (score: 0-1)
- `pii_detected` - Contains personally identifiable information (label: yes/no)


@@ -0,0 +1,114 @@
# Python SDK Annotation Patterns
Add feedback to spans, traces, documents, and sessions using the Python client.
## Client Setup
```python
from phoenix.client import Client
client = Client() # Default: http://localhost:6006
```
## Span Annotations
Add feedback to individual spans:
```python
client.spans.add_span_annotation(
span_id="abc123",
annotation_name="quality",
annotator_kind="HUMAN",
label="high_quality",
score=0.95,
explanation="Accurate and well-formatted",
metadata={"reviewer": "alice"},
sync=True
)
```
## Document Annotations
Rate individual documents in RETRIEVER spans:
```python
client.spans.add_document_annotation(
span_id="retriever_span",
document_position=0, # 0-based index
annotation_name="relevance",
annotator_kind="LLM",
label="relevant",
score=0.95
)
```
## Trace Annotations
Feedback on entire traces:
```python
client.traces.add_trace_annotation(
trace_id="trace_abc",
annotation_name="correctness",
annotator_kind="HUMAN",
label="correct",
score=1.0
)
```
## Session Annotations
Feedback on multi-turn conversations:
```python
client.sessions.add_session_annotation(
session_id="session_xyz",
annotation_name="user_satisfaction",
annotator_kind="HUMAN",
label="satisfied",
score=0.85
)
```
## RAG Pipeline Example
```python
from phoenix.client import Client
from phoenix.client.resources.spans import SpanDocumentAnnotationData
client = Client()
# Document relevance (batch)
client.spans.log_document_annotations(
document_annotations=[
SpanDocumentAnnotationData(
name="relevance", span_id="retriever_span", document_position=i,
annotator_kind="LLM", result={"label": label, "score": score}
)
for i, (label, score) in enumerate([
("relevant", 0.95), ("relevant", 0.80), ("irrelevant", 0.10)
])
]
)
# LLM response quality
client.spans.add_span_annotation(
span_id="llm_span",
annotation_name="faithfulness",
annotator_kind="LLM",
label="faithful",
score=0.90
)
# Overall trace quality
client.traces.add_trace_annotation(
trace_id="trace_123",
annotation_name="correctness",
annotator_kind="HUMAN",
label="correct",
score=1.0
)
```
## API Reference
- [Python Client API](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)


@@ -0,0 +1,137 @@
# TypeScript SDK Annotation Patterns
Add feedback to spans, traces, documents, and sessions using the TypeScript client.
## Client Setup
```typescript
import { createClient } from "@arizeai/phoenix-client";
const client = createClient(); // Default: http://localhost:6006
```
## Span Annotations
Add feedback to individual spans:
```typescript
import { addSpanAnnotation } from "@arizeai/phoenix-client";
await addSpanAnnotation({
client,
spanAnnotation: {
spanId: "abc123",
name: "quality",
annotatorKind: "HUMAN",
label: "high_quality",
score: 0.95,
explanation: "Accurate and well-formatted",
metadata: { reviewer: "alice" }
},
sync: true
});
```
## Document Annotations
Rate individual documents in RETRIEVER spans:
```typescript
import { addDocumentAnnotation } from "@arizeai/phoenix-client";
await addDocumentAnnotation({
client,
documentAnnotation: {
spanId: "retriever_span",
documentPosition: 0, // 0-based index
name: "relevance",
annotatorKind: "LLM",
label: "relevant",
score: 0.95
}
});
```
## Trace Annotations
Feedback on entire traces:
```typescript
import { addTraceAnnotation } from "@arizeai/phoenix-client";
await addTraceAnnotation({
client,
traceAnnotation: {
traceId: "trace_abc",
name: "correctness",
annotatorKind: "HUMAN",
label: "correct",
score: 1.0
}
});
```
## Session Annotations
Feedback on multi-turn conversations:
```typescript
import { addSessionAnnotation } from "@arizeai/phoenix-client";
await addSessionAnnotation({
client,
sessionAnnotation: {
sessionId: "session_xyz",
name: "user_satisfaction",
annotatorKind: "HUMAN",
label: "satisfied",
score: 0.85
}
});
```
## RAG Pipeline Example
```typescript
import { createClient, logDocumentAnnotations, addSpanAnnotation, addTraceAnnotation } from "@arizeai/phoenix-client";
const client = createClient();
// Document relevance (batch)
await logDocumentAnnotations({
client,
documentAnnotations: [
{ spanId: "retriever_span", documentPosition: 0, name: "relevance",
annotatorKind: "LLM", label: "relevant", score: 0.95 },
{ spanId: "retriever_span", documentPosition: 1, name: "relevance",
annotatorKind: "LLM", label: "relevant", score: 0.80 }
]
});
// LLM response quality
await addSpanAnnotation({
client,
spanAnnotation: {
spanId: "llm_span",
name: "faithfulness",
annotatorKind: "LLM",
label: "faithful",
score: 0.90
}
});
// Overall trace quality
await addTraceAnnotation({
client,
traceAnnotation: {
traceId: "trace_123",
name: "correctness",
annotatorKind: "HUMAN",
label: "correct",
score: 1.0
}
});
```
## API Reference
- [TypeScript Client API](https://arize-ai.github.io/phoenix/)


@@ -0,0 +1,58 @@
# Flattening Convention
OpenInference flattens nested data structures into dot-notation attributes for database compatibility, OpenTelemetry compatibility, and simple querying.
## Flattening Rules
**Objects → Dot Notation**
```javascript
{ llm: { model_name: "gpt-4", token_count: { prompt: 10, completion: 20 } } }
// becomes
{ "llm.model_name": "gpt-4", "llm.token_count.prompt": 10, "llm.token_count.completion": 20 }
```
**Arrays → Zero-Indexed Notation**
```javascript
{ llm: { input_messages: [{ role: "user", content: "Hi" }] } }
// becomes
{ "llm.input_messages.0.message.role": "user", "llm.input_messages.0.message.content": "Hi" }
```
**Message Convention: `.message.` segment required**
```
llm.input_messages.{index}.message.{field}
llm.input_messages.0.message.tool_calls.0.tool_call.function.name
```
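The object and array rules above can be sketched as a small recursive helper. This is a generic illustration, not the library's implementation; note that it omits OpenInference-specific wrapping such as the `.message.` segment, which the semantic conventions add on top of plain flattening.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into dot-notation keys, zero-indexing arrays."""
    if isinstance(obj, dict):
        pairs = list(obj.items())
    elif isinstance(obj, list):
        pairs = [(str(i), v) for i, v in enumerate(obj)]
    else:
        # Leaf value: emit the accumulated dotted path.
        return {prefix: obj}
    out = {}
    for key, value in pairs:
        path = f"{prefix}.{key}" if prefix else key
        out.update(flatten(value, path))
    return out
```

For example, `flatten({"llm": {"token_count": {"prompt": 10}}})` yields `{"llm.token_count.prompt": 10}`.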
## Complete Example
```javascript
// Original
{
openinference: { span: { kind: "LLM" } },
llm: {
model_name: "claude-3-5-sonnet-20241022",
invocation_parameters: { temperature: 0.7, max_tokens: 1000 },
input_messages: [{ role: "user", content: "Tell me a joke" }],
output_messages: [{ role: "assistant", content: "Why did the chicken cross the road?" }],
token_count: { prompt: 5, completion: 10, total: 15 }
}
}
// Flattened (stored in Phoenix spans.attributes JSONB)
{
"openinference.span.kind": "LLM",
"llm.model_name": "claude-3-5-sonnet-20241022",
"llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1000}",
"llm.input_messages.0.message.role": "user",
"llm.input_messages.0.message.content": "Tell me a joke",
"llm.output_messages.0.message.role": "assistant",
"llm.output_messages.0.message.content": "Why did the chicken cross the road?",
"llm.token_count.prompt": 5,
"llm.token_count.completion": 10,
"llm.token_count.total": 15
}
```


@@ -0,0 +1,53 @@
# Overview and Traces & Spans
This document covers the fundamental concepts of OpenInference traces and spans in Phoenix.
## Overview
OpenInference is a set of semantic conventions for AI and LLM applications based on OpenTelemetry. Phoenix uses these conventions to capture, store, and analyze traces from AI applications.
**Key Concepts:**
- **Traces** represent end-to-end requests through your application
- **Spans** represent individual operations within a trace (LLM calls, retrievals, tool invocations)
- **Attributes** are key-value pairs attached to spans using flattened, dot-notation paths
- **Span Kinds** categorize the type of operation (LLM, RETRIEVER, TOOL, etc.)
## Traces and Spans
### Trace Hierarchy
A **trace** is a tree of **spans** representing a complete request:
```
Trace ID: abc123
├─ Span 1: CHAIN (root span, parent_id = null)
│ ├─ Span 2: RETRIEVER (parent_id = span_1_id)
│ │ └─ Span 3: EMBEDDING (parent_id = span_2_id)
│ └─ Span 4: LLM (parent_id = span_1_id)
│ └─ Span 5: TOOL (parent_id = span_4_id)
```
### Context Propagation
Spans maintain parent-child relationships via:
- `trace_id` - Same for all spans in a trace
- `span_id` - Unique identifier for this span
- `parent_id` - References parent span's `span_id` (null for root spans)
Phoenix uses these relationships to:
- Build the span tree visualization in the UI
- Calculate cumulative metrics (tokens, errors) up the tree
- Enable nested querying (e.g., "find CHAIN spans containing LLM spans with errors")
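The parent-child reconstruction described above can be sketched in a few lines: index spans by `parent_id`, then walk down from the roots. This is an illustrative sketch of the concept, not Phoenix's internal code.

```python
from collections import defaultdict

def build_tree(spans):
    """Index spans by parent_id so the span tree can be walked from its roots.

    Each span is a dict with at least span_id and parent_id (None for roots).
    """
    children = defaultdict(list)
    roots = []
    for span in spans:
        if span["parent_id"] is None:
            roots.append(span)
        else:
            children[span["parent_id"]].append(span)
    return roots, children
```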
### Span Lifecycle
Each span has:
- `start_time` - When the operation began (Unix timestamp in nanoseconds)
- `end_time` - When the operation completed
- `status_code` - OK, ERROR, or UNSET
- `status_message` - Optional error message
- `attributes` - object with all semantic convention attributes


@@ -0,0 +1,64 @@
# Required and Recommended Attributes
This document covers the single required attribute and the highly recommended attributes for all OpenInference spans.
## Required Attribute
**Every span MUST have exactly one required attribute:**
```json
{
"openinference.span.kind": "LLM"
}
```
## Highly Recommended Attributes
While not strictly required, these attributes are **highly recommended** on all spans as they:
- Enable evaluation and quality assessment
- Help understand information flow through your application
- Make traces more useful for debugging
### Input/Output Values
| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Input to the operation (prompt, query, document) |
| `output.value` | String | Output from the operation (response, result, answer) |
**Example:**
```json
{
"openinference.span.kind": "LLM",
"input.value": "What is the capital of France?",
"output.value": "The capital of France is Paris."
}
```
**Why these matter:**
- **Evaluations**: Many evaluators (faithfulness, relevance, hallucination detection) require both input and output to assess quality
- **Information flow**: Seeing inputs/outputs makes it easy to trace how data transforms through your application
- **Debugging**: When something goes wrong, having the actual input/output makes root cause analysis much faster
- **Analytics**: Enables pattern analysis across similar inputs or outputs
**Phoenix Behavior:**
- Input/output displayed prominently in span details
- Evaluators can automatically access these values
- Search/filter traces by input or output content
- Export inputs/outputs for fine-tuning datasets
## Valid Span Kinds
There are exactly **9 valid span kinds** in OpenInference:
| Span Kind | Purpose | Common Use Case |
|-----------|---------|-----------------|
| `LLM` | Language model inference | OpenAI, Anthropic, local LLM calls |
| `EMBEDDING` | Vector generation | Text-to-vector conversion |
| `CHAIN` | Application flow orchestration | LangChain chains, custom workflows |
| `RETRIEVER` | Document/context retrieval | Vector DB queries, semantic search |
| `RERANKER` | Result reordering | Rerank retrieved documents |
| `TOOL` | External tool invocation | API calls, function execution |
| `AGENT` | Autonomous reasoning | ReAct agents, planning loops |
| `GUARDRAIL` | Safety/policy checks | Content moderation, PII detection |
| `EVALUATOR` | Quality assessment | Answer relevance, faithfulness scoring |
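A simple validity check follows directly from the two rules above (exactly one required attribute, drawn from the nine kinds). The helper name is illustrative, not part of any Phoenix API.

```python
# The 9 valid OpenInference span kinds.
VALID_SPAN_KINDS = {
    "LLM", "EMBEDDING", "CHAIN", "RETRIEVER", "RERANKER",
    "TOOL", "AGENT", "GUARDRAIL", "EVALUATOR",
}

def check_required(attributes: dict) -> None:
    """Raise if the one required OpenInference attribute is missing or invalid."""
    kind = attributes.get("openinference.span.kind")
    if kind not in VALID_SPAN_KINDS:
        raise ValueError(f"invalid or missing span kind: {kind!r}")
```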


@@ -0,0 +1,72 @@
# Universal Attributes
This document covers attributes that can be used on any span kind in OpenInference.
## Overview
These attributes can be used on **any span kind** to provide additional context, tracking, and metadata.
## Input/Output
| Attribute | Type | Description |
| ------------------ | ------ | ---------------------------------------------------- |
| `input.value` | String | Input to the operation (prompt, query, document) |
| `input.mime_type` | String | MIME type (e.g., "text/plain", "application/json") |
| `output.value` | String | Output from the operation (response, vector, result) |
| `output.mime_type` | String | MIME type of output |
### Why Capture I/O?
**Always capture input/output for evaluation-ready spans:**
- Phoenix evaluators (faithfulness, relevance, Q&A correctness) require `input.value` and `output.value`
- Phoenix UI displays I/O prominently in trace views for debugging
- Enables exporting I/O for creating fine-tuning datasets
- Provides complete context for analyzing agent behavior
**Example attributes:**
```json
{
"openinference.span.kind": "CHAIN",
"input.value": "What is the weather?",
"input.mime_type": "text/plain",
"output.value": "I don't have access to weather data.",
"output.mime_type": "text/plain"
}
```
**See language-specific implementation:**
- TypeScript: `instrumentation-manual-typescript.md`
- Python: `instrumentation-manual-python.md`
## Session and User Tracking
| Attribute | Type | Description |
| ------------ | ------ | ---------------------------------------------- |
| `session.id` | String | Session identifier for grouping related traces |
| `user.id` | String | User identifier for per-user analysis |
**Example:**
```json
{
"openinference.span.kind": "LLM",
"session.id": "session_abc123",
"user.id": "user_xyz789"
}
```
## Metadata
| Attribute | Type | Description |
| ---------- | ------ | ------------------------------------------ |
| `metadata` | String | JSON-serialized object of key-value pairs |
**Example:**
```json
{
"openinference.span.kind": "LLM",
"metadata": "{\"environment\": \"production\", \"model_version\": \"v2.1\", \"cost_center\": \"engineering\"}"
}
```
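Because `metadata` is a JSON string rather than a nested object, it must be serialized before being set on a span. A trivial sketch (the helper name is hypothetical):

```python
import json

def metadata_attribute(pairs: dict) -> str:
    """Serialize metadata to the JSON-string form OpenInference expects."""
    return json.dumps(pairs)

# span.set_attribute("metadata", metadata_attribute({"environment": "production"}))
```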


@@ -0,0 +1,85 @@
# Phoenix Tracing: Auto-Instrumentation (Python)
**Automatically create spans for LLM calls without code changes.**
## Overview
Auto-instrumentation patches supported libraries at runtime to create spans automatically. Use it for supported frameworks (LangChain, LlamaIndex, the OpenAI SDK, etc.); for custom logic, see `instrumentation-manual-python.md`.
## Supported Frameworks
**Python:**
- LLM SDKs: OpenAI, Anthropic, Bedrock, Mistral, Vertex AI, Groq, Ollama
- Frameworks: LangChain, LlamaIndex, DSPy, CrewAI, Instructor, Haystack
- Install: `pip install openinference-instrumentation-{name}`
## Setup
**Install and enable:**
```bash
pip install arize-phoenix-otel
pip install openinference-instrumentation-openai # Add others as needed
```
```python
from phoenix.otel import register
register(project_name="my-app", auto_instrument=True) # Discovers all installed instrumentors
```
**Example:**
```python
from phoenix.otel import register
from openai import OpenAI
register(project_name="my-app", auto_instrument=True)
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello!"}]
)
```
Traces appear in the Phoenix UI with model, input/output, token counts, and timing captured automatically. See the span kind files for full attribute schemas.
**Selective instrumentation** (explicit control):
```python
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
tracer_provider = register(project_name="my-app") # No auto_instrument
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```
## Limitations
Auto-instrumentation does NOT capture:
- Custom business logic
- Internal function calls
**Example:**
```python
def my_custom_workflow(query: str) -> str:
preprocessed = preprocess(query) # Not traced
response = client.chat.completions.create(...) # Traced (auto)
postprocessed = postprocess(response) # Not traced
return postprocessed
```
**Solution:** Add manual instrumentation:
```python
@tracer.chain
def my_custom_workflow(query: str) -> str:
preprocessed = preprocess(query)
response = client.chat.completions.create(...)
postprocessed = postprocess(response)
return postprocessed
```


@@ -0,0 +1,87 @@
# Auto-Instrumentation (TypeScript)
Automatically create spans for LLM calls without code changes.
## Supported Frameworks
- **LLM SDKs:** OpenAI
- **Frameworks:** LangChain
- **Install:** `npm install @arizeai/openinference-instrumentation-{name}`
## Setup
**CommonJS (automatic):**
```javascript
const { register } = require("@arizeai/phoenix-otel");
const OpenAI = require("openai");
register({ projectName: "my-app" });
const client = new OpenAI();
```
**ESM (manual required):**
```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";
register({ projectName: "my-app" });
const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```
**Why:** ESM imports are hoisted before `register()` runs.
## Limitations
**What auto-instrumentation does NOT capture:**
```typescript
async function myWorkflow(query: string): Promise<string> {
const preprocessed = await preprocess(query); // Not traced
const response = await client.chat.completions.create(...); // Traced (auto)
const postprocessed = await postprocess(response); // Not traced
return postprocessed;
}
```
**Solution:** Add manual instrumentation for custom logic:
```typescript
import { traceChain } from "@arizeai/openinference-core";
const myWorkflow = traceChain(
async (query: string): Promise<string> => {
const preprocessed = await preprocess(query);
const response = await client.chat.completions.create(...);
const postprocessed = await postprocess(response);
return postprocessed;
},
{ name: "my-workflow" }
);
```
## Combining Auto + Manual
```typescript
import { register } from "@arizeai/phoenix-otel";
import { traceChain } from "@arizeai/openinference-core";
register({ projectName: "my-app" });
const client = new OpenAI();
const workflow = traceChain(
async (query: string) => {
const preprocessed = await preprocess(query);
const response = await client.chat.completions.create(...); // Auto-instrumented
return postprocess(response);
},
{ name: "my-workflow" }
);
```


@@ -0,0 +1,182 @@
# Manual Instrumentation (Python)
Add custom spans using decorators or context managers for fine-grained tracing control.
## Setup
```bash
pip install arize-phoenix-otel
```
```python
from phoenix.otel import register
tracer_provider = register(project_name="my-app")
tracer = tracer_provider.get_tracer(__name__)
```
## Quick Reference
| Span Kind | Decorator | Use Case |
|-----------|-----------|----------|
| CHAIN | `@tracer.chain` | Orchestration, workflows, pipelines |
| RETRIEVER | `@tracer.retriever` | Vector search, document retrieval |
| TOOL | `@tracer.tool` | External API calls, function execution |
| AGENT | `@tracer.agent` | Multi-step reasoning, planning |
| LLM | `@tracer.llm` | LLM API calls (manual only) |
| EMBEDDING | `@tracer.embedding` | Embedding generation |
| RERANKER | `@tracer.reranker` | Document re-ranking |
| GUARDRAIL | `@tracer.guardrail` | Safety checks, content moderation |
| EVALUATOR | `@tracer.evaluator` | LLM evaluation, quality checks |
## Decorator Approach (Recommended)
**Use for:** Full function instrumentation, automatic I/O capture
```python
@tracer.chain
def rag_pipeline(query: str) -> str:
docs = retrieve_documents(query)
ranked = rerank(docs, query)
return generate_response(ranked, query)
@tracer.retriever
def retrieve_documents(query: str) -> list[dict]:
results = vector_db.search(query, top_k=5)
return [{"content": doc.text, "score": doc.score} for doc in results]
@tracer.tool
def get_weather(city: str) -> str:
response = requests.get(f"https://api.weather.com/{city}")
return response.json()["weather"]
```
**Custom span names:**
```python
@tracer.chain(name="rag-pipeline-v2")
def my_workflow(query: str) -> str:
return process(query)
```
## Context Manager Approach
**Use for:** Partial function instrumentation, custom attributes, dynamic control
```python
from opentelemetry.trace import Status, StatusCode
import json
def retrieve_with_metadata(query: str):
with tracer.start_as_current_span(
"vector_search",
openinference_span_kind="retriever"
) as span:
span.set_attribute("input.value", query)
results = vector_db.search(query, top_k=5)
documents = [
{
"document.id": doc.id,
"document.content": doc.text,
"document.score": doc.score
}
for doc in results
]
span.set_attribute("retrieval.documents", json.dumps(documents))
span.set_status(Status(StatusCode.OK))
return documents
```
## Capturing Input/Output
**Always capture I/O for evaluation-ready spans.**
### Automatic I/O Capture (Decorators)
Decorators automatically capture input arguments and return values:
```python
@tracer.chain
def handle_query(user_input: str) -> str:
result = agent.generate(user_input)
return result.text
# Automatically captures:
# - input.value: user_input
# - output.value: result.text
# - input.mime_type / output.mime_type: auto-detected
```
### Manual I/O Capture (Context Manager)
Use `set_input()` and `set_output()` for simple I/O capture:
```python
from opentelemetry.trace import Status, StatusCode
def handle_query(user_input: str) -> str:
with tracer.start_as_current_span(
"query.handler",
openinference_span_kind="chain"
) as span:
span.set_input(user_input)
result = agent.generate(user_input)
span.set_output(result.text)
span.set_status(Status(StatusCode.OK))
return result.text
```
**What gets captured:**
```json
{
"input.value": "What is 2+2?",
"input.mime_type": "text/plain",
"output.value": "2+2 equals 4.",
"output.mime_type": "text/plain"
}
```
**Why this matters:**
- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets
### Custom I/O with Additional Metadata
Use `set_attribute()` for custom attributes alongside I/O:
```python
def process_query(query: str):
with tracer.start_as_current_span(
"query.process",
openinference_span_kind="chain"
) as span:
# Standard I/O
span.set_input(query)
# Custom metadata
span.set_attribute("input.length", len(query))
result = llm.generate(query)
# Standard output
span.set_output(result.text)
# Custom metadata
span.set_attribute("output.tokens", result.usage.total_tokens)
span.set_status(Status(StatusCode.OK))
return result
```
## See Also
- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, `span-llm.md`, `span-agent.md`, `span-embedding.md`, `span-reranker.md`, `span-guardrail.md`, `span-evaluator.md`
- **Auto-instrumentation:** `instrumentation-auto-python.md` for framework integrations
- **API docs:** https://docs.arize.com/phoenix/tracing/manual-instrumentation

# Manual Instrumentation (TypeScript)
Add custom spans using convenience wrappers or `withSpan` for fine-grained tracing control.
## Setup
```bash
npm install @arizeai/phoenix-otel @arizeai/openinference-core
```
```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app" });
```
## Quick Reference
| Span Kind | Method | Use Case |
|-----------|--------|----------|
| CHAIN | `traceChain` | Workflows, pipelines, orchestration |
| AGENT | `traceAgent` | Multi-step reasoning, planning |
| TOOL | `traceTool` | External APIs, function calls |
| RETRIEVER | `withSpan` | Vector search, document retrieval |
| LLM | `withSpan` | LLM API calls (prefer auto-instrumentation) |
| EMBEDDING | `withSpan` | Embedding generation |
| RERANKER | `withSpan` | Document re-ranking |
| GUARDRAIL | `withSpan` | Safety checks, content moderation |
| EVALUATOR | `withSpan` | LLM evaluation |
## Convenience Wrappers
```typescript
import { traceChain, traceAgent, traceTool } from "@arizeai/openinference-core";
// CHAIN - workflows
const pipeline = traceChain(
async (query: string) => {
const docs = await retrieve(query);
return await generate(docs, query);
},
{ name: "rag-pipeline" }
);
// AGENT - reasoning
const agent = traceAgent(
async (question: string) => {
const thought = await llm.generate(`Think: ${question}`);
return await processThought(thought);
},
{ name: "my-agent" }
);
// TOOL - function calls
const getWeather = traceTool(
async (city: string) => fetch(`/api/weather/${city}`).then(r => r.json()),
{ name: "get-weather" }
);
```
## withSpan for Other Kinds
```typescript
import { withSpan, getInputAttributes, getRetrieverAttributes } from "@arizeai/openinference-core";
// RETRIEVER with custom attributes
const retrieve = withSpan(
async (query: string) => {
const results = await vectorDb.search(query, { topK: 5 });
return results.map(doc => ({ content: doc.text, score: doc.score }));
},
{
kind: "RETRIEVER",
name: "vector-search",
processInput: (query) => getInputAttributes(query),
processOutput: (docs) => getRetrieverAttributes({ documents: docs })
}
);
```
**Options:**
```typescript
withSpan(fn, {
kind: "RETRIEVER", // OpenInference span kind
name: "span-name", // Span name (defaults to function name)
processInput: (args) => {}, // Transform input to attributes
processOutput: (result) => {}, // Transform output to attributes
attributes: { key: "value" } // Static attributes
});
```
## Capturing Input/Output
**Always capture I/O for evaluation-ready spans.** Use `getInputAttributes` and `getOutputAttributes` helpers for automatic MIME type detection:
```typescript
import {
getInputAttributes,
getOutputAttributes,
withSpan,
} from "@arizeai/openinference-core";
const handleQuery = withSpan(
async (userInput: string) => {
const result = await agent.generate({ prompt: userInput });
return result;
},
{
name: "query.handler",
kind: "CHAIN",
// Use helpers - automatic MIME type detection
processInput: (input) => getInputAttributes(input),
processOutput: (result) => getOutputAttributes(result.text),
}
);
await handleQuery("What is 2+2?");
```
**What gets captured:**
```json
{
"input.value": "What is 2+2?",
"input.mime_type": "text/plain",
"output.value": "2+2 equals 4.",
"output.mime_type": "text/plain"
}
```
**Helper behavior:**
- Strings → `text/plain`
- Objects/Arrays → `application/json` (automatically serialized)
- `undefined`/`null` → No attributes set
**Why this matters:**
- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets
### Custom I/O Processing
Add custom metadata alongside standard I/O attributes:
```typescript
const processWithMetadata = withSpan(
async (query: string) => {
const result = await llm.generate(query);
return result;
},
{
name: "query.process",
kind: "CHAIN",
processInput: (query) => ({
"input.value": query,
"input.mime_type": "text/plain",
"input.length": query.length, // Custom attribute
}),
processOutput: (result) => ({
"output.value": result.text,
"output.mime_type": "text/plain",
"output.tokens": result.usage?.totalTokens, // Custom attribute
}),
}
);
```
## See Also
- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, etc.
- **Attribute helpers:** https://docs.arize.com/phoenix/tracing/manual-instrumentation-typescript#attribute-helpers
- **Auto-instrumentation:** `instrumentation-auto-typescript.md` for framework integrations

# Phoenix Tracing: Custom Metadata (Python)
Add custom attributes to spans for richer observability.
## Install
```bash
pip install openinference-instrumentation
```
## Session
```python
from openinference.instrumentation import using_session
with using_session(session_id="my-session-id"):
# Spans get: "session.id" = "my-session-id"
...
```
## User
```python
from openinference.instrumentation import using_user
with using_user("my-user-id"):
# Spans get: "user.id" = "my-user-id"
...
```
## Metadata
```python
from openinference.instrumentation import using_metadata
with using_metadata({"key": "value", "experiment_id": "exp_123"}):
# Spans get: "metadata" = '{"key": "value", "experiment_id": "exp_123"}'
...
```
## Tags
```python
from openinference.instrumentation import using_tags
with using_tags(["tag_1", "tag_2"]):
# Spans get: "tag.tags" = '["tag_1", "tag_2"]'
...
```
## Combined (using_attributes)
```python
from openinference.instrumentation import using_attributes
with using_attributes(
session_id="my-session-id",
user_id="my-user-id",
metadata={"environment": "production"},
tags=["prod", "v2"],
prompt_template="Answer: {question}",
prompt_template_version="v1.0",
prompt_template_variables={"question": "What is Phoenix?"},
):
# All attributes applied to spans in this context
...
```
## On a Single Span
```python
import json

span.set_attribute("metadata", json.dumps({"key": "value"}))
span.set_attribute("user.id", "user_123")
span.set_attribute("session.id", "session_456")
```
## As Decorators
All context managers can be used as decorators:
```python
@using_session(session_id="my-session-id")
@using_user("my-user-id")
@using_metadata({"env": "prod"})
def my_function():
...
```

# Phoenix Tracing: Custom Metadata (TypeScript)
Add custom attributes to spans for richer observability.
## Using Context (Propagates to All Child Spans)
```typescript
import { context } from "@arizeai/phoenix-otel";
import { setMetadata } from "@arizeai/openinference-core";
context.with(
setMetadata(context.active(), {
experiment_id: "exp_123",
model_version: "gpt-4-1106-preview",
environment: "production",
}),
async () => {
// All spans created within this block will have:
// "metadata" = '{"experiment_id": "exp_123", ...}'
await myApp.run(query);
}
);
```
## On a Single Span
```typescript
import { traceChain } from "@arizeai/openinference-core";
import { trace } from "@arizeai/phoenix-otel";
const myFunction = traceChain(
async (input: string) => {
const span = trace.getActiveSpan();
span?.setAttribute(
"metadata",
JSON.stringify({
experiment_id: "exp_123",
model_version: "gpt-4-1106-preview",
environment: "production",
})
);
    const result = await doWork(input); // `doWork` is a placeholder for your app logic
    return result;
},
{ name: "my-function" }
);
await myFunction("hello");
```

# Phoenix Tracing: Production Guide (Python)
**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**
## Metadata
| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |
## Batch Processing
**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually.
## Data Masking (PII Protection)
**Environment variables:**
```bash
export OPENINFERENCE_HIDE_INPUTS=true # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
```
**Python TraceConfig:**
```python
from phoenix.otel import register
from openinference.instrumentation import TraceConfig
config = TraceConfig(
hide_inputs=True,
hide_outputs=True,
hide_input_messages=True
)
register(trace_config=config)
```
**Precedence:** Code > Environment variables > Defaults
---
## Span Filtering
**Suppress specific code blocks:**
```python
from phoenix.otel import suppress_tracing
with suppress_tracing():
internal_logging() # No spans generated
```

# Phoenix Tracing: Production Guide (TypeScript)
**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**
## Metadata
| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |
## Batch Processing
**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually.
```typescript
import { register } from "@arizeai/phoenix-otel";
const provider = register({
projectName: "my-app",
batch: true, // Production default
});
```
### Shutdown Handling
**CRITICAL:** Spans may not be exported if still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.
```typescript
// Explicit shutdown to flush queued spans
const provider = register({
projectName: "my-app",
batch: true,
});
async function main() {
await doWork();
await provider.shutdown(); // Flush spans before exit
}
main().catch(async (error) => {
console.error(error);
await provider.shutdown(); // Flush on error too
process.exit(1);
});
```
**Graceful termination signals:**
```typescript
// Graceful shutdown on SIGTERM
const provider = register({
projectName: "my-server",
batch: true,
});
process.on("SIGTERM", async () => {
await provider.shutdown();
process.exit(0);
});
```
---
## Data Masking (PII Protection)
**Environment variables:**
```bash
export OPENINFERENCE_HIDE_INPUTS=true # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
```
**TypeScript TraceConfig:**
```typescript
import { register } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
const traceConfig = {
hideInputs: true,
hideOutputs: true,
hideInputMessages: true
};
const instrumentation = new OpenAIInstrumentation({ traceConfig });
```
**Precedence:** Code > Environment variables > Defaults
---
## Span Filtering
**Suppress specific code blocks:**
```typescript
import { suppressTracing } from "@opentelemetry/core";
import { context } from "@opentelemetry/api";
await context.with(suppressTracing(context.active()), async () => {
internalLogging(); // No spans generated
});
```
**Sampling:**
```bash
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1" # Sample 10%
```
---
## Error Handling
```typescript
import { SpanStatusCode, trace } from "@opentelemetry/api";
const span = trace.getActiveSpan();
try {
  const result = await riskyOperation();
span?.setStatus({ code: SpanStatusCode.OK });
} catch (e) {
span?.recordException(e);
span?.setStatus({ code: SpanStatusCode.ERROR });
throw e;
}
```
---
## Production Checklist
- [ ] Batch processing enabled
- [ ] **Shutdown handling:** Call `provider.shutdown()` before exit to flush queued spans
- [ ] **Graceful termination:** Flush spans on SIGTERM/SIGINT signals
- [ ] Data masking configured (`HIDE_INPUTS`/`HIDE_OUTPUTS` if PII)
- [ ] Span filtering for health checks/noisy paths
- [ ] Error handling implemented
- [ ] Graceful degradation if Phoenix unavailable
- [ ] Performance tested
- [ ] Monitoring configured (Phoenix UI checked)

# Phoenix Tracing: Projects (Python)
**Organize traces by application using projects (Phoenix's top-level grouping).**
## Overview
Projects group traces for a single application or experiment.
**Use for:** Environments (dev/staging/prod), A/B testing, versioning
## Setup
### Environment Variable (Recommended)
```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```
```python
import os
os.environ["PHOENIX_PROJECT_NAME"] = "my-app-prod"
from phoenix.otel import register
register() # Uses "my-app-prod"
```
### Code
```python
from phoenix.otel import register
register(project_name="my-app-prod")
```
## Use Cases
**Environments:**
```python
# Dev, staging, prod
register(project_name="my-app-dev")
register(project_name="my-app-staging")
register(project_name="my-app-prod")
```
**A/B Testing:**
```python
# Compare models
register(project_name="chatbot-gpt4")
register(project_name="chatbot-claude")
```
**Versioning:**
```python
# Track versions
register(project_name="my-app-v1")
register(project_name="my-app-v2")
```
## Switching Projects (Python Notebooks Only)
```python
from openinference.instrumentation import dangerously_using_project
from phoenix.otel import register
register(project_name="my-app")
# Switch temporarily for evals
with dangerously_using_project("my-eval-project"):
run_evaluations()
```
**⚠️ Only use in notebooks/scripts, not production.**

# Phoenix Tracing: Projects (TypeScript)
**Organize traces by application using projects (Phoenix's top-level grouping).**
## Overview
Projects group traces for a single application or experiment.
**Use for:** Environments (dev/staging/prod), A/B testing, versioning
## Setup
### Environment Variable (Recommended)
```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```
```typescript
process.env.PHOENIX_PROJECT_NAME = "my-app-prod";
import { register } from "@arizeai/phoenix-otel";
register(); // Uses "my-app-prod"
```
### Code
```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app-prod" });
```
## Use Cases
**Environments:**
```typescript
// Dev, staging, prod
register({ projectName: "my-app-dev" });
register({ projectName: "my-app-staging" });
register({ projectName: "my-app-prod" });
```
**A/B Testing:**
```typescript
// Compare models
register({ projectName: "chatbot-gpt4" });
register({ projectName: "chatbot-claude" });
```
**Versioning:**
```typescript
// Track versions
register({ projectName: "my-app-v1" });
register({ projectName: "my-app-v2" });
```

# Sessions (Python)
Track multi-turn conversations by grouping traces with session IDs.
## Setup
```python
from openinference.instrumentation import using_session
with using_session(session_id="user_123_conv_456"):
response = llm.invoke(prompt)
```
## Best Practices
**Bad: Only parent span gets session ID**
```python
from openinference.semconv.trace import SpanAttributes
from opentelemetry import trace
span = trace.get_current_span()
span.set_attribute(SpanAttributes.SESSION_ID, session_id)
response = client.chat.completions.create(...)
```
**Good: All child spans inherit session ID**
```python
with using_session(session_id):
response = client.chat.completions.create(...)
result = my_custom_function()
```
**Why:** `using_session()` propagates session ID to all nested spans automatically.
## Session ID Patterns
```python
import uuid
session_id = str(uuid.uuid4())
session_id = f"user_{user_id}_conv_{conversation_id}"
session_id = f"debug_{timestamp}"
```
- Good: `str(uuid.uuid4())`, `"user_123_conv_456"`
- Bad: `"session_1"`, `"test"`, empty string
## Multi-Turn Chatbot Example
```python
import uuid
from openinference.instrumentation import using_session
session_id = str(uuid.uuid4())
messages = []
def send_message(user_input: str) -> str:
messages.append({"role": "user", "content": user_input})
with using_session(session_id):
response = client.chat.completions.create(
model="gpt-4",
messages=messages
)
assistant_message = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_message})
return assistant_message
```
## Additional Attributes
```python
from openinference.instrumentation import using_attributes
with using_attributes(
user_id="user_123",
session_id="conv_456",
metadata={"tier": "premium", "region": "us-west"}
):
response = llm.invoke(prompt)
```
## LangChain Integration
LangChain threads are automatically recognized as sessions:
```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI()
response = llm.invoke(
[HumanMessage(content="Hi!")],
config={"metadata": {"thread_id": "user_123_thread"}}
)
```
Phoenix recognizes: `thread_id`, `session_id`, `conversation_id`
## See Also
- **TypeScript sessions:** `sessions-typescript.md`
- **Session docs:** https://docs.arize.com/phoenix/tracing/sessions

# Sessions (TypeScript)
Track multi-turn conversations by grouping traces with session IDs. **Use `withSpan` directly from `@arizeai/openinference-core`** - no wrappers or custom utilities needed.
## Core Concept
**Session Pattern:**
1. Generate a unique `session.id` once at application startup
2. Export SESSION_ID, import `withSpan` where needed
3. Use `withSpan` to create a parent CHAIN span with `session.id` for each interaction
4. All child spans (LLM, TOOL, AGENT, etc.) automatically group under the parent
5. Query traces by `session.id` in Phoenix to see all interactions
## Implementation (Best Practice)
### 1. Setup (instrumentation.ts)
```typescript
import { register } from "@arizeai/phoenix-otel";
import { randomUUID } from "node:crypto";
// Initialize Phoenix
register({
projectName: "your-app",
url: process.env.PHOENIX_COLLECTOR_ENDPOINT || "http://localhost:6006",
apiKey: process.env.PHOENIX_API_KEY,
batch: true,
});
// Generate and export session ID
export const SESSION_ID = randomUUID();
```
### 2. Usage (app code)
```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";
// Use withSpan directly - no wrapper needed
const handleInteraction = withSpan(
async () => {
const result = await agent.generate({ prompt: userInput });
return result;
},
{
name: "cli.interaction",
kind: "CHAIN",
attributes: { "session.id": SESSION_ID },
}
);
// Call it
const result = await handleInteraction();
```
### With Input Parameters
```typescript
const processQuery = withSpan(
async (query: string) => {
return await agent.generate({ prompt: query });
},
{
name: "process.query",
kind: "CHAIN",
attributes: { "session.id": SESSION_ID },
}
);
await processQuery("What is 2+2?");
```
## Key Points
### Session ID Scope
- **CLI/Desktop Apps**: Generate once at process startup
- **Web Servers**: Generate per-user session (e.g., on login, store in session storage)
- **Stateless APIs**: Accept session.id as a parameter from client
### Span Hierarchy
```
cli.interaction (CHAIN) ← session.id here
├── ai.generateText (AGENT)
│ ├── ai.generateText.doGenerate (LLM)
│ └── ai.toolCall (TOOL)
└── ai.generateText.doGenerate (LLM)
```
The `session.id` is only set on the **root span**. Child spans are automatically grouped by the trace hierarchy.
### Querying Sessions
```bash
# Get all traces for a session
npx @arizeai/phoenix-cli traces \
--endpoint http://localhost:6006 \
--project your-app \
--format raw \
--no-progress | \
jq '.[] | select(.spans[0].attributes["session.id"] == "YOUR-SESSION-ID")'
```
## Dependencies
```json
{
"dependencies": {
"@arizeai/openinference-core": "^2.0.5",
"@arizeai/phoenix-otel": "^0.4.1"
}
}
```
**Note:** `@opentelemetry/api` is not needed for this pattern - it's only required for manual span management and the Context API alternative below.
## Why This Pattern?
1. **Simple**: Just export SESSION_ID, use withSpan directly - no wrappers
2. **Built-in**: `withSpan` from `@arizeai/openinference-core` handles everything
3. **Type-safe**: Preserves function signatures and type information
4. **Automatic lifecycle**: Handles span creation, error tracking, and cleanup
5. **Framework-agnostic**: Works with any LLM framework (AI SDK, LangChain, etc.)
6. **No extra deps**: Don't need `@opentelemetry/api` or custom utilities
## Adding More Attributes
```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";
const handleWithContext = withSpan(
async (userInput: string) => {
return await agent.generate({ prompt: userInput });
},
{
name: "cli.interaction",
kind: "CHAIN",
attributes: {
"session.id": SESSION_ID,
"user.id": userId, // Track user
"metadata.environment": "prod", // Custom metadata
},
}
);
```
## Anti-Pattern: Don't Create Wrappers
**Don't do this:**
```typescript
// Unnecessary wrapper
export function withSessionTracking(fn) {
return withSpan(fn, { attributes: { "session.id": SESSION_ID } });
}
```
**Do this instead:**
```typescript
// Use withSpan directly
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";
const handler = withSpan(fn, {
attributes: { "session.id": SESSION_ID }
});
```
## Alternative: Context API Pattern
For web servers or complex async flows where you need to propagate session IDs through middleware, you can use the Context API:
```typescript
import { context } from "@opentelemetry/api";
import { setSession } from "@arizeai/openinference-core";
await context.with(
setSession(context.active(), { sessionId: "user_123_conv_456" }),
async () => {
const response = await llm.invoke(prompt);
}
);
```
**Use Context API when:**
- Building web servers with middleware chains
- Session ID needs to flow through many async boundaries
- You don't control the call stack (e.g., framework-provided handlers)
**Use withSpan when:**
- Building CLI apps or scripts
- You control the function call points
- Simpler, more explicit code is preferred
## Related
- `fundamentals-universal-attributes.md` - Other universal attributes (user.id, metadata)
- `span-chain.md` - CHAIN span specification
- `sessions-python.md` - Python session tracking patterns

# Phoenix Tracing: Python Setup
**Setup Phoenix tracing in Python with `arize-phoenix-otel`.**
## Metadata
| Attribute | Value |
| ---------- | ----------------------------------- |
| Priority | Critical - required for all tracing |
| Setup Time | <5 min |
## Quick Start (2 lines)
```python
from phoenix.otel import register
register(project_name="my-app", auto_instrument=True)
```
**Connects to `http://localhost:6006`, auto-instruments all supported libraries.**
## Installation
```bash
pip install arize-phoenix-otel
```
**Supported:** Python 3.10-3.13
## Configuration
### Environment Variables (Recommended)
```bash
export PHOENIX_API_KEY="your-api-key" # Required for Phoenix Cloud
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006" # Or Cloud URL
export PHOENIX_PROJECT_NAME="my-app" # Optional
```
### Python Code
```python
from phoenix.otel import register
tracer_provider = register(
project_name="my-app", # Project name
endpoint="http://localhost:6006", # Phoenix endpoint
auto_instrument=True, # Auto-instrument supported libs
batch=True, # Batch processing (default: True)
)
```
**Parameters:**
- `project_name`: Project name (overrides `PHOENIX_PROJECT_NAME`)
- `endpoint`: Phoenix URL (overrides `PHOENIX_COLLECTOR_ENDPOINT`)
- `auto_instrument`: Enable auto-instrumentation (default: False)
- `batch`: Use BatchSpanProcessor (default: True, production-recommended)
- `protocol`: `"http/protobuf"` (default) or `"grpc"`
## Auto-Instrumentation
Install instrumentors for your frameworks:
```bash
pip install openinference-instrumentation-openai # OpenAI SDK
pip install openinference-instrumentation-langchain # LangChain
pip install openinference-instrumentation-llama-index # LlamaIndex
# ... install others as needed
```
Then enable auto-instrumentation:
```python
register(project_name="my-app", auto_instrument=True)
```
Phoenix discovers and instruments all installed OpenInference packages automatically.
## Batch Processing (Production)
Enabled by default. Configure via environment variables:
```bash
export OTEL_BSP_SCHEDULE_DELAY=5000 # Batch every 5s
export OTEL_BSP_MAX_QUEUE_SIZE=2048 # Queue 2048 spans
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512 # Send 512 spans/batch
```
**Link:** https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
## Verification
1. Open Phoenix UI: `http://localhost:6006`
2. Navigate to your project
3. Run your application
4. Check for traces (appear within batch delay)
## Troubleshooting
**No traces:**
- Verify `PHOENIX_COLLECTOR_ENDPOINT` matches Phoenix server
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- Confirm instrumentors installed
**Missing attributes:**
- Check span kind (see rules/ directory)
- Verify attribute names (see rules/ directory)
## Example
```python
from phoenix.otel import register
from openai import OpenAI
# Enable tracing with auto-instrumentation
register(project_name="my-chatbot", auto_instrument=True)
# OpenAI automatically instrumented
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello!"}]
)
```
## API Reference
- [Python OTEL API Docs](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/)
- [Python Client API Docs](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)

# TypeScript Setup
Setup Phoenix tracing in TypeScript/JavaScript with `@arizeai/phoenix-otel`.
## Metadata
| Attribute | Value |
|-----------|-------|
| Priority | Critical - required for all tracing |
| Setup Time | <5 min |
## Quick Start
```bash
npm install @arizeai/phoenix-otel
```
```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app" });
```
Connects to `http://localhost:6006` by default.
## Configuration
```typescript
import { register } from "@arizeai/phoenix-otel";
register({
projectName: "my-app",
url: "http://localhost:6006",
apiKey: process.env.PHOENIX_API_KEY,
batch: true
});
```
**Environment variables:**
```bash
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
export PHOENIX_PROJECT_NAME="my-app"
```
## ESM vs CommonJS
**CommonJS (automatic):**
```javascript
const { register } = require("@arizeai/phoenix-otel");
register({ projectName: "my-app" });
const OpenAI = require("openai");
```
**ESM (manual instrumentation required):**
```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";
register({ projectName: "my-app" });
const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```
**Why:** ESM imports are hoisted and evaluated before `register()` runs, so the SDK cannot patch the module at import time; `manuallyInstrument()` patches the imported module explicitly.
## Framework Integration
**Next.js (App Router):**
```typescript
// instrumentation.ts
export async function register() {
if (process.env.NEXT_RUNTIME === "nodejs") {
const { register } = await import("@arizeai/phoenix-otel");
register({ projectName: "my-nextjs-app" });
}
}
```
**Express.js:**
```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-express-app" });
const app = express();
```
## Flushing Spans Before Exit
**CRITICAL:** Spans may not be exported if still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.
**Standard pattern:**
```typescript
const provider = register({
projectName: "my-app",
batch: true,
});
async function main() {
await doWork();
await provider.shutdown(); // Flush spans before exit
}
main().catch(async (error) => {
console.error(error);
await provider.shutdown(); // Flush on error too
process.exit(1);
});
```
**Alternative:**
```typescript
// Use batch: false for immediate export (no shutdown needed)
register({
projectName: "my-app",
batch: false,
});
```
For production patterns including graceful termination, see `production-typescript.md`.
## Verification
1. Open Phoenix UI: `http://localhost:6006`
2. Run your application
3. Check for traces in your project
**Enable diagnostic logging:**
```typescript
import { DiagLogLevel, register } from "@arizeai/phoenix-otel";
register({
projectName: "my-app",
diagLogLevel: DiagLogLevel.DEBUG,
});
```
## Troubleshooting
**No traces:**
- Verify `PHOENIX_COLLECTOR_ENDPOINT` is correct
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- For ESM: Ensure `manuallyInstrument()` is called
- **With `batch: true`:** Call `await provider.shutdown()` before exit to flush queued spans (see Flushing Spans Before Exit), or set `batch: false` for immediate export
**Missing attributes:**
- Check instrumentation is registered (ESM requires manual setup)
- See `instrumentation-auto-typescript.md`
## See Also
- **Auto-instrumentation:** `instrumentation-auto-typescript.md`
- **Manual instrumentation:** `instrumentation-manual-typescript.md`
- **API docs:** https://arize-ai.github.io/phoenix/

# AGENT Spans
AGENT spans represent autonomous reasoning blocks (ReAct agents, planning loops, multi-step decision making).
**Required:** `openinference.span.kind` = "AGENT"
## Example
```json
{
"openinference.span.kind": "AGENT",
"input.value": "Book a flight to New York for next Monday",
"output.value": "I've booked flight AA123 departing Monday at 9:00 AM"
}
```

# CHAIN Spans
## Purpose
CHAIN spans represent orchestration layers in your application (LangChain chains, custom workflows, application entry points). Often used as root spans.
## Required Attributes
| Attribute | Type | Description | Required |
| ------------------------- | ------ | --------------- | -------- |
| `openinference.span.kind` | String | Must be "CHAIN" | Yes |
## Common Attributes
CHAIN spans typically use [Universal Attributes](fundamentals-universal-attributes.md):
- `input.value` - Input to the chain (user query, request payload)
- `output.value` - Output from the chain (final response)
- `input.mime_type` / `output.mime_type` - Format indicators
## Example: Root Chain
```json
{
"openinference.span.kind": "CHAIN",
"input.value": "{\"question\": \"What is the capital of France?\"}",
"input.mime_type": "application/json",
"output.value": "{\"answer\": \"The capital of France is Paris.\", \"sources\": [\"doc_123\"]}",
"output.mime_type": "application/json",
"session.id": "session_abc123",
"user.id": "user_xyz789"
}
```
## Example: Nested Sub-Chain
```json
{
"openinference.span.kind": "CHAIN",
"input.value": "Summarize this document: ...",
"output.value": "This document discusses..."
}
```

View File

@@ -0,0 +1,91 @@
# EMBEDDING Spans
## Purpose
EMBEDDING spans represent vector generation operations (text-to-vector conversion for semantic search).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "EMBEDDING" | Yes |
| `embedding.model_name` | String | Embedding model identifier | Recommended |
## Attribute Reference
### Single Embedding
| Attribute | Type | Description |
|-----------|------|-------------|
| `embedding.model_name` | String | Embedding model identifier |
| `embedding.text` | String | Input text to embed |
| `embedding.vector` | String (JSON array) | Generated embedding vector |
**Example:**
```json
{
"embedding.model_name": "text-embedding-ada-002",
"embedding.text": "What is machine learning?",
"embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```
### Batch Embeddings
| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `embedding.embeddings.{i}.embedding.text` | String | Text at index i |
| `embedding.embeddings.{i}.embedding.vector` | String (JSON array) | Vector at index i |
**Example:**
```json
{
"embedding.model_name": "text-embedding-ada-002",
"embedding.embeddings.0.embedding.text": "First document",
"embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3, ..., 0.5]",
"embedding.embeddings.1.embedding.text": "Second document",
"embedding.embeddings.1.embedding.vector": "[0.6, 0.7, 0.8, ..., 0.9]"
}
```
### Vector Format
Vectors stored as JSON array strings:
- Dimensions: Typically 384, 768, 1536, or 3072
- Format: `"[0.123, -0.456, 0.789, ...]"`
- Precision: Usually 3-6 decimal places
**Storage Considerations:**
- Large vectors can significantly increase trace size
- Consider omitting vectors in production (keep `embedding.text` for debugging)
- Use separate vector database for actual similarity search
## Examples
### Single Embedding
```json
{
"openinference.span.kind": "EMBEDDING",
"embedding.model_name": "text-embedding-ada-002",
"embedding.text": "What is machine learning?",
"embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]",
"input.value": "What is machine learning?",
"output.value": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```
### Batch Embeddings
```json
{
"openinference.span.kind": "EMBEDDING",
"embedding.model_name": "text-embedding-ada-002",
"embedding.embeddings.0.embedding.text": "First document",
"embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3]",
"embedding.embeddings.1.embedding.text": "Second document",
"embedding.embeddings.1.embedding.vector": "[0.4, 0.5, 0.6]",
"embedding.embeddings.2.embedding.text": "Third document",
"embedding.embeddings.2.embedding.vector": "[0.7, 0.8, 0.9]"
}
```

View File

@@ -0,0 +1,51 @@
# EVALUATOR Spans
## Purpose
EVALUATOR spans represent quality assessment operations (answer relevance, faithfulness, hallucination detection).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "EVALUATOR" | Yes |
## Common Attributes
| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Content being evaluated |
| `output.value` | String | Evaluation result (score, label, explanation) |
| `metadata.evaluator_name` | String | Evaluator identifier |
| `metadata.score` | Float | Numeric score (0-1) |
| `metadata.label` | String | Categorical label (relevant/irrelevant) |
## Example: Answer Relevance
```json
{
"openinference.span.kind": "EVALUATOR",
"input.value": "{\"question\": \"What is the capital of France?\", \"answer\": \"The capital of France is Paris.\"}",
"input.mime_type": "application/json",
"output.value": "0.95",
"metadata.evaluator_name": "answer_relevance",
"metadata.score": 0.95,
"metadata.label": "relevant",
"metadata.explanation": "Answer directly addresses the question with correct information"
}
```
## Example: Faithfulness Check
```json
{
"openinference.span.kind": "EVALUATOR",
"input.value": "{\"context\": \"Paris is in France.\", \"answer\": \"Paris is the capital of France.\"}",
"input.mime_type": "application/json",
"output.value": "0.5",
"metadata.evaluator_name": "faithfulness",
"metadata.score": 0.5,
"metadata.label": "partially_faithful",
"metadata.explanation": "Answer makes unsupported claim about Paris being the capital"
}
```

View File

@@ -0,0 +1,49 @@
# GUARDRAIL Spans
## Purpose
GUARDRAIL spans represent safety and policy checks (content moderation, PII detection, toxicity scoring).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "GUARDRAIL" | Yes |
## Common Attributes
| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Content being checked |
| `output.value` | String | Guardrail result (allowed/blocked/flagged) |
| `metadata.guardrail_type` | String | Type of check (toxicity, pii, bias) |
| `metadata.score` | Float | Safety score (0-1) |
| `metadata.threshold` | Float | Threshold for blocking |
## Example: Content Moderation
```json
{
"openinference.span.kind": "GUARDRAIL",
"input.value": "User message: I want to build a bomb",
"output.value": "BLOCKED",
"metadata.guardrail_type": "content_moderation",
"metadata.score": 0.95,
"metadata.threshold": 0.7,
"metadata.categories": "[\"violence\", \"weapons\"]",
"metadata.action": "block_and_log"
}
```
## Example: PII Detection
```json
{
"openinference.span.kind": "GUARDRAIL",
"input.value": "My SSN is 123-45-6789",
"output.value": "FLAGGED",
"metadata.guardrail_type": "pii_detection",
"metadata.detected_pii": "[\"ssn\"]",
"metadata.redacted_output": "My SSN is [REDACTED]"
}
```

View File

@@ -0,0 +1,79 @@
# LLM Spans
LLM spans represent calls to language models (OpenAI, Anthropic, local models, etc.).
## Required Attributes
| Attribute | Type | Description |
|-----------|------|-------------|
| `openinference.span.kind` | String | Must be "LLM" |
| `llm.model_name` | String | Model identifier (e.g., "gpt-4", "claude-3-5-sonnet-20241022") |
## Key Attributes
| Category | Attributes | Example |
|----------|------------|---------|
| **Model** | `llm.model_name`, `llm.provider` | "gpt-4-turbo", "openai" |
| **Tokens** | `llm.token_count.prompt`, `llm.token_count.completion`, `llm.token_count.total` | 25, 8, 33 |
| **Cost** | `llm.cost.prompt`, `llm.cost.completion`, `llm.cost.total` | 0.0021, 0.0045, 0.0066 |
| **Parameters** | `llm.invocation_parameters` (JSON) | `{"temperature": 0.7, "max_tokens": 1024}` |
| **Messages** | `llm.input_messages.{i}.*`, `llm.output_messages.{i}.*` | See examples below |
| **Tools** | `llm.tools.{i}.tool.json_schema` | Function definitions |
## Cost Tracking
**Core attributes:**
- `llm.cost.prompt` - Total input cost (USD)
- `llm.cost.completion` - Total output cost (USD)
- `llm.cost.total` - Total cost (USD)
**Detailed cost breakdown:**
- `llm.cost.prompt_details.{input,cache_read,cache_write,audio}` - Input cost components
- `llm.cost.completion_details.{output,reasoning,audio}` - Output cost components
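When the instrumentation library does not report costs, they can be derived from token counts and per-token pricing. A sketch — the rates below are placeholders, not real prices; actual pricing varies by model, provider, and date:

```python
# Illustrative per-million-token rates in USD; NOT real prices.
RATES = {"gpt-4-turbo": {"prompt": 10.00, "completion": 30.00}}

def cost_attributes(model: str, prompt_tokens: int, completion_tokens: int) -> dict:
    """Derive llm.cost.* attributes from token counts and a rate table."""
    r = RATES[model]
    prompt_cost = prompt_tokens * r["prompt"] / 1_000_000
    completion_cost = completion_tokens * r["completion"] / 1_000_000
    return {
        "llm.cost.prompt": prompt_cost,
        "llm.cost.completion": completion_cost,
        "llm.cost.total": prompt_cost + completion_cost,
    }

attrs = cost_attributes("gpt-4-turbo", 25, 8)
```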
## Messages
**Input messages:**
- `llm.input_messages.{i}.message.role` - "user", "assistant", "system", "tool"
- `llm.input_messages.{i}.message.content` - Text content
- `llm.input_messages.{i}.message.contents.{j}` - Multimodal (text + images)
- `llm.input_messages.{i}.message.tool_calls` - Tool invocations
**Output messages:** Same structure as input messages.
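The per-message keys can be produced with a small flattener; `flatten_messages` below is our own sketch, not a Phoenix API:

```python
def flatten_messages(direction: str, messages: list) -> dict:
    """Flatten chat messages into `llm.{direction}_messages.{i}.message.*` keys."""
    attrs = {}
    for i, msg in enumerate(messages):
        prefix = f"llm.{direction}_messages.{i}.message"
        attrs[f"{prefix}.role"] = msg["role"]
        if "content" in msg:
            attrs[f"{prefix}.content"] = msg["content"]
    return attrs

attrs = {"openinference.span.kind": "LLM", "llm.model_name": "gpt-4-turbo"}
attrs.update(flatten_messages("input", [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]))
```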
## Example: Basic LLM Call
```json
{
"openinference.span.kind": "LLM",
"llm.model_name": "claude-3-5-sonnet-20241022",
"llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1024}",
"llm.input_messages.0.message.role": "system",
"llm.input_messages.0.message.content": "You are a helpful assistant.",
"llm.input_messages.1.message.role": "user",
"llm.input_messages.1.message.content": "What is the capital of France?",
"llm.output_messages.0.message.role": "assistant",
"llm.output_messages.0.message.content": "The capital of France is Paris.",
"llm.token_count.prompt": 25,
"llm.token_count.completion": 8,
"llm.token_count.total": 33
}
```
## Example: LLM with Tool Calls
```json
{
"openinference.span.kind": "LLM",
"llm.model_name": "gpt-4-turbo",
"llm.input_messages.0.message.content": "What's the weather in SF?",
"llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_weather",
"llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": "{\"location\": \"San Francisco\"}",
"llm.tools.0.tool.json_schema": "{\"type\": \"function\", \"function\": {\"name\": \"get_weather\"}}"
}
```
## See Also
- **Instrumentation:** `instrumentation-auto-python.md`, `instrumentation-manual-python.md`
- **Full spec:** https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md

View File

@@ -0,0 +1,86 @@
# RERANKER Spans
## Purpose
RERANKER spans represent reordering of retrieved documents (Cohere Rerank, cross-encoder models).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "RERANKER" | Yes |
## Attribute Reference
### Reranker Parameters
| Attribute | Type | Description |
|-----------|------|-------------|
| `reranker.model_name` | String | Reranker model identifier |
| `reranker.query` | String | Query used for reranking |
| `reranker.top_k` | Integer | Number of documents to return |
### Input Documents
| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `reranker.input_documents.{i}.document.id` | String | Input document ID |
| `reranker.input_documents.{i}.document.content` | String | Input document content |
| `reranker.input_documents.{i}.document.score` | Float | Original retrieval score |
| `reranker.input_documents.{i}.document.metadata` | String (JSON) | Document metadata |
### Output Documents
| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `reranker.output_documents.{i}.document.id` | String | Output document ID (reordered) |
| `reranker.output_documents.{i}.document.content` | String | Output document content |
| `reranker.output_documents.{i}.document.score` | Float | New reranker score |
| `reranker.output_documents.{i}.document.metadata` | String (JSON) | Document metadata |
### Score Comparison
Input scores (from retriever) vs. output scores (from reranker):
```json
{
"reranker.input_documents.0.document.id": "doc_A",
"reranker.input_documents.0.document.score": 0.7,
"reranker.input_documents.1.document.id": "doc_B",
"reranker.input_documents.1.document.score": 0.9,
"reranker.output_documents.0.document.id": "doc_B",
"reranker.output_documents.0.document.score": 0.95,
"reranker.output_documents.1.document.id": "doc_A",
"reranker.output_documents.1.document.score": 0.85
}
```
In this example:
- Input: doc_B (0.9) ranked higher than doc_A (0.7)
- Output: doc_B still highest but both scores increased
- Reranker confirmed retriever's ordering but refined scores
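Both document lists follow the same indexed flattening pattern, so one helper can emit them. A sketch (the function signature is ours; `before`/`after` are lists of `(doc_id, score)` pairs):

```python
def rerank_attributes(model: str, query: str, before: list, after: list) -> dict:
    """Flatten reranker input/output document lists into indexed keys."""
    attrs = {
        "openinference.span.kind": "RERANKER",
        "reranker.model_name": model,
        "reranker.query": query,
    }
    for side, docs in (("input", before), ("output", after)):
        for i, (doc_id, score) in enumerate(docs):
            prefix = f"reranker.{side}_documents.{i}.document"
            attrs[f"{prefix}.id"] = doc_id
            attrs[f"{prefix}.score"] = score
    return attrs

attrs = rerank_attributes(
    "cohere-rerank-v2",
    "What is machine learning?",
    [("doc_A", 0.7), ("doc_B", 0.9)],   # retriever order and scores
    [("doc_B", 0.95), ("doc_A", 0.85)], # reranked order and new scores
)
```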
## Examples
### Complete Reranking Example
```json
{
"openinference.span.kind": "RERANKER",
"reranker.model_name": "cohere-rerank-v2",
"reranker.query": "What is machine learning?",
"reranker.top_k": 2,
"reranker.input_documents.0.document.id": "doc_123",
"reranker.input_documents.0.document.content": "Machine learning is a subset...",
"reranker.input_documents.1.document.id": "doc_456",
"reranker.input_documents.1.document.content": "Supervised learning algorithms...",
"reranker.input_documents.2.document.id": "doc_789",
"reranker.input_documents.2.document.content": "Neural networks are...",
"reranker.output_documents.0.document.id": "doc_456",
"reranker.output_documents.0.document.content": "Supervised learning algorithms...",
"reranker.output_documents.0.document.score": 0.95,
"reranker.output_documents.1.document.id": "doc_123",
"reranker.output_documents.1.document.content": "Machine learning is a subset...",
"reranker.output_documents.1.document.score": 0.88
}
```

View File

@@ -0,0 +1,110 @@
# RETRIEVER Spans
## Purpose
RETRIEVER spans represent document/context retrieval operations (vector DB queries, semantic search, keyword search).
## Required Attributes
| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "RETRIEVER" | Yes |
## Attribute Reference
### Query
| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Search query text |
### Document Schema
| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `retrieval.documents.{i}.document.id` | String | Unique document identifier |
| `retrieval.documents.{i}.document.content` | String | Document text content |
| `retrieval.documents.{i}.document.score` | Float | Relevance score (0-1 or distance) |
| `retrieval.documents.{i}.document.metadata` | String (JSON) | Document metadata |
### Flattening Pattern for Documents
Documents are flattened using zero-indexed notation:
```
retrieval.documents.0.document.id
retrieval.documents.0.document.content
retrieval.documents.0.document.score
retrieval.documents.1.document.id
retrieval.documents.1.document.content
retrieval.documents.1.document.score
...
```
### Document Metadata
Common metadata fields (stored as JSON string):
```json
{
"source": "knowledge_base.pdf",
"page": 42,
"section": "Introduction",
"author": "Jane Doe",
"created_at": "2024-01-15",
"url": "https://example.com/doc",
"chunk_id": "chunk_123"
}
```
**Example with metadata:**
```json
{
"retrieval.documents.0.document.id": "doc_123",
"retrieval.documents.0.document.content": "Machine learning is a method of data analysis...",
"retrieval.documents.0.document.score": 0.92,
"retrieval.documents.0.document.metadata": "{\"source\": \"ml_textbook.pdf\", \"page\": 15, \"chapter\": \"Introduction\"}"
}
```
### Ordering
Documents are ordered by index (0, 1, 2, ...). Typically:
- Index 0 = highest scoring document
- Index 1 = second highest
- etc.
Preserve retrieval order in your flattened attributes.
### Large Document Handling
For very long documents:
- Consider truncating `document.content` to first N characters
- Store full content in separate document store
- Use `document.id` to reference full content
## Examples
### Basic Vector Search
```json
{
"openinference.span.kind": "RETRIEVER",
"input.value": "What is machine learning?",
"retrieval.documents.0.document.id": "doc_123",
"retrieval.documents.0.document.content": "Machine learning is a subset of artificial intelligence...",
"retrieval.documents.0.document.score": 0.92,
"retrieval.documents.0.document.metadata": "{\"source\": \"textbook.pdf\", \"page\": 42}",
"retrieval.documents.1.document.id": "doc_456",
"retrieval.documents.1.document.content": "Machine learning algorithms learn patterns from data...",
"retrieval.documents.1.document.score": 0.87,
"retrieval.documents.1.document.metadata": "{\"source\": \"article.html\", \"author\": \"Jane Doe\"}",
"retrieval.documents.2.document.id": "doc_789",
"retrieval.documents.2.document.content": "Supervised learning is a type of machine learning...",
"retrieval.documents.2.document.score": 0.81,
"retrieval.documents.2.document.metadata": "{\"source\": \"wiki.org\"}",
"metadata.retriever_type": "vector_search",
"metadata.vector_db": "pinecone",
"metadata.top_k": 3
}
```

View File

@@ -0,0 +1,67 @@
# TOOL Spans
## Purpose
TOOL spans represent external tool or function invocations (API calls, database queries, calculators, custom functions).
## Required Attributes
| Attribute | Type | Description | Required |
| ------------------------- | ------ | ------------------ | ----------- |
| `openinference.span.kind` | String | Must be "TOOL" | Yes |
| `tool.name` | String | Tool/function name | Recommended |
## Attribute Reference
### Tool Execution Attributes
| Attribute | Type | Description |
| ------------------ | ------------- | ------------------------------------------ |
| `tool.name` | String | Tool/function name |
| `tool.description` | String | Tool purpose/description |
| `tool.parameters` | String (JSON) | JSON schema defining the tool's parameters |
| `input.value` | String (JSON) | Actual input values passed to the tool |
| `output.value` | String | Tool output/result |
| `output.mime_type` | String | Result content type (e.g., "application/json") |
## Examples
### API Call Tool
```json
{
"openinference.span.kind": "TOOL",
"tool.name": "get_weather",
"tool.description": "Fetches current weather for a location",
"tool.parameters": "{\"type\": \"object\", \"properties\": {\"location\": {\"type\": \"string\"}, \"units\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]}}, \"required\": [\"location\"]}",
"input.value": "{\"location\": \"San Francisco\", \"units\": \"celsius\"}",
"output.value": "{\"temperature\": 18, \"conditions\": \"partly cloudy\"}"
}
```
### Calculator Tool
```json
{
"openinference.span.kind": "TOOL",
"tool.name": "calculator",
"tool.description": "Performs mathematical calculations",
"tool.parameters": "{\"type\": \"object\", \"properties\": {\"expression\": {\"type\": \"string\", \"description\": \"Math expression to evaluate\"}}, \"required\": [\"expression\"]}",
"input.value": "{\"expression\": \"2 + 2\"}",
"output.value": "4"
}
```
### Database Query Tool
```json
{
"openinference.span.kind": "TOOL",
"tool.name": "sql_query",
"tool.description": "Executes SQL query on user database",
"tool.parameters": "{\"type\": \"object\", \"properties\": {\"query\": {\"type\": \"string\", \"description\": \"SQL query to execute\"}}, \"required\": [\"query\"]}",
"input.value": "{\"query\": \"SELECT * FROM users WHERE id = 123\"}",
"output.value": "[{\"id\": 123, \"name\": \"Alice\", \"email\": \"alice@example.com\"}]",
"output.mime_type": "application/json"
}
```