Mirror of https://github.com/github/awesome-copilot.git, synced 2026-04-14 04:05:58 +00:00.
Add Arize and Phoenix LLM observability skills (#1204)
* Add 9 Arize LLM observability skills

  Add skills for Arize AI platform covering trace export, instrumentation, datasets, experiments, evaluators, AI provider integrations, annotations, prompt optimization, and deep linking to the Arize UI.

* Add 3 Phoenix AI observability skills

  Add skills for Phoenix (Arize open-source) covering CLI debugging, LLM evaluation workflows, and OpenInference tracing/instrumentation.

* Ignoring intentional bad spelling

* Fix CI: remove .DS_Store from generated skills README and add codespell ignore

  Remove .DS_Store artifact from winmd-api-search asset listing in generated README.skills.md so it matches the CI Linux build output. Add `queston` to codespell ignore list (intentional misspelling example in arize-dataset skill).

* Add arize-ax and phoenix plugins

  Bundle the 9 Arize skills into an arize-ax plugin and the 3 Phoenix skills into a phoenix plugin for easier installation as single packages.

* Fix skill folder structures to match source repos

  Move arize supporting files from references/ to root level and rename phoenix references/ to rules/ to exactly match the original source repository folder structures.

* Fixing file locations

* Fixing readme

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
skills/phoenix-tracing/references/annotations-overview.md (new file, 69 lines)
@@ -0,0 +1,69 @@
# Annotations Overview

Annotations allow you to add human or automated feedback to traces, spans, documents, and sessions. Annotations are essential for evaluation, quality assessment, and building training datasets.

## Annotation Types

Phoenix supports four types of annotations:

| Type | Target | Purpose | Example Use Case |
| --- | --- | --- | --- |
| **Span Annotation** | Individual span | Feedback on a specific operation | "This LLM response was accurate" |
| **Document Annotation** | Document within a RETRIEVER span | Feedback on retrieved document relevance | "This document was not helpful" |
| **Trace Annotation** | Entire trace | Feedback on end-to-end interaction | "User was satisfied with result" |
| **Session Annotation** | User session | Feedback on multi-turn conversation | "Session ended successfully" |

## Annotation Fields

Every annotation has these fields:

### Required Fields

| Field | Type | Description |
| --- | --- | --- |
| Entity ID | String | ID of the target entity (span_id, trace_id, session_id, or document_position) |
| `name` | String | Annotation name/label (e.g., "quality", "relevance", "helpfulness") |

### Result Fields (At Least One Required)

| Field | Type | Description |
| --- | --- | --- |
| `label` | String (optional) | Categorical value (e.g., "good", "bad", "relevant", "irrelevant") |
| `score` | Float (optional) | Numeric value (typically 0-1, but can be any range) |
| `explanation` | String (optional) | Free-text explanation of the annotation |

**At least one** of `label`, `score`, or `explanation` must be provided.

### Optional Fields

| Field | Type | Description |
| --- | --- | --- |
| `annotator_kind` | String | Who created this annotation: "HUMAN", "LLM", or "CODE" (default: "HUMAN") |
| `identifier` | String | Unique identifier for upsert behavior (updates the existing annotation with the same name+entity+identifier) |
| `metadata` | Object | Custom metadata as key-value pairs |

## Annotator Kinds

| Kind | Description | Example |
| --- | --- | --- |
| `HUMAN` | Manual feedback from a person | User ratings, expert labels |
| `LLM` | Automated feedback from an LLM | GPT-4 evaluating response quality |
| `CODE` | Automated feedback from code | Rule-based checks, heuristics |

## Examples

**Quality Assessment:**

- `quality` - Overall quality (label: good/fair/poor, score: 0-1)
- `correctness` - Factual accuracy (label: correct/incorrect, score: 0-1)
- `helpfulness` - User satisfaction (label: helpful/not_helpful, score: 0-1)

**RAG-Specific:**

- `relevance` - Document relevance to query (label: relevant/irrelevant, score: 0-1)
- `faithfulness` - Answer grounded in context (label: faithful/unfaithful, score: 0-1)

**Safety:**

- `toxicity` - Contains harmful content (score: 0-1)
- `pii_detected` - Contains personally identifiable information (label: yes/no)
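A minimal sketch of how these fields fit together, using the Python client covered in `annotations-python.md` (the `identifier` keyword is an assumption extrapolated from the field table above):

```python
from phoenix.client import Client

client = Client()

# One annotation carrying all three result fields; at least one of
# label / score / explanation is required.
client.spans.add_span_annotation(
    span_id="abc123",            # entity ID: the target span
    annotation_name="quality",   # annotation name
    annotator_kind="LLM",        # "HUMAN", "LLM", or "CODE"
    label="good",
    score=0.9,
    explanation="Accurate and concise answer",
    identifier="eval-run-42",    # assumed: resending same name+entity+identifier updates in place
)
```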
skills/phoenix-tracing/references/annotations-python.md (new file, 114 lines)
@@ -0,0 +1,114 @@
# Python SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the Python client.

## Client Setup

```python
from phoenix.client import Client

client = Client()  # Default: http://localhost:6006
```

## Span Annotations

Add feedback to individual spans:

```python
client.spans.add_span_annotation(
    span_id="abc123",
    annotation_name="quality",
    annotator_kind="HUMAN",
    label="high_quality",
    score=0.95,
    explanation="Accurate and well-formatted",
    metadata={"reviewer": "alice"},
    sync=True
)
```

## Document Annotations

Rate individual documents in RETRIEVER spans:

```python
client.spans.add_document_annotation(
    span_id="retriever_span",
    document_position=0,  # 0-based index
    annotation_name="relevance",
    annotator_kind="LLM",
    label="relevant",
    score=0.95
)
```

## Trace Annotations

Feedback on entire traces:

```python
client.traces.add_trace_annotation(
    trace_id="trace_abc",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0
)
```

## Session Annotations

Feedback on multi-turn conversations:

```python
client.sessions.add_session_annotation(
    session_id="session_xyz",
    annotation_name="user_satisfaction",
    annotator_kind="HUMAN",
    label="satisfied",
    score=0.85
)
```

## RAG Pipeline Example

```python
from phoenix.client import Client
from phoenix.client.resources.spans import SpanDocumentAnnotationData

client = Client()

# Document relevance (batch)
client.spans.log_document_annotations(
    document_annotations=[
        SpanDocumentAnnotationData(
            name="relevance", span_id="retriever_span", document_position=i,
            annotator_kind="LLM", result={"label": label, "score": score}
        )
        for i, (label, score) in enumerate([
            ("relevant", 0.95), ("relevant", 0.80), ("irrelevant", 0.10)
        ])
    ]
)

# LLM response quality
client.spans.add_span_annotation(
    span_id="llm_span",
    annotation_name="faithfulness",
    annotator_kind="LLM",
    label="faithful",
    score=0.90
)

# Overall trace quality
client.traces.add_trace_annotation(
    trace_id="trace_123",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0
)
```

## API Reference

- [Python Client API](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)
skills/phoenix-tracing/references/annotations-typescript.md (new file, 137 lines)
@@ -0,0 +1,137 @@
# TypeScript SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the TypeScript client.

## Client Setup

```typescript
import { createClient } from "phoenix-client";

const client = createClient(); // Default: http://localhost:6006
```

## Span Annotations

Add feedback to individual spans:

```typescript
import { addSpanAnnotation } from "phoenix-client";

await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "abc123",
    name: "quality",
    annotatorKind: "HUMAN",
    label: "high_quality",
    score: 0.95,
    explanation: "Accurate and well-formatted",
    metadata: { reviewer: "alice" }
  },
  sync: true
});
```

## Document Annotations

Rate individual documents in RETRIEVER spans:

```typescript
import { addDocumentAnnotation } from "phoenix-client";

await addDocumentAnnotation({
  client,
  documentAnnotation: {
    spanId: "retriever_span",
    documentPosition: 0, // 0-based index
    name: "relevance",
    annotatorKind: "LLM",
    label: "relevant",
    score: 0.95
  }
});
```

## Trace Annotations

Feedback on entire traces:

```typescript
import { addTraceAnnotation } from "phoenix-client";

await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_abc",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});
```

## Session Annotations

Feedback on multi-turn conversations:

```typescript
import { addSessionAnnotation } from "phoenix-client";

await addSessionAnnotation({
  client,
  sessionAnnotation: {
    sessionId: "session_xyz",
    name: "user_satisfaction",
    annotatorKind: "HUMAN",
    label: "satisfied",
    score: 0.85
  }
});
```

## RAG Pipeline Example

```typescript
import { createClient, logDocumentAnnotations, addSpanAnnotation, addTraceAnnotation } from "phoenix-client";

const client = createClient();

// Document relevance (batch)
await logDocumentAnnotations({
  client,
  documentAnnotations: [
    { spanId: "retriever_span", documentPosition: 0, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.95 },
    { spanId: "retriever_span", documentPosition: 1, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.80 }
  ]
});

// LLM response quality
await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "llm_span",
    name: "faithfulness",
    annotatorKind: "LLM",
    label: "faithful",
    score: 0.90
  }
});

// Overall trace quality
await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_123",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});
```

## API Reference

- [TypeScript Client API](https://arize-ai.github.io/phoenix/)
skills/phoenix-tracing/references/fundamentals-flattening.md (new file, 58 lines)
@@ -0,0 +1,58 @@
# Flattening Convention

OpenInference flattens nested data structures into dot-notation attributes for database compatibility, OpenTelemetry compatibility, and simple querying.

## Flattening Rules

**Objects → Dot Notation**

```javascript
{ llm: { model_name: "gpt-4", token_count: { prompt: 10, completion: 20 } } }
// becomes
{ "llm.model_name": "gpt-4", "llm.token_count.prompt": 10, "llm.token_count.completion": 20 }
```

**Arrays → Zero-Indexed Notation**

```javascript
{ llm: { input_messages: [{ role: "user", content: "Hi" }] } }
// becomes
{ "llm.input_messages.0.message.role": "user", "llm.input_messages.0.message.content": "Hi" }
```

**Message Convention: `.message.` segment required**

```
llm.input_messages.{index}.message.{field}
llm.input_messages.0.message.tool_calls.0.tool_call.function.name
```

## Complete Example

```javascript
// Original
{
  openinference: { span: { kind: "LLM" } },
  llm: {
    model_name: "claude-3-5-sonnet-20241022",
    invocation_parameters: { temperature: 0.7, max_tokens: 1000 },
    input_messages: [{ role: "user", content: "Tell me a joke" }],
    output_messages: [{ role: "assistant", content: "Why did the chicken cross the road?" }],
    token_count: { prompt: 5, completion: 10, total: 15 }
  }
}

// Flattened (stored in Phoenix spans.attributes JSONB)
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1000}",
  "llm.input_messages.0.message.role": "user",
  "llm.input_messages.0.message.content": "Tell me a joke",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "Why did the chicken cross the road?",
  "llm.token_count.prompt": 5,
  "llm.token_count.completion": 10,
  "llm.token_count.total": 15
}
```
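For intuition, here is a minimal, hypothetical Python helper (not part of any Phoenix SDK) that applies the object and array rules above; the extra `.message.` segment for message lists is inserted by the instrumentation libraries and is not handled here:

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into dot-notation attribute keys."""
    flat: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):  # arrays become zero-indexed segments
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix.rstrip(".")] = value  # leaf value: strip the trailing dot
    return flat


# flatten({"llm": {"token_count": {"prompt": 10, "completion": 20}}})
# -> {"llm.token_count.prompt": 10, "llm.token_count.completion": 20}
```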
skills/phoenix-tracing/references/fundamentals-overview.md (new file, 53 lines)
@@ -0,0 +1,53 @@
# Overview and Traces & Spans

This document covers the fundamental concepts of OpenInference traces and spans in Phoenix.

## Overview

OpenInference is a set of semantic conventions for AI and LLM applications based on OpenTelemetry. Phoenix uses these conventions to capture, store, and analyze traces from AI applications.

**Key Concepts:**

- **Traces** represent end-to-end requests through your application
- **Spans** represent individual operations within a trace (LLM calls, retrievals, tool invocations)
- **Attributes** are key-value pairs attached to spans using flattened, dot-notation paths
- **Span Kinds** categorize the type of operation (LLM, RETRIEVER, TOOL, etc.)

## Traces and Spans

### Trace Hierarchy

A **trace** is a tree of **spans** representing a complete request:

```
Trace ID: abc123
├─ Span 1: CHAIN (root span, parent_id = null)
│  ├─ Span 2: RETRIEVER (parent_id = span_1_id)
│  │  └─ Span 3: EMBEDDING (parent_id = span_2_id)
│  └─ Span 4: LLM (parent_id = span_1_id)
│     └─ Span 5: TOOL (parent_id = span_4_id)
```

### Context Propagation

Spans maintain parent-child relationships via:

- `trace_id` - Same for all spans in a trace
- `span_id` - Unique identifier for this span
- `parent_id` - References the parent span's `span_id` (null for root spans)

Phoenix uses these relationships to:

- Build the span tree visualization in the UI
- Calculate cumulative metrics (tokens, errors) up the tree
- Enable nested querying (e.g., "find CHAIN spans containing LLM spans with errors")

### Span Lifecycle

Each span has:

- `start_time` - When the operation began (Unix timestamp in nanoseconds)
- `end_time` - When the operation completed
- `status_code` - OK, ERROR, or UNSET
- `status_message` - Optional error message
- `attributes` - Object with all semantic convention attributes
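To make the linkage concrete, here is an illustrative sketch (field values invented for the example, not an SDK payload) of how a root span and one of its children relate through these identifiers:

```python
# Illustrative records only: trace_id ties the spans together,
# parent_id encodes the tree shown in the hierarchy diagram above.
root_span = {
    "trace_id": "abc123",
    "span_id": "span_1",
    "parent_id": None,        # null parent marks the root
    "status_code": "OK",
    "attributes": {"openinference.span.kind": "CHAIN"},
}

llm_span = {
    "trace_id": "abc123",     # same trace as the root
    "span_id": "span_4",
    "parent_id": "span_1",    # references the root's span_id
    "status_code": "OK",
    "attributes": {"openinference.span.kind": "LLM"},
}
```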
@@ -0,0 +1,64 @@
# Required and Recommended Attributes

This document covers the required attribute and the highly recommended attributes for all OpenInference spans.

## Required Attribute

**Every span MUST have exactly one required attribute:**

```json
{
  "openinference.span.kind": "LLM"
}
```

## Highly Recommended Attributes

While not strictly required, these attributes are **highly recommended** on all spans as they:

- Enable evaluation and quality assessment
- Help understand information flow through your application
- Make traces more useful for debugging

### Input/Output Values

| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Input to the operation (prompt, query, document) |
| `output.value` | String | Output from the operation (response, result, answer) |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "input.value": "What is the capital of France?",
  "output.value": "The capital of France is Paris."
}
```

**Why these matter:**

- **Evaluations**: Many evaluators (faithfulness, relevance, hallucination detection) require both input and output to assess quality
- **Information flow**: Seeing inputs/outputs makes it easy to trace how data transforms through your application
- **Debugging**: When something goes wrong, having the actual input/output makes root cause analysis much faster
- **Analytics**: Enables pattern analysis across similar inputs or outputs

**Phoenix Behavior:**

- Input/output displayed prominently in span details
- Evaluators can automatically access these values
- Search/filter traces by input or output content
- Export inputs/outputs for fine-tuning datasets

## Valid Span Kinds

There are exactly **9 valid span kinds** in OpenInference:

| Span Kind | Purpose | Common Use Case |
|-----------|---------|-----------------|
| `LLM` | Language model inference | OpenAI, Anthropic, local LLM calls |
| `EMBEDDING` | Vector generation | Text-to-vector conversion |
| `CHAIN` | Application flow orchestration | LangChain chains, custom workflows |
| `RETRIEVER` | Document/context retrieval | Vector DB queries, semantic search |
| `RERANKER` | Result reordering | Rerank retrieved documents |
| `TOOL` | External tool invocation | API calls, function execution |
| `AGENT` | Autonomous reasoning | ReAct agents, planning loops |
| `GUARDRAIL` | Safety/policy checks | Content moderation, PII detection |
| `EVALUATOR` | Quality assessment | Answer relevance, faithfulness scoring |
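As a sketch of setting these attributes by hand (assuming a tracer obtained via `phoenix.otel.register()`, as in the instrumentation guides in this skill):

```python
from phoenix.otel import register

tracer = register(project_name="my-app").get_tracer(__name__)

with tracer.start_as_current_span("answer-question") as span:
    # The single required attribute: an OpenInference span kind
    span.set_attribute("openinference.span.kind", "LLM")
    # Highly recommended: input/output values for evaluators and debugging
    span.set_attribute("input.value", "What is the capital of France?")
    span.set_attribute("output.value", "The capital of France is Paris.")
```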
@@ -0,0 +1,72 @@
# Universal Attributes

This document covers attributes that can be used on any span kind in OpenInference.

## Overview

These attributes can be used on **any span kind** to provide additional context, tracking, and metadata.

## Input/Output

| Attribute | Type | Description |
| --- | --- | --- |
| `input.value` | String | Input to the operation (prompt, query, document) |
| `input.mime_type` | String | MIME type (e.g., "text/plain", "application/json") |
| `output.value` | String | Output from the operation (response, vector, result) |
| `output.mime_type` | String | MIME type of output |

### Why Capture I/O?

**Always capture input/output for evaluation-ready spans:**

- Phoenix evaluators (faithfulness, relevance, Q&A correctness) require `input.value` and `output.value`
- Phoenix UI displays I/O prominently in trace views for debugging
- Enables exporting I/O for creating fine-tuning datasets
- Provides complete context for analyzing agent behavior

**Example attributes:**

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "What is the weather?",
  "input.mime_type": "text/plain",
  "output.value": "I don't have access to weather data.",
  "output.mime_type": "text/plain"
}
```

**See language-specific implementation:**

- TypeScript: `instrumentation-manual-typescript.md`
- Python: `instrumentation-manual-python.md`

## Session and User Tracking

| Attribute | Type | Description |
| --- | --- | --- |
| `session.id` | String | Session identifier for grouping related traces |
| `user.id` | String | User identifier for per-user analysis |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}
```

## Metadata

| Attribute | Type | Description |
| --- | --- | --- |
| `metadata` | String | JSON-serialized object of key-value pairs |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "metadata": "{\"environment\": \"production\", \"model_version\": \"v2.1\", \"cost_center\": \"engineering\"}"
}
```
@@ -0,0 +1,85 @@
# Phoenix Tracing: Auto-Instrumentation (Python)

**Automatically create spans for LLM calls without code changes.**

## Overview

Auto-instrumentation patches supported libraries at runtime to create spans automatically. Use it for supported frameworks (LangChain, LlamaIndex, OpenAI SDK, etc.). For custom logic, see `manual-instrumentation-python.md`.

## Supported Frameworks

**Python:**

- LLM SDKs: OpenAI, Anthropic, Bedrock, Mistral, Vertex AI, Groq, Ollama
- Frameworks: LangChain, LlamaIndex, DSPy, CrewAI, Instructor, Haystack
- Install: `pip install openinference-instrumentation-{name}`

## Setup

**Install and enable:**

```bash
pip install arize-phoenix-otel
pip install openinference-instrumentation-openai  # Add others as needed
```

```python
from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)  # Discovers all installed instrumentors
```

**Example:**

```python
from phoenix.otel import register
from openai import OpenAI

register(project_name="my-app", auto_instrument=True)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Traces appear in the Phoenix UI with model, input/output, tokens, and timing captured automatically. See the span kind files for full attribute schemas.

**Selective instrumentation** (explicit control):

```python
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register(project_name="my-app")  # No auto_instrument
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```

## Limitations

Auto-instrumentation does NOT capture:

- Custom business logic
- Internal function calls

**Example:**

```python
def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)                # Not traced
    response = client.chat.completions.create(...)  # Traced (auto)
    postprocessed = postprocess(response)           # Not traced
    return postprocessed
```

**Solution:** Add manual instrumentation:

```python
@tracer.chain
def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)
    response = client.chat.completions.create(...)
    postprocessed = postprocess(response)
    return postprocessed
```
@@ -0,0 +1,87 @@
# Auto-Instrumentation (TypeScript)

Automatically create spans for LLM calls without code changes.

## Supported Frameworks

- **LLM SDKs:** OpenAI
- **Frameworks:** LangChain
- **Install:** `npm install @arizeai/openinference-instrumentation-{name}`

## Setup

**CommonJS (automatic):**

```javascript
const { register } = require("@arizeai/phoenix-otel");
const OpenAI = require("openai");

register({ projectName: "my-app" });

const client = new OpenAI();
```

**ESM (manual required):**

```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```

**Why:** ESM imports are hoisted before `register()` runs.

## Limitations

**What auto-instrumentation does NOT capture:**

```typescript
async function myWorkflow(query: string): Promise<string> {
  const preprocessed = await preprocess(query);                // Not traced
  const response = await client.chat.completions.create(...);  // Traced (auto)
  const postprocessed = await postprocess(response);           // Not traced
  return postprocessed;
}
```

**Solution:** Add manual instrumentation for custom logic:

```typescript
import { traceChain } from "@arizeai/openinference-core";

const myWorkflow = traceChain(
  async (query: string): Promise<string> => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...);
    const postprocessed = await postprocess(response);
    return postprocessed;
  },
  { name: "my-workflow" }
);
```

## Combining Auto + Manual

```typescript
import { register } from "@arizeai/phoenix-otel";
import { traceChain } from "@arizeai/openinference-core";

register({ projectName: "my-app" });

const client = new OpenAI();

const workflow = traceChain(
  async (query: string) => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...);  // Auto-instrumented
    return postprocess(response);
  },
  { name: "my-workflow" }
);
```
@@ -0,0 +1,182 @@
# Manual Instrumentation (Python)

Add custom spans using decorators or context managers for fine-grained tracing control.

## Setup

```bash
pip install arize-phoenix-otel
```

```python
from phoenix.otel import register

tracer_provider = register(project_name="my-app")
tracer = tracer_provider.get_tracer(__name__)
```

## Quick Reference

| Span Kind | Decorator | Use Case |
|-----------|-----------|----------|
| CHAIN | `@tracer.chain` | Orchestration, workflows, pipelines |
| RETRIEVER | `@tracer.retriever` | Vector search, document retrieval |
| TOOL | `@tracer.tool` | External API calls, function execution |
| AGENT | `@tracer.agent` | Multi-step reasoning, planning |
| LLM | `@tracer.llm` | LLM API calls (manual only) |
| EMBEDDING | `@tracer.embedding` | Embedding generation |
| RERANKER | `@tracer.reranker` | Document re-ranking |
| GUARDRAIL | `@tracer.guardrail` | Safety checks, content moderation |
| EVALUATOR | `@tracer.evaluator` | LLM evaluation, quality checks |

## Decorator Approach (Recommended)

**Use for:** Full function instrumentation, automatic I/O capture

```python
@tracer.chain
def rag_pipeline(query: str) -> str:
    docs = retrieve_documents(query)
    ranked = rerank(docs, query)
    return generate_response(ranked, query)


@tracer.retriever
def retrieve_documents(query: str) -> list[dict]:
    results = vector_db.search(query, top_k=5)
    return [{"content": doc.text, "score": doc.score} for doc in results]


@tracer.tool
def get_weather(city: str) -> str:
    response = requests.get(f"https://api.weather.com/{city}")
    return response.json()["weather"]
```

**Custom span names:**

```python
@tracer.chain(name="rag-pipeline-v2")
def my_workflow(query: str) -> str:
    return process(query)
```

## Context Manager Approach

**Use for:** Partial function instrumentation, custom attributes, dynamic control

```python
import json

from opentelemetry.trace import Status, StatusCode


def retrieve_with_metadata(query: str):
    with tracer.start_as_current_span(
        "vector_search",
        openinference_span_kind="retriever"
    ) as span:
        span.set_attribute("input.value", query)

        results = vector_db.search(query, top_k=5)

        documents = [
            {
                "document.id": doc.id,
                "document.content": doc.text,
                "document.score": doc.score
            }
            for doc in results
        ]
        span.set_attribute("retrieval.documents", json.dumps(documents))
        span.set_status(Status(StatusCode.OK))

        return documents
```

## Capturing Input/Output

**Always capture I/O for evaluation-ready spans.**

### Automatic I/O Capture (Decorators)

Decorators automatically capture input arguments and return values:

```python
@tracer.chain
def handle_query(user_input: str) -> str:
    result = agent.generate(user_input)
    return result.text

# Automatically captures:
# - input.value: user_input
# - output.value: result.text
# - input.mime_type / output.mime_type: auto-detected
```

### Manual I/O Capture (Context Manager)

Use `set_input()` and `set_output()` for simple I/O capture:

```python
from opentelemetry.trace import Status, StatusCode


def handle_query(user_input: str) -> str:
    with tracer.start_as_current_span(
        "query.handler",
        openinference_span_kind="chain"
    ) as span:
        span.set_input(user_input)

        result = agent.generate(user_input)

        span.set_output(result.text)
        span.set_status(Status(StatusCode.OK))

        return result.text
```

**What gets captured:**

```json
{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}
```

**Why this matters:**

- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets

### Custom I/O with Additional Metadata

Use `set_attribute()` for custom attributes alongside I/O:

```python
def process_query(query: str):
    with tracer.start_as_current_span(
        "query.process",
        openinference_span_kind="chain"
    ) as span:
        # Standard I/O
        span.set_input(query)

        # Custom metadata
        span.set_attribute("input.length", len(query))

        result = llm.generate(query)

        # Standard output
        span.set_output(result.text)

        # Custom metadata
        span.set_attribute("output.tokens", result.usage.total_tokens)
        span.set_status(Status(StatusCode.OK))

        return result
```

## See Also

- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, `span-llm.md`, `span-agent.md`, `span-embedding.md`, `span-reranker.md`, `span-guardrail.md`, `span-evaluator.md`
- **Auto-instrumentation:** `instrumentation-auto-python.md` for framework integrations
- **API docs:** https://docs.arize.com/phoenix/tracing/manual-instrumentation
@@ -0,0 +1,172 @@
# Manual Instrumentation (TypeScript)

Add custom spans using convenience wrappers or `withSpan` for fine-grained tracing control.

## Setup

```bash
npm install @arizeai/phoenix-otel @arizeai/openinference-core
```

```typescript
import { register } from "@arizeai/phoenix-otel";

register({ projectName: "my-app" });
```

## Quick Reference

| Span Kind | Method | Use Case |
|-----------|--------|----------|
| CHAIN | `traceChain` | Workflows, pipelines, orchestration |
| AGENT | `traceAgent` | Multi-step reasoning, planning |
| TOOL | `traceTool` | External APIs, function calls |
| RETRIEVER | `withSpan` | Vector search, document retrieval |
| LLM | `withSpan` | LLM API calls (prefer auto-instrumentation) |
| EMBEDDING | `withSpan` | Embedding generation |
| RERANKER | `withSpan` | Document re-ranking |
| GUARDRAIL | `withSpan` | Safety checks, content moderation |
| EVALUATOR | `withSpan` | LLM evaluation |

## Convenience Wrappers

```typescript
import { traceChain, traceAgent, traceTool } from "@arizeai/openinference-core";

// CHAIN - workflows
const pipeline = traceChain(
  async (query: string) => {
    const docs = await retrieve(query);
    return await generate(docs, query);
  },
  { name: "rag-pipeline" }
);

// AGENT - reasoning
const agent = traceAgent(
  async (question: string) => {
    const thought = await llm.generate(`Think: ${question}`);
    return await processThought(thought);
  },
  { name: "my-agent" }
);

// TOOL - function calls
const getWeather = traceTool(
  async (city: string) => fetch(`/api/weather/${city}`).then(r => r.json()),
  { name: "get-weather" }
);
```

## withSpan for Other Kinds

```typescript
import { withSpan, getInputAttributes, getRetrieverAttributes } from "@arizeai/openinference-core";

// RETRIEVER with custom attributes
const retrieve = withSpan(
  async (query: string) => {
    const results = await vectorDb.search(query, { topK: 5 });
    return results.map(doc => ({ content: doc.text, score: doc.score }));
  },
  {
    kind: "RETRIEVER",
    name: "vector-search",
    processInput: (query) => getInputAttributes(query),
    processOutput: (docs) => getRetrieverAttributes({ documents: docs })
  }
);
```

**Options:**

```typescript
withSpan(fn, {
  kind: "RETRIEVER",              // OpenInference span kind
  name: "span-name",              // Span name (defaults to function name)
  processInput: (args) => {},     // Transform input to attributes
  processOutput: (result) => {},  // Transform output to attributes
  attributes: { key: "value" }    // Static attributes
});
```

## Capturing Input/Output

**Always capture I/O for evaluation-ready spans.** Use the `getInputAttributes` and `getOutputAttributes` helpers for automatic MIME type detection:

```typescript
import {
  getInputAttributes,
  getOutputAttributes,
  withSpan,
} from "@arizeai/openinference-core";

const handleQuery = withSpan(
  async (userInput: string) => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "query.handler",
    kind: "CHAIN",
    // Use helpers - automatic MIME type detection
    processInput: (input) => getInputAttributes(input),
    processOutput: (result) => getOutputAttributes(result.text),
  }
);

await handleQuery("What is 2+2?");
```

**What gets captured:**

```json
{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}
```

**Helper behavior:**

- Strings → `text/plain`
- Objects/Arrays → `application/json` (automatically serialized)
- `undefined`/`null` → No attributes set

**Why this matters:**

- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets

### Custom I/O Processing

Add custom metadata alongside standard I/O attributes:

```typescript
const processWithMetadata = withSpan(
  async (query: string) => {
    const result = await llm.generate(query);
    return result;
  },
  {
    name: "query.process",
    kind: "CHAIN",
    processInput: (query) => ({
      "input.value": query,
      "input.mime_type": "text/plain",
      "input.length": query.length, // Custom attribute
    }),
    processOutput: (result) => ({
      "output.value": result.text,
      "output.mime_type": "text/plain",
      "output.tokens": result.usage?.totalTokens, // Custom attribute
    }),
  }
);
```

## See Also

- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, etc.
- **Attribute helpers:** https://docs.arize.com/phoenix/tracing/manual-instrumentation-typescript#attribute-helpers
- **Auto-instrumentation:** `instrumentation-auto-typescript.md` for framework integrations
skills/phoenix-tracing/references/metadata-python.md (new file, 87 lines)
@@ -0,0 +1,87 @@
# Phoenix Tracing: Custom Metadata (Python)

Add custom attributes to spans for richer observability.

## Install

```bash
pip install openinference-instrumentation
```

## Session

```python
from openinference.instrumentation import using_session

with using_session(session_id="my-session-id"):
    # Spans get: "session.id" = "my-session-id"
    ...
```

## User

```python
from openinference.instrumentation import using_user

with using_user("my-user-id"):
    # Spans get: "user.id" = "my-user-id"
    ...
```

## Metadata

```python
from openinference.instrumentation import using_metadata

with using_metadata({"key": "value", "experiment_id": "exp_123"}):
    # Spans get: "metadata" = '{"key": "value", "experiment_id": "exp_123"}'
    ...
```

## Tags

```python
from openinference.instrumentation import using_tags

with using_tags(["tag_1", "tag_2"]):
    # Spans get: "tag.tags" = '["tag_1", "tag_2"]'
    ...
```

## Combined (using_attributes)

```python
from openinference.instrumentation import using_attributes

with using_attributes(
    session_id="my-session-id",
    user_id="my-user-id",
    metadata={"environment": "production"},
    tags=["prod", "v2"],
    prompt_template="Answer: {question}",
    prompt_template_version="v1.0",
    prompt_template_variables={"question": "What is Phoenix?"},
):
    # All attributes applied to spans in this context
    ...
```

## On a Single Span

```python
import json

# Assumes `span` is an active span, e.g. from a context manager
span.set_attribute("metadata", json.dumps({"key": "value"}))
span.set_attribute("user.id", "user_123")
span.set_attribute("session.id", "session_456")
```

## As Decorators

All context managers can be used as decorators:

```python
@using_session(session_id="my-session-id")
@using_user("my-user-id")
@using_metadata({"env": "prod"})
def my_function():
    ...
```
skills/phoenix-tracing/references/metadata-typescript.md (new file, 50 lines)
@@ -0,0 +1,50 @@
# Phoenix Tracing: Custom Metadata (TypeScript)

Add custom attributes to spans for richer observability.

## Using Context (Propagates to All Child Spans)

```typescript
import { context } from "@arizeai/phoenix-otel";
import { setMetadata } from "@arizeai/openinference-core";

context.with(
  setMetadata(context.active(), {
    experiment_id: "exp_123",
    model_version: "gpt-4-1106-preview",
    environment: "production",
  }),
  async () => {
    // All spans created within this block will have:
    // "metadata" = '{"experiment_id": "exp_123", ...}'
    await myApp.run(query);
  }
);
```

## On a Single Span

```typescript
import { traceChain } from "@arizeai/openinference-core";
import { trace } from "@arizeai/phoenix-otel";

const myFunction = traceChain(
  async (input: string) => {
    const span = trace.getActiveSpan();

    span?.setAttribute(
      "metadata",
      JSON.stringify({
        experiment_id: "exp_123",
        model_version: "gpt-4-1106-preview",
        environment: "production",
      })
    );

    const result = await doWork(input); // doWork: placeholder for your application logic
    return result;
  },
  { name: "my-function" }
);

await myFunction("hello");
```
skills/phoenix-tracing/references/production-python.md (new file, 58 lines)
@@ -0,0 +1,58 @@
# Phoenix Tracing: Production Guide (Python)

**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**

## Metadata

| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |

## Batch Processing

**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually. A minimal sketch follows.
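Assuming `phoenix.otel.register` accepts a `batch` flag mirroring the `batch: true` option shown in the TypeScript production guide:

```python
from phoenix.otel import register

# batch=True (assumed flag) swaps the default simple span processor for a
# batching processor that queues spans and exports them in groups.
tracer_provider = register(
    project_name="my-app",
    batch=True,
)
```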
## Data Masking (PII Protection)
|
||||
|
||||
**Environment variables:**
|
||||
|
||||
```bash
|
||||
export OPENINFERENCE_HIDE_INPUTS=true # Hide input.value
|
||||
export OPENINFERENCE_HIDE_OUTPUTS=true # Hide output.value
|
||||
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true # Hide LLM input messages
|
||||
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true # Hide LLM output messages
|
||||
export OPENINFERENCE_HIDE_INPUT_IMAGES=true # Hide image content
|
||||
export OPENINFERENCE_HIDE_INPUT_TEXT=true # Hide embedding text
|
||||
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
|
||||
```
|
||||
|
||||
**Python TraceConfig:**
|
||||
|
||||
```python
|
||||
from phoenix.otel import register
|
||||
from openinference.instrumentation import TraceConfig
|
||||
|
||||
config = TraceConfig(
|
||||
hide_inputs=True,
|
||||
hide_outputs=True,
|
||||
hide_input_messages=True
|
||||
)
|
||||
register(trace_config=config)
|
||||
```
|
||||
|
||||
**Precedence:** Code > Environment variables > Defaults
|
||||
|
||||
---
|
||||
|
||||
## Span Filtering
|
||||
|
||||
**Suppress specific code blocks:**
|
||||
|
||||
```python
|
||||
from phoenix.otel import suppress_tracing
|
||||
|
||||
with suppress_tracing():
|
||||
internal_logging() # No spans generated
|
||||
```
|
||||
skills/phoenix-tracing/references/production-typescript.md (new file, 148 lines)
@@ -0,0 +1,148 @@
# Phoenix Tracing: Production Guide (TypeScript)

**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**

## Metadata

| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |

## Batch Processing

**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually.

```typescript
import { register } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
  batch: true, // Production default
});
```

### Shutdown Handling

**CRITICAL:** Spans may not be exported if still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.

```typescript
// Explicit shutdown to flush queued spans
const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown(); // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown(); // Flush on error too
  process.exit(1);
});
```

**Graceful termination signals:**

```typescript
// Graceful shutdown on SIGTERM
const provider = register({
  projectName: "my-server",
  batch: true,
});

process.on("SIGTERM", async () => {
  await provider.shutdown();
  process.exit(0);
});
```

---

## Data Masking (PII Protection)

**Environment variables:**

```bash
export OPENINFERENCE_HIDE_INPUTS=true              # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true             # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true      # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true     # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true        # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true          # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
```

**TypeScript TraceConfig:**

```typescript
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

const traceConfig = {
  hideInputs: true,
  hideOutputs: true,
  hideInputMessages: true
};

const instrumentation = new OpenAIInstrumentation({ traceConfig });
```

**Precedence:** Code > Environment variables > Defaults

---

## Span Filtering

**Suppress specific code blocks:**

```typescript
import { suppressTracing } from "@opentelemetry/core";
import { context } from "@opentelemetry/api";

await context.with(suppressTracing(context.active()), async () => {
  internalLogging(); // No spans generated
});
```

**Sampling:**

```bash
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1" # Sample 10%
```

---

## Error Handling

```typescript
import { SpanStatusCode } from "@opentelemetry/api";

try {
  result = await riskyOperation();
  span?.setStatus({ code: SpanStatusCode.OK });
} catch (e) {
  span?.recordException(e);
  span?.setStatus({ code: SpanStatusCode.ERROR });
  throw e;
}
```

---

## Production Checklist

- [ ] Batch processing enabled
- [ ] **Shutdown handling:** Call `provider.shutdown()` before exit to flush queued spans
- [ ] **Graceful termination:** Flush spans on SIGTERM/SIGINT signals
- [ ] Data masking configured (`HIDE_INPUTS`/`HIDE_OUTPUTS` if PII)
- [ ] Span filtering for health checks/noisy paths
- [ ] Error handling implemented
- [ ] Graceful degradation if Phoenix unavailable
- [ ] Performance tested
- [ ] Monitoring configured (Phoenix UI checked)
skills/phoenix-tracing/references/projects-python.md (new file, 73 lines)
@@ -0,0 +1,73 @@
# Phoenix Tracing: Projects (Python)

**Organize traces by application using projects (Phoenix's top-level grouping).**

## Overview

Projects group traces for a single application or experiment.

**Use for:** Environments (dev/staging/prod), A/B testing, versioning

## Setup

### Environment Variable (Recommended)

```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```

```python
import os

os.environ["PHOENIX_PROJECT_NAME"] = "my-app-prod"

from phoenix.otel import register

register()  # Uses "my-app-prod"
```

### Code

```python
from phoenix.otel import register

register(project_name="my-app-prod")
```

## Use Cases

**Environments:**

```python
# Dev, staging, prod
register(project_name="my-app-dev")
register(project_name="my-app-staging")
register(project_name="my-app-prod")
```

**A/B Testing:**

```python
# Compare models
register(project_name="chatbot-gpt4")
register(project_name="chatbot-claude")
```

**Versioning:**

```python
# Track versions
register(project_name="my-app-v1")
register(project_name="my-app-v2")
```

## Switching Projects (Python Notebooks Only)

```python
from openinference.instrumentation import dangerously_using_project
from phoenix.otel import register

register(project_name="my-app")

# Switch temporarily for evals
with dangerously_using_project("my-eval-project"):
    run_evaluations()
```

**⚠️ Only use in notebooks/scripts, not production.**
skills/phoenix-tracing/references/projects-typescript.md (new file, 54 lines)
@@ -0,0 +1,54 @@
# Phoenix Tracing: Projects (TypeScript)

**Organize traces by application using projects (Phoenix's top-level grouping).**

## Overview

Projects group traces for a single application or experiment.

**Use for:** Environments (dev/staging/prod), A/B testing, versioning

## Setup

### Environment Variable (Recommended)

```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```

```typescript
process.env.PHOENIX_PROJECT_NAME = "my-app-prod";

import { register } from "@arizeai/phoenix-otel";

register(); // Uses "my-app-prod"
```

### Code

```typescript
import { register } from "@arizeai/phoenix-otel";

register({ projectName: "my-app-prod" });
```

## Use Cases

**Environments:**

```typescript
// Dev, staging, prod
register({ projectName: "my-app-dev" });
register({ projectName: "my-app-staging" });
register({ projectName: "my-app-prod" });
```

**A/B Testing:**

```typescript
// Compare models
register({ projectName: "chatbot-gpt4" });
register({ projectName: "chatbot-claude" });
```

**Versioning:**

```typescript
// Track versions
register({ projectName: "my-app-v1" });
register({ projectName: "my-app-v2" });
```
skills/phoenix-tracing/references/sessions-python.md (new file, 104 lines)
@@ -0,0 +1,104 @@
# Sessions (Python)

Track multi-turn conversations by grouping traces with session IDs.

## Setup

```python
from openinference.instrumentation import using_session

with using_session(session_id="user_123_conv_456"):
    response = llm.invoke(prompt)
```

## Best Practices

**Bad: Only the parent span gets the session ID**

```python
from openinference.semconv.trace import SpanAttributes
from opentelemetry import trace

span = trace.get_current_span()
span.set_attribute(SpanAttributes.SESSION_ID, session_id)
response = client.chat.completions.create(...)
```

**Good: All child spans inherit the session ID**

```python
with using_session(session_id):
    response = client.chat.completions.create(...)
    result = my_custom_function()
```

**Why:** `using_session()` propagates the session ID to all nested spans automatically.

## Session ID Patterns

```python
import uuid

session_id = str(uuid.uuid4())
session_id = f"user_{user_id}_conv_{conversation_id}"
session_id = f"debug_{timestamp}"
```

Good: `str(uuid.uuid4())`, `"user_123_conv_456"`
Bad: `"session_1"`, `"test"`, empty string

## Multi-Turn Chatbot Example

```python
import uuid

from openinference.instrumentation import using_session

session_id = str(uuid.uuid4())
messages = []


def send_message(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})

    with using_session(session_id):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages
        )

    assistant_message = response.choices[0].message.content
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message
```

## Additional Attributes

```python
from openinference.instrumentation import using_attributes

with using_attributes(
    user_id="user_123",
    session_id="conv_456",
    metadata={"tier": "premium", "region": "us-west"}
):
    response = llm.invoke(prompt)
```

## LangChain Integration

LangChain threads are automatically recognized as sessions:

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI()

response = llm.invoke(
    [HumanMessage(content="Hi!")],
    config={"metadata": {"thread_id": "user_123_thread"}}
)
```

Phoenix recognizes: `thread_id`, `session_id`, `conversation_id`

## See Also

- **TypeScript sessions:** `sessions-typescript.md`
- **Session docs:** https://docs.arize.com/phoenix/tracing/sessions
199
skills/phoenix-tracing/references/sessions-typescript.md
Normal file
@@ -0,0 +1,199 @@
# Sessions (TypeScript)

Track multi-turn conversations by grouping traces with session IDs. **Use `withSpan` directly from `@arizeai/openinference-core`** - no wrappers or custom utilities needed.

## Core Concept

**Session Pattern:**

1. Generate a unique `session.id` once at application startup
2. Export SESSION_ID, import `withSpan` where needed
3. Use `withSpan` to create a parent CHAIN span with `session.id` for each interaction
4. All child spans (LLM, TOOL, AGENT, etc.) automatically group under the parent
5. Query traces by `session.id` in Phoenix to see all interactions

## Implementation (Best Practice)

### 1. Setup (instrumentation.ts)

```typescript
import { register } from "@arizeai/phoenix-otel";
import { randomUUID } from "node:crypto";

// Initialize Phoenix
register({
  projectName: "your-app",
  url: process.env.PHOENIX_COLLECTOR_ENDPOINT || "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true,
});

// Generate and export session ID
export const SESSION_ID = randomUUID();
```

### 2. Usage (app code)

```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

// Use withSpan directly - no wrapper needed
const handleInteraction = withSpan(
  async () => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

// Call it
const result = await handleInteraction();
```

### With Input Parameters

```typescript
const processQuery = withSpan(
  async (query: string) => {
    return await agent.generate({ prompt: query });
  },
  {
    name: "process.query",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

await processQuery("What is 2+2?");
```

## Key Points

### Session ID Scope

- **CLI/Desktop Apps**: Generate once at process startup
- **Web Servers**: Generate per-user session (e.g., on login, store in session storage)
- **Stateless APIs**: Accept session.id as a parameter from the client

### Span Hierarchy

```
cli.interaction (CHAIN) ← session.id here
├── ai.generateText (AGENT)
│   ├── ai.generateText.doGenerate (LLM)
│   └── ai.toolCall (TOOL)
└── ai.generateText.doGenerate (LLM)
```

The `session.id` is only set on the **root span**. Child spans are automatically grouped by the trace hierarchy.

### Querying Sessions

```bash
# Get all traces for a session
npx @arizeai/phoenix-cli traces \
  --endpoint http://localhost:6006 \
  --project your-app \
  --format raw \
  --no-progress | \
  jq '.[] | select(.spans[0].attributes["session.id"] == "YOUR-SESSION-ID")'
```

## Dependencies

```json
{
  "dependencies": {
    "@arizeai/openinference-core": "^2.0.5",
    "@arizeai/phoenix-otel": "^0.4.1"
  }
}
```

**Note:** `@opentelemetry/api` is NOT needed - it's only for manual span management.

## Why This Pattern?

1. **Simple**: Just export SESSION_ID, use withSpan directly - no wrappers
2. **Built-in**: `withSpan` from `@arizeai/openinference-core` handles everything
3. **Type-safe**: Preserves function signatures and type information
4. **Automatic lifecycle**: Handles span creation, error tracking, and cleanup
5. **Framework-agnostic**: Works with any LLM framework (AI SDK, LangChain, etc.)
6. **No extra deps**: Don't need `@opentelemetry/api` or custom utilities

## Adding More Attributes

```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handleWithContext = withSpan(
  async (userInput: string) => {
    return await agent.generate({ prompt: userInput });
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: {
      "session.id": SESSION_ID,
      "user.id": userId, // Track user
      "metadata.environment": "prod", // Custom metadata
    },
  }
);
```

## Anti-Pattern: Don't Create Wrappers

❌ **Don't do this:**

```typescript
// Unnecessary wrapper
export function withSessionTracking(fn) {
  return withSpan(fn, { attributes: { "session.id": SESSION_ID } });
}
```

✅ **Do this instead:**

```typescript
// Use withSpan directly
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handler = withSpan(fn, {
  attributes: { "session.id": SESSION_ID }
});
```

## Alternative: Context API Pattern

For web servers or complex async flows where you need to propagate session IDs through middleware, you can use the Context API:

```typescript
import { context } from "@opentelemetry/api";
import { setSession } from "@arizeai/openinference-core";

await context.with(
  setSession(context.active(), { sessionId: "user_123_conv_456" }),
  async () => {
    const response = await llm.invoke(prompt);
  }
);
```

**Use Context API when:**

- Building web servers with middleware chains
- Session ID needs to flow through many async boundaries
- You don't control the call stack (e.g., framework-provided handlers)

**Use withSpan when:**

- Building CLI apps or scripts
- You control the function call points
- Simpler, more explicit code is preferred

## Related

- `fundamentals-universal-attributes.md` - Other universal attributes (user.id, metadata)
- `span-chain.md` - CHAIN span specification
- `sessions-python.md` - Python session tracking patterns
131
skills/phoenix-tracing/references/setup-python.md
Normal file
@@ -0,0 +1,131 @@
# Phoenix Tracing: Python Setup

**Set up Phoenix tracing in Python with `arize-phoenix-otel`.**

## Metadata

| Attribute  | Value                               |
| ---------- | ----------------------------------- |
| Priority   | Critical - required for all tracing |
| Setup Time | <5 min                              |

## Quick Start

```python
from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)
```

**Connects to `http://localhost:6006` and auto-instruments all supported libraries.**

## Installation

```bash
pip install arize-phoenix-otel
```

**Supported:** Python 3.10-3.13

## Configuration

### Environment Variables (Recommended)

```bash
export PHOENIX_API_KEY="your-api-key"                     # Required for Phoenix Cloud
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006" # Or Cloud URL
export PHOENIX_PROJECT_NAME="my-app"                      # Optional
```

### Python Code

```python
from phoenix.otel import register

tracer_provider = register(
    project_name="my-app",             # Project name
    endpoint="http://localhost:6006",  # Phoenix endpoint
    auto_instrument=True,              # Auto-instrument supported libs
    batch=True,                        # Batch processing (default: True)
)
```

**Parameters:**

- `project_name`: Project name (overrides `PHOENIX_PROJECT_NAME`)
- `endpoint`: Phoenix URL (overrides `PHOENIX_COLLECTOR_ENDPOINT`)
- `auto_instrument`: Enable auto-instrumentation (default: False)
- `batch`: Use BatchSpanProcessor (default: True, production-recommended)
- `protocol`: `"http/protobuf"` (default) or `"grpc"`

## Auto-Instrumentation

Install instrumentors for your frameworks:

```bash
pip install openinference-instrumentation-openai      # OpenAI SDK
pip install openinference-instrumentation-langchain   # LangChain
pip install openinference-instrumentation-llama-index # LlamaIndex
# ... install others as needed
```

Then enable auto-instrumentation:

```python
register(project_name="my-app", auto_instrument=True)
```

Phoenix discovers and instruments all installed OpenInference packages automatically.
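
If you prefer explicit control over which libraries are traced, you can leave `auto_instrument` off and activate individual OpenInference instrumentors yourself. A minimal sketch using the OpenAI instrumentor (assumes `openinference-instrumentation-openai` is installed):

```python
from openinference.instrumentation.openai import OpenAIInstrumentor
from phoenix.otel import register

# Register without auto_instrument, then opt in per library
tracer_provider = register(project_name="my-app")
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```
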
## Batch Processing (Production)

Enabled by default. Configure via environment variables:

```bash
export OTEL_BSP_SCHEDULE_DELAY=5000       # Batch every 5s
export OTEL_BSP_MAX_QUEUE_SIZE=2048       # Queue up to 2048 spans
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512 # Send 512 spans/batch
```

**Link:** https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/

## Verification

1. Open the Phoenix UI: `http://localhost:6006`
2. Navigate to your project
3. Run your application
4. Check for traces (they appear within the batch delay)

## Troubleshooting

**No traces:**

- Verify `PHOENIX_COLLECTOR_ENDPOINT` matches the Phoenix server
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- Confirm instrumentors are installed

**Missing attributes:**

- Check the span kind (see rules/ directory)
- Verify attribute names (see rules/ directory)

## Example

```python
from phoenix.otel import register
from openai import OpenAI

# Enable tracing with auto-instrumentation
register(project_name="my-chatbot", auto_instrument=True)

# OpenAI is automatically instrumented
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

## API Reference

- [Python OTEL API Docs](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/)
- [Python Client API Docs](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)
170
skills/phoenix-tracing/references/setup-typescript.md
Normal file
@@ -0,0 +1,170 @@
# TypeScript Setup

Set up Phoenix tracing in TypeScript/JavaScript with `@arizeai/phoenix-otel`.

## Metadata

| Attribute  | Value                               |
| ---------- | ----------------------------------- |
| Priority   | Critical - required for all tracing |
| Setup Time | <5 min                              |

## Quick Start

```bash
npm install @arizeai/phoenix-otel
```

```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app" });
```

Connects to `http://localhost:6006` by default.

## Configuration

```typescript
import { register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  url: "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true
});
```

**Environment variables:**

```bash
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
export PHOENIX_PROJECT_NAME="my-app"
```

## ESM vs CommonJS

**CommonJS (automatic):**

```javascript
const { register } = require("@arizeai/phoenix-otel");
register({ projectName: "my-app" });

const OpenAI = require("openai");
```

**ESM (manual instrumentation required):**

```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```

**Why:** ESM imports are hoisted ahead of the instrumentation setup, so `manuallyInstrument()` is needed to patch the already-imported module.

## Framework Integration

**Next.js (App Router):**

```typescript
// instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { register } = await import("@arizeai/phoenix-otel");
    register({ projectName: "my-nextjs-app" });
  }
}
```

**Express.js:**

```typescript
import { register } from "@arizeai/phoenix-otel";

register({ projectName: "my-express-app" });

const app = express();
```

## Flushing Spans Before Exit

**CRITICAL:** Spans may not be exported if they are still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.

**Standard pattern:**

```typescript
const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown(); // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown(); // Flush on error too
  process.exit(1);
});
```

**Alternative:**

```typescript
// Use batch: false for immediate export (no shutdown needed)
register({
  projectName: "my-app",
  batch: false,
});
```

For production patterns including graceful termination, see `production-typescript.md`.

## Verification

1. Open the Phoenix UI: `http://localhost:6006`
2. Run your application
3. Check for traces in your project

**Enable diagnostic logging:**

```typescript
import { DiagLogLevel, register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  diagLogLevel: DiagLogLevel.DEBUG,
});
```

## Troubleshooting

**No traces:**

- Verify `PHOENIX_COLLECTOR_ENDPOINT` is correct
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- For ESM: Ensure `manuallyInstrument()` is called
- With `batch: true`: Call `await provider.shutdown()` before exit to flush queued spans (see Flushing Spans Before Exit); or set `batch: false` for immediate export

**Missing attributes:**

- Check instrumentation is registered (ESM requires manual setup)
- See `instrumentation-auto-typescript.md`

## See Also

- **Auto-instrumentation:** `instrumentation-auto-typescript.md`
- **Manual instrumentation:** `instrumentation-manual-typescript.md`
- **API docs:** https://arize-ai.github.io/phoenix/
15
skills/phoenix-tracing/references/span-agent.md
Normal file
@@ -0,0 +1,15 @@
# AGENT Spans

AGENT spans represent autonomous reasoning blocks (ReAct agents, planning loops, multi-step decision making).

**Required:** `openinference.span.kind` = "AGENT"

## Example

```json
{
  "openinference.span.kind": "AGENT",
  "input.value": "Book a flight to New York for next Monday",
  "output.value": "I've booked flight AA123 departing Monday at 9:00 AM"
}
```
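
For manual instrumentation, the same attributes can be set with the plain OpenTelemetry API, assuming a tracer provider has already been registered (see `setup-python.md`). A minimal sketch, where `run_agent` is a hypothetical stand-in for your agent loop:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def run_agent(request: str) -> str:
    # Hypothetical placeholder for the actual agent loop (planning, tool calls, etc.)
    return f"Handled request: {request}"

user_request = "Book a flight to New York for next Monday"

with tracer.start_as_current_span("agent.run") as span:
    span.set_attribute("openinference.span.kind", "AGENT")
    span.set_attribute("input.value", user_request)
    result = run_agent(user_request)
    span.set_attribute("output.value", result)
```
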
43
skills/phoenix-tracing/references/span-chain.md
Normal file
@@ -0,0 +1,43 @@
# CHAIN Spans

## Purpose

CHAIN spans represent orchestration layers in your application (LangChain chains, custom workflows, application entry points). Often used as root spans.

## Required Attributes

| Attribute                 | Type   | Description     | Required |
| ------------------------- | ------ | --------------- | -------- |
| `openinference.span.kind` | String | Must be "CHAIN" | Yes      |

## Common Attributes

CHAIN spans typically use [Universal Attributes](fundamentals-universal-attributes.md):

- `input.value` - Input to the chain (user query, request payload)
- `output.value` - Output from the chain (final response)
- `input.mime_type` / `output.mime_type` - Format indicators

## Example: Root Chain

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "{\"question\": \"What is the capital of France?\"}",
  "input.mime_type": "application/json",
  "output.value": "{\"answer\": \"The capital of France is Paris.\", \"sources\": [\"doc_123\"]}",
  "output.mime_type": "application/json",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}
```

## Example: Nested Sub-Chain

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "Summarize this document: ...",
  "output.value": "This document discusses..."
}
```
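
A root CHAIN span like the one above can be created manually with the OpenTelemetry API. A minimal Python sketch, assuming a registered tracer provider and with `run_pipeline` as a hypothetical stand-in for the orchestrated work:

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def run_pipeline(question: str) -> dict:
    # Hypothetical placeholder; nested LLM/RETRIEVER spans would be created in here
    return {"answer": "The capital of France is Paris.", "sources": ["doc_123"]}

request = {"question": "What is the capital of France?"}

with tracer.start_as_current_span("rag.pipeline") as span:
    span.set_attribute("openinference.span.kind", "CHAIN")
    span.set_attribute("input.value", json.dumps(request))
    span.set_attribute("input.mime_type", "application/json")
    result = run_pipeline(request["question"])
    span.set_attribute("output.value", json.dumps(result))
    span.set_attribute("output.mime_type", "application/json")
```
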
91
skills/phoenix-tracing/references/span-embedding.md
Normal file
@@ -0,0 +1,91 @@
# EMBEDDING Spans

## Purpose

EMBEDDING spans represent vector generation operations (text-to-vector conversion for semantic search).

## Required Attributes

| Attribute                 | Type   | Description                | Required    |
| ------------------------- | ------ | -------------------------- | ----------- |
| `openinference.span.kind` | String | Must be "EMBEDDING"        | Yes         |
| `embedding.model_name`    | String | Embedding model identifier | Recommended |

## Attribute Reference

### Single Embedding

| Attribute              | Type                | Description                |
| ---------------------- | ------------------- | -------------------------- |
| `embedding.model_name` | String              | Embedding model identifier |
| `embedding.text`       | String              | Input text to embed        |
| `embedding.vector`     | String (JSON array) | Generated embedding vector |

**Example:**

```json
{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```

### Batch Embeddings

| Attribute Pattern                           | Type                | Description       |
| ------------------------------------------- | ------------------- | ----------------- |
| `embedding.embeddings.{i}.embedding.text`   | String              | Text at index i   |
| `embedding.embeddings.{i}.embedding.vector` | String (JSON array) | Vector at index i |

**Example:**

```json
{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3, ..., 0.5]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.6, 0.7, 0.8, ..., 0.9]"
}
```

### Vector Format

Vectors are stored as JSON array strings:

- Dimensions: Typically 384, 768, 1536, or 3072
- Format: `"[0.123, -0.456, 0.789, ...]"`
- Precision: Usually 3-6 decimal places

**Storage Considerations:**

- Large vectors can significantly increase trace size
- Consider omitting vectors in production (keep `embedding.text` for debugging)
- Use a separate vector database for actual similarity search

## Examples

### Single Embedding

```json
{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]",
  "input.value": "What is machine learning?",
  "output.value": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```

### Batch Embeddings

```json
{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.4, 0.5, 0.6]",
  "embedding.embeddings.2.embedding.text": "Third document",
  "embedding.embeddings.2.embedding.vector": "[0.7, 0.8, 0.9]"
}
```
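
The indexed flattening above is straightforward to produce in a loop. A minimal Python sketch, assuming a registered tracer provider and with `embed_batch` as a hypothetical stand-in for your embedding client:

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def embed_batch(texts: list[str]) -> list[list[float]]:
    # Hypothetical placeholder for the embedding client call
    return [[0.1, 0.2, 0.3] for _ in texts]

texts = ["First document", "Second document"]

with tracer.start_as_current_span("embed.batch") as span:
    span.set_attribute("openinference.span.kind", "EMBEDDING")
    span.set_attribute("embedding.model_name", "text-embedding-ada-002")
    vectors = embed_batch(texts)
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        span.set_attribute(f"embedding.embeddings.{i}.embedding.text", text)
        # Vectors are stored as JSON array strings, per the format above
        span.set_attribute(f"embedding.embeddings.{i}.embedding.vector", json.dumps(vector))
```
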
51
skills/phoenix-tracing/references/span-evaluator.md
Normal file
@@ -0,0 +1,51 @@
# EVALUATOR Spans

## Purpose

EVALUATOR spans represent quality assessment operations (answer relevance, faithfulness, hallucination detection).

## Required Attributes

| Attribute                 | Type   | Description         | Required |
| ------------------------- | ------ | ------------------- | -------- |
| `openinference.span.kind` | String | Must be "EVALUATOR" | Yes      |

## Common Attributes

| Attribute                 | Type   | Description                                   |
| ------------------------- | ------ | --------------------------------------------- |
| `input.value`             | String | Content being evaluated                       |
| `output.value`            | String | Evaluation result (score, label, explanation) |
| `metadata.evaluator_name` | String | Evaluator identifier                          |
| `metadata.score`          | Float  | Numeric score (0-1)                           |
| `metadata.label`          | String | Categorical label (relevant/irrelevant)       |

## Example: Answer Relevance

```json
{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"question\": \"What is the capital of France?\", \"answer\": \"The capital of France is Paris.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.95",
  "metadata.evaluator_name": "answer_relevance",
  "metadata.score": 0.95,
  "metadata.label": "relevant",
  "metadata.explanation": "Answer directly addresses the question with correct information"
}
```

## Example: Faithfulness Check

```json
{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"context\": \"Paris is in France.\", \"answer\": \"Paris is the capital of France.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.5",
  "metadata.evaluator_name": "faithfulness",
  "metadata.score": 0.5,
  "metadata.label": "partially_faithful",
  "metadata.explanation": "Answer makes unsupported claim about Paris being the capital"
}
```
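
An evaluator can emit these attributes from an ordinary span. A minimal Python sketch, assuming a registered tracer provider and with `score_relevance` as a hypothetical evaluator function (an LLM-as-judge or heuristic):

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def score_relevance(question: str, answer: str) -> tuple[float, str]:
    # Hypothetical placeholder for the actual evaluation logic
    return 0.95, "relevant"

payload = {"question": "What is the capital of France?",
           "answer": "The capital of France is Paris."}

with tracer.start_as_current_span("eval.answer_relevance") as span:
    span.set_attribute("openinference.span.kind", "EVALUATOR")
    span.set_attribute("input.value", json.dumps(payload))
    span.set_attribute("input.mime_type", "application/json")
    score, label = score_relevance(payload["question"], payload["answer"])
    span.set_attribute("output.value", str(score))
    span.set_attribute("metadata.evaluator_name", "answer_relevance")
    span.set_attribute("metadata.score", score)
    span.set_attribute("metadata.label", label)
```
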
49
skills/phoenix-tracing/references/span-guardrail.md
Normal file
@@ -0,0 +1,49 @@
# GUARDRAIL Spans

## Purpose

GUARDRAIL spans represent safety and policy checks (content moderation, PII detection, toxicity scoring).

## Required Attributes

| Attribute                 | Type   | Description         | Required |
| ------------------------- | ------ | ------------------- | -------- |
| `openinference.span.kind` | String | Must be "GUARDRAIL" | Yes      |

## Common Attributes

| Attribute                 | Type   | Description                                |
| ------------------------- | ------ | ------------------------------------------ |
| `input.value`             | String | Content being checked                      |
| `output.value`            | String | Guardrail result (allowed/blocked/flagged) |
| `metadata.guardrail_type` | String | Type of check (toxicity, pii, bias)        |
| `metadata.score`          | Float  | Safety score (0-1)                         |
| `metadata.threshold`      | Float  | Threshold for blocking                     |

## Example: Content Moderation

```json
{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "User message: I want to build a bomb",
  "output.value": "BLOCKED",
  "metadata.guardrail_type": "content_moderation",
  "metadata.score": 0.95,
  "metadata.threshold": 0.7,
  "metadata.categories": "[\"violence\", \"weapons\"]",
  "metadata.action": "block_and_log"
}
```

## Example: PII Detection

```json
{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "My SSN is 123-45-6789",
  "output.value": "FLAGGED",
  "metadata.guardrail_type": "pii_detection",
  "metadata.detected_pii": "[\"ssn\"]",
  "metadata.redacted_output": "My SSN is [REDACTED]"
}
```
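
A guardrail check maps naturally onto a span that records the score, threshold, and decision. A minimal Python sketch, assuming a registered tracer provider and with `moderation_score` as a hypothetical moderation call:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

THRESHOLD = 0.7

def moderation_score(text: str) -> float:
    # Hypothetical placeholder for a moderation model or API call
    return 0.95

user_message = "User message: I want to build a bomb"

with tracer.start_as_current_span("guardrail.content_moderation") as span:
    span.set_attribute("openinference.span.kind", "GUARDRAIL")
    span.set_attribute("input.value", user_message)
    score = moderation_score(user_message)
    decision = "BLOCKED" if score >= THRESHOLD else "ALLOWED"
    span.set_attribute("output.value", decision)
    span.set_attribute("metadata.guardrail_type", "content_moderation")
    span.set_attribute("metadata.score", score)
    span.set_attribute("metadata.threshold", THRESHOLD)
```
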
79
skills/phoenix-tracing/references/span-llm.md
Normal file
@@ -0,0 +1,79 @@
# LLM Spans

LLM spans represent calls to language models (OpenAI, Anthropic, local models, etc.).

## Required Attributes

| Attribute                 | Type   | Description                                                    |
| ------------------------- | ------ | -------------------------------------------------------------- |
| `openinference.span.kind` | String | Must be "LLM"                                                  |
| `llm.model_name`          | String | Model identifier (e.g., "gpt-4", "claude-3-5-sonnet-20241022") |

## Key Attributes

| Category       | Attributes                                                                       | Example                                    |
| -------------- | -------------------------------------------------------------------------------- | ------------------------------------------ |
| **Model**      | `llm.model_name`, `llm.provider`                                                 | "gpt-4-turbo", "openai"                     |
| **Tokens**     | `llm.token_count.prompt`, `llm.token_count.completion`, `llm.token_count.total`  | 25, 8, 33                                   |
| **Cost**       | `llm.cost.prompt`, `llm.cost.completion`, `llm.cost.total`                       | 0.0021, 0.0045, 0.0066                      |
| **Parameters** | `llm.invocation_parameters` (JSON)                                               | `{"temperature": 0.7, "max_tokens": 1024}`  |
| **Messages**   | `llm.input_messages.{i}.*`, `llm.output_messages.{i}.*`                          | See examples below                          |
| **Tools**      | `llm.tools.{i}.tool.json_schema`                                                 | Function definitions                        |

## Cost Tracking

**Core attributes:**

- `llm.cost.prompt` - Total input cost (USD)
- `llm.cost.completion` - Total output cost (USD)
- `llm.cost.total` - Total cost (USD)

**Detailed cost breakdown:**

- `llm.cost.prompt_details.{input,cache_read,cache_write,audio}` - Input cost components
- `llm.cost.completion_details.{output,reasoning,audio}` - Output cost components

## Messages

**Input messages:**

- `llm.input_messages.{i}.message.role` - "user", "assistant", "system", "tool"
- `llm.input_messages.{i}.message.content` - Text content
- `llm.input_messages.{i}.message.contents.{j}` - Multimodal (text + images)
- `llm.input_messages.{i}.message.tool_calls` - Tool invocations

**Output messages:** Same structure as input messages.

## Example: Basic LLM Call

```json
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1024}",
  "llm.input_messages.0.message.role": "system",
  "llm.input_messages.0.message.content": "You are a helpful assistant.",
  "llm.input_messages.1.message.role": "user",
  "llm.input_messages.1.message.content": "What is the capital of France?",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "The capital of France is Paris.",
  "llm.token_count.prompt": 25,
  "llm.token_count.completion": 8,
  "llm.token_count.total": 33
}
```
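
When auto-instrumentation is unavailable, the flattened message attributes above can be set by hand. A minimal Python sketch, assuming a registered tracer provider and with `call_model` as a hypothetical stand-in for your LLM client call:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def call_model(messages: list[dict]) -> str:
    # Hypothetical placeholder for the actual LLM client call
    return "The capital of France is Paris."

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

with tracer.start_as_current_span("llm.chat") as span:
    span.set_attribute("openinference.span.kind", "LLM")
    span.set_attribute("llm.model_name", "gpt-4")
    for i, msg in enumerate(messages):
        span.set_attribute(f"llm.input_messages.{i}.message.role", msg["role"])
        span.set_attribute(f"llm.input_messages.{i}.message.content", msg["content"])
    reply = call_model(messages)
    span.set_attribute("llm.output_messages.0.message.role", "assistant")
    span.set_attribute("llm.output_messages.0.message.content", reply)
```
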
## Example: LLM with Tool Calls

```json
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "gpt-4-turbo",
  "llm.input_messages.0.message.content": "What's the weather in SF?",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_weather",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": "{\"location\": \"San Francisco\"}",
  "llm.tools.0.tool.json_schema": "{\"type\": \"function\", \"function\": {\"name\": \"get_weather\"}}"
}
```

## See Also

- **Instrumentation:** `instrumentation-auto-python.md`, `instrumentation-manual-python.md`
- **Full spec:** https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md
86
skills/phoenix-tracing/references/span-reranker.md
Normal file
@@ -0,0 +1,86 @@
# RERANKER Spans

## Purpose

RERANKER spans represent reordering of retrieved documents (Cohere Rerank, cross-encoder models).

## Required Attributes

| Attribute                 | Type   | Description        | Required |
| ------------------------- | ------ | ------------------ | -------- |
| `openinference.span.kind` | String | Must be "RERANKER" | Yes      |

## Attribute Reference

### Reranker Parameters

| Attribute             | Type    | Description                   |
| --------------------- | ------- | ----------------------------- |
| `reranker.model_name` | String  | Reranker model identifier     |
| `reranker.query`      | String  | Query used for reranking      |
| `reranker.top_k`      | Integer | Number of documents to return |

### Input Documents

| Attribute Pattern                                | Type          | Description              |
| ------------------------------------------------ | ------------- | ------------------------ |
| `reranker.input_documents.{i}.document.id`       | String        | Input document ID        |
| `reranker.input_documents.{i}.document.content`  | String        | Input document content   |
| `reranker.input_documents.{i}.document.score`    | Float         | Original retrieval score |
| `reranker.input_documents.{i}.document.metadata` | String (JSON) | Document metadata        |

### Output Documents

| Attribute Pattern                                 | Type          | Description                    |
| -------------------------------------------------- | ------------- | ------------------------------ |
| `reranker.output_documents.{i}.document.id`        | String        | Output document ID (reordered) |
| `reranker.output_documents.{i}.document.content`   | String        | Output document content        |
| `reranker.output_documents.{i}.document.score`     | Float         | New reranker score             |
| `reranker.output_documents.{i}.document.metadata`  | String (JSON) | Document metadata              |

### Score Comparison

Input scores (from the retriever) vs. output scores (from the reranker):

```json
{
  "reranker.input_documents.0.document.id": "doc_A",
  "reranker.input_documents.0.document.score": 0.7,
  "reranker.input_documents.1.document.id": "doc_B",
  "reranker.input_documents.1.document.score": 0.9,
  "reranker.output_documents.0.document.id": "doc_B",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_A",
  "reranker.output_documents.1.document.score": 0.85
}
```

In this example:

- Input: doc_B (0.9) ranked higher than doc_A (0.7)
- Output: doc_B is still highest, but both scores increased
- The reranker confirmed the retriever's ordering but refined the scores

## Examples

### Complete Reranking Example

```json
{
  "openinference.span.kind": "RERANKER",
  "reranker.model_name": "cohere-rerank-v2",
  "reranker.query": "What is machine learning?",
  "reranker.top_k": 2,
  "reranker.input_documents.0.document.id": "doc_123",
  "reranker.input_documents.0.document.content": "Machine learning is a subset...",
  "reranker.input_documents.1.document.id": "doc_456",
  "reranker.input_documents.1.document.content": "Supervised learning algorithms...",
  "reranker.input_documents.2.document.id": "doc_789",
  "reranker.input_documents.2.document.content": "Neural networks are...",
  "reranker.output_documents.0.document.id": "doc_456",
  "reranker.output_documents.0.document.content": "Supervised learning algorithms...",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_123",
  "reranker.output_documents.1.document.content": "Machine learning is a subset...",
  "reranker.output_documents.1.document.score": 0.88
}
```
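
Recording both orderings is again an indexing exercise. A minimal Python sketch, assuming a registered tracer provider and with `rerank` as a hypothetical reranker call over `(id, content, score)` tuples; a real implementation would call a reranking model rather than sort:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def rerank(query: str, docs: list[tuple[str, str, float]]) -> list[tuple[str, str, float]]:
    # Hypothetical placeholder; returns documents reordered with new scores
    return sorted(docs, key=lambda d: d[2], reverse=True)

query = "What is machine learning?"
candidates = [("doc_A", "Machine learning is...", 0.7),
              ("doc_B", "Supervised learning...", 0.9)]

with tracer.start_as_current_span("rerank") as span:
    span.set_attribute("openinference.span.kind", "RERANKER")
    span.set_attribute("reranker.query", query)
    for i, (doc_id, content, score) in enumerate(candidates):
        span.set_attribute(f"reranker.input_documents.{i}.document.id", doc_id)
        span.set_attribute(f"reranker.input_documents.{i}.document.score", score)
    for i, (doc_id, content, score) in enumerate(rerank(query, candidates)):
        span.set_attribute(f"reranker.output_documents.{i}.document.id", doc_id)
        span.set_attribute(f"reranker.output_documents.{i}.document.score", score)
```
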
110
skills/phoenix-tracing/references/span-retriever.md
Normal file
@@ -0,0 +1,110 @@
# RETRIEVER Spans

## Purpose

RETRIEVER spans represent document/context retrieval operations (vector DB queries, semantic search, keyword search).

## Required Attributes

| Attribute                 | Type   | Description         | Required |
| ------------------------- | ------ | ------------------- | -------- |
| `openinference.span.kind` | String | Must be "RETRIEVER" | Yes      |

## Attribute Reference

### Query

| Attribute     | Type   | Description       |
| ------------- | ------ | ----------------- |
| `input.value` | String | Search query text |

### Document Schema

| Attribute Pattern                           | Type          | Description                       |
| -------------------------------------------- | ------------- | --------------------------------- |
| `retrieval.documents.{i}.document.id`        | String        | Unique document identifier        |
| `retrieval.documents.{i}.document.content`   | String        | Document text content             |
| `retrieval.documents.{i}.document.score`     | Float         | Relevance score (0-1 or distance) |
| `retrieval.documents.{i}.document.metadata`  | String (JSON) | Document metadata                 |

### Flattening Pattern for Documents

Documents are flattened using zero-indexed notation:

```
retrieval.documents.0.document.id
retrieval.documents.0.document.content
retrieval.documents.0.document.score
retrieval.documents.1.document.id
retrieval.documents.1.document.content
retrieval.documents.1.document.score
...
```

### Document Metadata

Common metadata fields (stored as a JSON string):

```json
{
  "source": "knowledge_base.pdf",
  "page": 42,
  "section": "Introduction",
  "author": "Jane Doe",
  "created_at": "2024-01-15",
  "url": "https://example.com/doc",
  "chunk_id": "chunk_123"
}
```

**Example with metadata:**

```json
{
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a method of data analysis...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"ml_textbook.pdf\", \"page\": 15, \"chapter\": \"Introduction\"}"
}
```

### Ordering

Documents are ordered by index (0, 1, 2, ...). Typically:

- Index 0 = highest-scoring document
- Index 1 = second highest
- etc.

Preserve retrieval order in your flattened attributes.

### Large Document Handling

For very long documents:

- Consider truncating `document.content` to the first N characters
- Store full content in a separate document store
- Use `document.id` to reference the full content

## Examples

### Basic Vector Search

```json
{
  "openinference.span.kind": "RETRIEVER",
  "input.value": "What is machine learning?",
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a subset of artificial intelligence...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"textbook.pdf\", \"page\": 42}",
  "retrieval.documents.1.document.id": "doc_456",
  "retrieval.documents.1.document.content": "Machine learning algorithms learn patterns from data...",
  "retrieval.documents.1.document.score": 0.87,
  "retrieval.documents.1.document.metadata": "{\"source\": \"article.html\", \"author\": \"Jane Doe\"}",
  "retrieval.documents.2.document.id": "doc_789",
  "retrieval.documents.2.document.content": "Supervised learning is a type of machine learning...",
  "retrieval.documents.2.document.score": 0.81,
  "retrieval.documents.2.document.metadata": "{\"source\": \"wiki.org\"}",
  "metadata.retriever_type": "vector_search",
  "metadata.vector_db": "pinecone",
  "metadata.top_k": 3
}
```
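
The flattening loop for retrieved documents follows the pattern above. A minimal Python sketch, assuming a registered tracer provider and with `vector_search` as a hypothetical stand-in for your vector store query:

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def vector_search(query: str, top_k: int) -> list[dict]:
    # Hypothetical placeholder for a vector DB query
    return [{"id": "doc_123", "content": "Machine learning is...",
             "score": 0.92, "metadata": {"source": "textbook.pdf"}}]

query = "What is machine learning?"

with tracer.start_as_current_span("retrieve") as span:
    span.set_attribute("openinference.span.kind", "RETRIEVER")
    span.set_attribute("input.value", query)
    for i, doc in enumerate(vector_search(query, top_k=3)):
        span.set_attribute(f"retrieval.documents.{i}.document.id", doc["id"])
        span.set_attribute(f"retrieval.documents.{i}.document.content", doc["content"])
        span.set_attribute(f"retrieval.documents.{i}.document.score", doc["score"])
        # Metadata is stored as a JSON string, per the schema above
        span.set_attribute(f"retrieval.documents.{i}.document.metadata", json.dumps(doc["metadata"]))
```
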
67
skills/phoenix-tracing/references/span-tool.md
Normal file
@@ -0,0 +1,67 @@
# TOOL Spans

## Purpose

TOOL spans represent external tool or function invocations (API calls, database queries, calculators, custom functions).

## Required Attributes

| Attribute                 | Type   | Description        | Required    |
| ------------------------- | ------ | ------------------ | ----------- |
| `openinference.span.kind` | String | Must be "TOOL"     | Yes         |
| `tool.name`               | String | Tool/function name | Recommended |

## Attribute Reference

### Tool Execution Attributes

| Attribute          | Type          | Description                                     |
| ------------------ | ------------- | ----------------------------------------------- |
| `tool.name`        | String        | Tool/function name                              |
| `tool.description` | String        | Tool purpose/description                        |
| `tool.parameters`  | String (JSON) | JSON schema defining the tool's parameters      |
| `input.value`      | String (JSON) | Actual input values passed to the tool          |
| `output.value`     | String        | Tool output/result                              |
| `output.mime_type` | String        | Result content type (e.g., "application/json")  |

## Examples

### API Call Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "get_weather",
  "tool.description": "Fetches current weather for a location",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"location\": {\"type\": \"string\"}, \"units\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]}}, \"required\": [\"location\"]}",
  "input.value": "{\"location\": \"San Francisco\", \"units\": \"celsius\"}",
  "output.value": "{\"temperature\": 18, \"conditions\": \"partly cloudy\"}"
}
```

### Calculator Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "calculator",
  "tool.description": "Performs mathematical calculations",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"expression\": {\"type\": \"string\", \"description\": \"Math expression to evaluate\"}}, \"required\": [\"expression\"]}",
  "input.value": "{\"expression\": \"2 + 2\"}",
  "output.value": "4"
}
```

### Database Query Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "sql_query",
  "tool.description": "Executes SQL query on user database",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"query\": {\"type\": \"string\", \"description\": \"SQL query to execute\"}}, \"required\": [\"query\"]}",
  "input.value": "{\"query\": \"SELECT * FROM users WHERE id = 123\"}",
  "output.value": "[{\"id\": 123, \"name\": \"Alice\", \"email\": \"alice@example.com\"}]",
  "output.mime_type": "application/json"
}
```
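
Wrapping a tool invocation in a span captures both the declared inputs and the actual result. A minimal Python sketch, assuming a registered tracer provider and with `get_weather` as a hypothetical tool function:

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def get_weather(location: str, units: str = "celsius") -> dict:
    # Hypothetical placeholder for a real weather API call
    return {"temperature": 18, "conditions": "partly cloudy"}

args = {"location": "San Francisco", "units": "celsius"}

with tracer.start_as_current_span("tool.get_weather") as span:
    span.set_attribute("openinference.span.kind", "TOOL")
    span.set_attribute("tool.name", "get_weather")
    span.set_attribute("tool.description", "Fetches current weather for a location")
    span.set_attribute("input.value", json.dumps(args))
    result = get_weather(**args)
    span.set_attribute("output.value", json.dumps(result))
    span.set_attribute("output.mime_type", "application/json")
```
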