mirror of https://github.com/github/awesome-copilot.git
synced 2026-04-13 11:45:56 +00:00

chore: publish from staged

This commit is contained in:

24   plugins/phoenix/skills/phoenix-tracing/README.md   (Normal file)
@@ -0,0 +1,24 @@
# Phoenix Tracing Skill

OpenInference semantic conventions and instrumentation guides for Phoenix.

## Usage

Start with `SKILL.md` for the index and quick reference.

## File Organization

All files live in a flat `rules/` directory with semantic prefixes:

- `span-*` - Span kinds (LLM, CHAIN, TOOL, etc.)
- `setup-*`, `instrumentation-*` - Getting started guides
- `fundamentals-*`, `attributes-*` - Reference docs
- `annotations-*`, `export-*` - Advanced features

## Reference

- [OpenInference Spec](https://github.com/Arize-ai/openinference/tree/main/spec)
- [Phoenix Documentation](https://docs.arize.com/phoenix)
- [Python OTEL API](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/)
- [Python Client API](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)
- [TypeScript API](https://arize-ai.github.io/phoenix/)

139   plugins/phoenix/skills/phoenix-tracing/SKILL.md   (Normal file)
@@ -0,0 +1,139 @@
---
name: phoenix-tracing
description: OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production.
license: Apache-2.0
compatibility: Requires Phoenix server. Python skills need arize-phoenix-otel; TypeScript skills need @arizeai/phoenix-otel.
metadata:
  author: oss@arize.com
  version: "1.0.0"
  languages: "Python, TypeScript"
---

# Phoenix Tracing

Comprehensive guide for instrumenting LLM applications with OpenInference tracing in Phoenix. Contains reference files covering setup, instrumentation, span types, and production deployment.

## When to Apply

Reference these guidelines when:

- Setting up Phoenix tracing (Python or TypeScript)
- Creating custom spans for LLM operations
- Adding attributes following OpenInference conventions
- Deploying tracing to production
- Querying and analyzing trace data

## Reference Categories

| Priority | Category        | Description                    | Prefix                     |
| -------- | --------------- | ------------------------------ | -------------------------- |
| 1        | Setup           | Installation and configuration | `setup-*`                  |
| 2        | Instrumentation | Auto and manual tracing        | `instrumentation-*`        |
| 3        | Span Types      | 9 span kinds with attributes   | `span-*`                   |
| 4        | Organization    | Projects and sessions          | `projects-*`, `sessions-*` |
| 5        | Enrichment      | Custom metadata                | `metadata-*`               |
| 6        | Production      | Batch processing, masking      | `production-*`             |
| 7        | Feedback        | Annotations and evaluation     | `annotations-*`            |

## Quick Reference

### 1. Setup (START HERE)

- [setup-python](references/setup-python.md) - Install arize-phoenix-otel, configure endpoint
- [setup-typescript](references/setup-typescript.md) - Install @arizeai/phoenix-otel, configure endpoint

### 2. Instrumentation

- [instrumentation-auto-python](references/instrumentation-auto-python.md) - Auto-instrument OpenAI, LangChain, etc.
- [instrumentation-auto-typescript](references/instrumentation-auto-typescript.md) - Auto-instrument supported frameworks
- [instrumentation-manual-python](references/instrumentation-manual-python.md) - Custom spans with decorators
- [instrumentation-manual-typescript](references/instrumentation-manual-typescript.md) - Custom spans with wrappers

### 3. Span Types (with full attribute schemas)

- [span-llm](references/span-llm.md) - LLM API calls (model, tokens, messages, cost)
- [span-chain](references/span-chain.md) - Multi-step workflows and pipelines
- [span-retriever](references/span-retriever.md) - Document retrieval (documents, scores)
- [span-tool](references/span-tool.md) - Function/API calls (name, parameters)
- [span-agent](references/span-agent.md) - Multi-step reasoning agents
- [span-embedding](references/span-embedding.md) - Vector generation
- [span-reranker](references/span-reranker.md) - Document re-ranking
- [span-guardrail](references/span-guardrail.md) - Safety checks
- [span-evaluator](references/span-evaluator.md) - LLM evaluation

### 4. Organization

- [projects-python](references/projects-python.md) / [projects-typescript](references/projects-typescript.md) - Group traces by application
- [sessions-python](references/sessions-python.md) / [sessions-typescript](references/sessions-typescript.md) - Track conversations

### 5. Enrichment

- [metadata-python](references/metadata-python.md) / [metadata-typescript](references/metadata-typescript.md) - Custom attributes

### 6. Production (CRITICAL)

- [production-python](references/production-python.md) / [production-typescript](references/production-typescript.md) - Batch processing, PII masking

### 7. Feedback

- [annotations-overview](references/annotations-overview.md) - Feedback concepts
- [annotations-python](references/annotations-python.md) / [annotations-typescript](references/annotations-typescript.md) - Add feedback to spans

### Reference Files

- [fundamentals-overview](references/fundamentals-overview.md) - Traces, spans, attributes basics
- [fundamentals-required-attributes](references/fundamentals-required-attributes.md) - Required fields per span type
- [fundamentals-universal-attributes](references/fundamentals-universal-attributes.md) - Common attributes (user.id, session.id)
- [fundamentals-flattening](references/fundamentals-flattening.md) - JSON flattening rules
- [attributes-messages](references/attributes-messages.md) - Chat message format
- [attributes-metadata](references/attributes-metadata.md) - Custom metadata schema
- [attributes-graph](references/attributes-graph.md) - Agent workflow attributes
- [attributes-exceptions](references/attributes-exceptions.md) - Error tracking

## Common Workflows

- **Quick Start**: setup-{lang} → instrumentation-auto-{lang} → Check Phoenix
- **Custom Spans**: setup-{lang} → instrumentation-manual-{lang} → span-{type}
- **Session Tracking**: sessions-{lang} for conversation grouping patterns
- **Production**: production-{lang} for batching, masking, and deployment

## How to Use This Skill

**Navigation Patterns:**

```bash
# By category prefix
references/setup-*            # Installation and configuration
references/instrumentation-*  # Auto and manual tracing
references/span-*             # Span type specifications
references/sessions-*         # Session tracking
references/production-*       # Production deployment
references/fundamentals-*     # Core concepts
references/attributes-*       # Attribute specifications

# By language
references/*-python.md        # Python implementations
references/*-typescript.md    # TypeScript implementations
```

**Reading Order:**

1. Start with setup-{lang} for your language
2. Choose instrumentation-auto-{lang} OR instrumentation-manual-{lang}
3. Reference span-{type} files as needed for specific operations
4. See fundamentals-* files for attribute specifications

## References

**Phoenix Documentation:**

- [Phoenix Documentation](https://docs.arize.com/phoenix)
- [OpenInference Spec](https://github.com/Arize-ai/openinference/tree/main/spec)

**Python API Documentation:**

- [Python OTEL Package](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/) - `arize-phoenix-otel` API reference
- [Python Client Package](https://arize-phoenix.readthedocs.io/projects/client/en/latest/) - `arize-phoenix-client` API reference

**TypeScript API Documentation:**

- [TypeScript Packages](https://arize-ai.github.io/phoenix/) - `@arizeai/phoenix-otel`, `@arizeai/phoenix-client`, and other TypeScript packages

@@ -0,0 +1,69 @@
# Annotations Overview

Annotations allow you to add human or automated feedback to traces, spans, documents, and sessions. Annotations are essential for evaluation, quality assessment, and building training datasets.

## Annotation Types

Phoenix supports four types of annotations:

| Type                    | Target                           | Purpose                                  | Example Use Case                 |
| ----------------------- | -------------------------------- | ---------------------------------------- | -------------------------------- |
| **Span Annotation**     | Individual span                  | Feedback on a specific operation         | "This LLM response was accurate" |
| **Document Annotation** | Document within a RETRIEVER span | Feedback on retrieved document relevance | "This document was not helpful"  |
| **Trace Annotation**    | Entire trace                     | Feedback on end-to-end interaction       | "User was satisfied with result" |
| **Session Annotation**  | User session                     | Feedback on multi-turn conversation      | "Session ended successfully"     |

## Annotation Fields

Every annotation has these fields:

### Required Fields

| Field     | Type   | Description                                                                   |
| --------- | ------ | ----------------------------------------------------------------------------- |
| Entity ID | String | ID of the target entity (span_id, trace_id, session_id, or document_position) |
| `name`    | String | Annotation name/label (e.g., "quality", "relevance", "helpfulness")           |

### Result Fields (At Least One Required)

| Field         | Type              | Description                                                       |
| ------------- | ----------------- | ----------------------------------------------------------------- |
| `label`       | String (optional) | Categorical value (e.g., "good", "bad", "relevant", "irrelevant") |
| `score`       | Float (optional)  | Numeric value (typically 0-1, but can be any range)               |
| `explanation` | String (optional) | Free-text explanation of the annotation                           |

**At least one** of `label`, `score`, or `explanation` must be provided.

### Optional Fields

| Field            | Type   | Description                                                                             |
| ---------------- | ------ | --------------------------------------------------------------------------------------- |
| `annotator_kind` | String | Who created this annotation: "HUMAN", "LLM", or "CODE" (default: "HUMAN")               |
| `identifier`     | String | Unique identifier for upsert behavior (updates existing if same name+entity+identifier) |
| `metadata`       | Object | Custom metadata as key-value pairs                                                      |

## Annotator Kinds

| Kind    | Description                    | Example                           |
| ------- | ------------------------------ | --------------------------------- |
| `HUMAN` | Manual feedback from a person  | User ratings, expert labels       |
| `LLM`   | Automated feedback from an LLM | GPT-4 evaluating response quality |
| `CODE`  | Automated feedback from code   | Rule-based checks, heuristics     |

## Examples

**Quality Assessment:**

- `quality` - Overall quality (label: good/fair/poor, score: 0-1)
- `correctness` - Factual accuracy (label: correct/incorrect, score: 0-1)
- `helpfulness` - User satisfaction (label: helpful/not_helpful, score: 0-1)

**RAG-Specific:**

- `relevance` - Document relevance to query (label: relevant/irrelevant, score: 0-1)
- `faithfulness` - Answer grounded in context (label: faithful/unfaithful, score: 0-1)

**Safety:**

- `toxicity` - Contains harmful content (score: 0-1)
- `pii_detected` - Contains personally identifiable information (label: yes/no)

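The field rules above (a required `name` plus at least one of `label`, `score`, or `explanation`) can be sketched as a small validator. This is illustrative only; `validate_annotation` is a hypothetical helper, not part of the Phoenix API:

```python
def validate_annotation(name: str, label=None, score=None, explanation=None) -> dict:
    """Apply the annotation field rules described above."""
    if not name:
        raise ValueError("annotation `name` is required")
    if label is None and score is None and explanation is None:
        raise ValueError("at least one of label, score, explanation is required")
    result = {"name": name}
    if label is not None:
        result["label"] = label
    if score is not None:
        result["score"] = float(score)  # numeric result, typically 0-1
    if explanation is not None:
        result["explanation"] = explanation
    return result
```

For example, `validate_annotation("quality", label="good", score=0.9)` passes, while `validate_annotation("quality")` raises because no result field was given.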
@@ -0,0 +1,114 @@
# Python SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the Python client.

## Client Setup

```python
from phoenix.client import Client

client = Client()  # Default: http://localhost:6006
```

## Span Annotations

Add feedback to individual spans:

```python
client.spans.add_span_annotation(
    span_id="abc123",
    annotation_name="quality",
    annotator_kind="HUMAN",
    label="high_quality",
    score=0.95,
    explanation="Accurate and well-formatted",
    metadata={"reviewer": "alice"},
    sync=True,
)
```

## Document Annotations

Rate individual documents in RETRIEVER spans:

```python
client.spans.add_document_annotation(
    span_id="retriever_span",
    document_position=0,  # 0-based index
    annotation_name="relevance",
    annotator_kind="LLM",
    label="relevant",
    score=0.95,
)
```

## Trace Annotations

Feedback on entire traces:

```python
client.traces.add_trace_annotation(
    trace_id="trace_abc",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0,
)
```

## Session Annotations

Feedback on multi-turn conversations:

```python
client.sessions.add_session_annotation(
    session_id="session_xyz",
    annotation_name="user_satisfaction",
    annotator_kind="HUMAN",
    label="satisfied",
    score=0.85,
)
```

## RAG Pipeline Example

```python
from phoenix.client import Client
from phoenix.client.resources.spans import SpanDocumentAnnotationData

client = Client()

# Document relevance (batch)
client.spans.log_document_annotations(
    document_annotations=[
        SpanDocumentAnnotationData(
            name="relevance", span_id="retriever_span", document_position=i,
            annotator_kind="LLM", result={"label": label, "score": score}
        )
        for i, (label, score) in enumerate([
            ("relevant", 0.95), ("relevant", 0.80), ("irrelevant", 0.10)
        ])
    ]
)

# LLM response quality
client.spans.add_span_annotation(
    span_id="llm_span",
    annotation_name="faithfulness",
    annotator_kind="LLM",
    label="faithful",
    score=0.90,
)

# Overall trace quality
client.traces.add_trace_annotation(
    trace_id="trace_123",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0,
)
```

## API Reference

- [Python Client API](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)

@@ -0,0 +1,137 @@
# TypeScript SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the TypeScript client.

## Client Setup

```typescript
import { createClient } from "phoenix-client";

const client = createClient(); // Default: http://localhost:6006
```

## Span Annotations

Add feedback to individual spans:

```typescript
import { addSpanAnnotation } from "phoenix-client";

await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "abc123",
    name: "quality",
    annotatorKind: "HUMAN",
    label: "high_quality",
    score: 0.95,
    explanation: "Accurate and well-formatted",
    metadata: { reviewer: "alice" }
  },
  sync: true
});
```

## Document Annotations

Rate individual documents in RETRIEVER spans:

```typescript
import { addDocumentAnnotation } from "phoenix-client";

await addDocumentAnnotation({
  client,
  documentAnnotation: {
    spanId: "retriever_span",
    documentPosition: 0, // 0-based index
    name: "relevance",
    annotatorKind: "LLM",
    label: "relevant",
    score: 0.95
  }
});
```

## Trace Annotations

Feedback on entire traces:

```typescript
import { addTraceAnnotation } from "phoenix-client";

await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_abc",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});
```

## Session Annotations

Feedback on multi-turn conversations:

```typescript
import { addSessionAnnotation } from "phoenix-client";

await addSessionAnnotation({
  client,
  sessionAnnotation: {
    sessionId: "session_xyz",
    name: "user_satisfaction",
    annotatorKind: "HUMAN",
    label: "satisfied",
    score: 0.85
  }
});
```

## RAG Pipeline Example

```typescript
import { createClient, logDocumentAnnotations, addSpanAnnotation, addTraceAnnotation } from "phoenix-client";

const client = createClient();

// Document relevance (batch)
await logDocumentAnnotations({
  client,
  documentAnnotations: [
    { spanId: "retriever_span", documentPosition: 0, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.95 },
    { spanId: "retriever_span", documentPosition: 1, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.80 }
  ]
});

// LLM response quality
await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "llm_span",
    name: "faithfulness",
    annotatorKind: "LLM",
    label: "faithful",
    score: 0.90
  }
});

// Overall trace quality
await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_123",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});
```

## API Reference

- [TypeScript Client API](https://arize-ai.github.io/phoenix/)

@@ -0,0 +1,58 @@
# Flattening Convention

OpenInference flattens nested data structures into dot-notation attributes for database compatibility, OpenTelemetry compatibility, and simple querying.

## Flattening Rules

**Objects → Dot Notation**

```javascript
{ llm: { model_name: "gpt-4", token_count: { prompt: 10, completion: 20 } } }
// becomes
{ "llm.model_name": "gpt-4", "llm.token_count.prompt": 10, "llm.token_count.completion": 20 }
```

**Arrays → Zero-Indexed Notation**

```javascript
{ llm: { input_messages: [{ role: "user", content: "Hi" }] } }
// becomes
{ "llm.input_messages.0.message.role": "user", "llm.input_messages.0.message.content": "Hi" }
```

**Message Convention: `.message.` segment required**

```
llm.input_messages.{index}.message.{field}
llm.input_messages.0.message.tool_calls.0.tool_call.function.name
```

## Complete Example

```javascript
// Original
{
  openinference: { span: { kind: "LLM" } },
  llm: {
    model_name: "claude-3-5-sonnet-20241022",
    invocation_parameters: { temperature: 0.7, max_tokens: 1000 },
    input_messages: [{ role: "user", content: "Tell me a joke" }],
    output_messages: [{ role: "assistant", content: "Why did the chicken cross the road?" }],
    token_count: { prompt: 5, completion: 10, total: 15 }
  }
}

// Flattened (stored in Phoenix spans.attributes JSONB)
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1000}",
  "llm.input_messages.0.message.role": "user",
  "llm.input_messages.0.message.content": "Tell me a joke",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "Why did the chicken cross the road?",
  "llm.token_count.prompt": 5,
  "llm.token_count.completion": 10,
  "llm.token_count.total": 15
}
```

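The object and array rules above can be sketched in a few lines of Python. This is illustrative only — it does not insert the type segments such as `.message.` that the real convention requires for message arrays, nor does it JSON-serialize `invocation_parameters`:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into dot-notation keys, per the rules above."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            # objects become dot-separated path segments
            out.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            # arrays become zero-indexed path segments
            out.update(flatten(item, f"{prefix}.{i}"))
    else:
        out[prefix] = obj
    return out
```

For example, `flatten({"llm": {"token_count": {"prompt": 10}}})` yields `{"llm.token_count.prompt": 10}`.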
@@ -0,0 +1,53 @@
# Overview and Traces & Spans

This document covers the fundamental concepts of OpenInference traces and spans in Phoenix.

## Overview

OpenInference is a set of semantic conventions for AI and LLM applications based on OpenTelemetry. Phoenix uses these conventions to capture, store, and analyze traces from AI applications.

**Key Concepts:**

- **Traces** represent end-to-end requests through your application
- **Spans** represent individual operations within a trace (LLM calls, retrievals, tool invocations)
- **Attributes** are key-value pairs attached to spans using flattened, dot-notation paths
- **Span Kinds** categorize the type of operation (LLM, RETRIEVER, TOOL, etc.)

## Traces and Spans

### Trace Hierarchy

A **trace** is a tree of **spans** representing a complete request:

```
Trace ID: abc123
├─ Span 1: CHAIN (root span, parent_id = null)
│  ├─ Span 2: RETRIEVER (parent_id = span_1_id)
│  │  └─ Span 3: EMBEDDING (parent_id = span_2_id)
│  └─ Span 4: LLM (parent_id = span_1_id)
│     └─ Span 5: TOOL (parent_id = span_4_id)
```

### Context Propagation

Spans maintain parent-child relationships via:

- `trace_id` - Same for all spans in a trace
- `span_id` - Unique identifier for this span
- `parent_id` - References the parent span's `span_id` (null for root spans)

Phoenix uses these relationships to:

- Build the span tree visualization in the UI
- Calculate cumulative metrics (tokens, errors) up the tree
- Enable nested querying (e.g., "find CHAIN spans containing LLM spans with errors")

### Span Lifecycle

Each span has:

- `start_time` - When the operation began (Unix timestamp in nanoseconds)
- `end_time` - When the operation completed
- `status_code` - OK, ERROR, or UNSET
- `status_message` - Optional error message
- `attributes` - Object with all semantic convention attributes

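The `span_id`/`parent_id` fields are enough to rebuild the tree shown above. A minimal sketch, assuming hypothetical span records shaped as plain dicts (this is not a Phoenix API):

```python
from collections import defaultdict

def build_span_tree(spans):
    """Index spans by parent_id; roots are the spans with no parent."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        if span["parent_id"] is None:
            roots.append(span)          # root span of the trace
        else:
            children[span["parent_id"]].append(span)
    return roots, children
```

This is essentially what a trace viewer does before rendering: group by `parent_id`, then walk down from each root.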
@@ -0,0 +1,64 @@
# Required and Recommended Attributes

This document covers the single required attribute and the highly recommended attributes for all OpenInference spans.

## Required Attribute

**Every span MUST have exactly one required attribute:**

```json
{
  "openinference.span.kind": "LLM"
}
```

## Highly Recommended Attributes

While not strictly required, these attributes are **highly recommended** on all spans as they:

- Enable evaluation and quality assessment
- Help understand information flow through your application
- Make traces more useful for debugging

### Input/Output Values

| Attribute | Type | Description |
| --------- | ---- | ----------- |
| `input.value` | String | Input to the operation (prompt, query, document) |
| `output.value` | String | Output from the operation (response, result, answer) |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "input.value": "What is the capital of France?",
  "output.value": "The capital of France is Paris."
}
```

**Why these matter:**

- **Evaluations**: Many evaluators (faithfulness, relevance, hallucination detection) require both input and output to assess quality
- **Information flow**: Seeing inputs/outputs makes it easy to trace how data transforms through your application
- **Debugging**: When something goes wrong, having the actual input/output makes root cause analysis much faster
- **Analytics**: Enables pattern analysis across similar inputs or outputs

**Phoenix Behavior:**

- Input/output displayed prominently in span details
- Evaluators can automatically access these values
- Search/filter traces by input or output content
- Export inputs/outputs for fine-tuning datasets

## Valid Span Kinds

There are exactly **9 valid span kinds** in OpenInference:

| Span Kind | Purpose | Common Use Case |
| --------- | ------- | --------------- |
| `LLM` | Language model inference | OpenAI, Anthropic, local LLM calls |
| `EMBEDDING` | Vector generation | Text-to-vector conversion |
| `CHAIN` | Application flow orchestration | LangChain chains, custom workflows |
| `RETRIEVER` | Document/context retrieval | Vector DB queries, semantic search |
| `RERANKER` | Result reordering | Rerank retrieved documents |
| `TOOL` | External tool invocation | API calls, function execution |
| `AGENT` | Autonomous reasoning | ReAct agents, planning loops |
| `GUARDRAIL` | Safety/policy checks | Content moderation, PII detection |
| `EVALUATOR` | Quality assessment | Answer relevance, faithfulness scoring |

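The table above doubles as a validation set for the one required attribute. A minimal sketch; `require_span_kind` is an illustrative helper, not part of the Phoenix or OpenInference SDKs:

```python
# The 9 valid OpenInference span kinds from the table above
VALID_SPAN_KINDS = {
    "LLM", "EMBEDDING", "CHAIN", "RETRIEVER", "RERANKER",
    "TOOL", "AGENT", "GUARDRAIL", "EVALUATOR",
}

def require_span_kind(attributes: dict) -> str:
    """Check that the one required attribute is present and valid."""
    kind = attributes.get("openinference.span.kind")
    if kind not in VALID_SPAN_KINDS:
        raise ValueError(f"invalid or missing openinference.span.kind: {kind!r}")
    return kind
```

A check like this is useful in tests or export pipelines to catch spans that instrumentation emitted without a kind.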
@@ -0,0 +1,72 @@
# Universal Attributes

This document covers attributes that can be used on any span kind in OpenInference.

## Overview

These attributes can be used on **any span kind** to provide additional context, tracking, and metadata.

## Input/Output

| Attribute          | Type   | Description                                          |
| ------------------ | ------ | ---------------------------------------------------- |
| `input.value`      | String | Input to the operation (prompt, query, document)     |
| `input.mime_type`  | String | MIME type (e.g., "text/plain", "application/json")   |
| `output.value`     | String | Output from the operation (response, vector, result) |
| `output.mime_type` | String | MIME type of output                                  |

### Why Capture I/O?

**Always capture input/output for evaluation-ready spans:**

- Phoenix evaluators (faithfulness, relevance, Q&A correctness) require `input.value` and `output.value`
- Phoenix UI displays I/O prominently in trace views for debugging
- Enables exporting I/O for creating fine-tuning datasets
- Provides complete context for analyzing agent behavior

**Example attributes:**

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "What is the weather?",
  "input.mime_type": "text/plain",
  "output.value": "I don't have access to weather data.",
  "output.mime_type": "text/plain"
}
```

**See language-specific implementation:**

- TypeScript: `instrumentation-manual-typescript.md`
- Python: `instrumentation-manual-python.md`

## Session and User Tracking

| Attribute    | Type   | Description                                    |
| ------------ | ------ | ---------------------------------------------- |
| `session.id` | String | Session identifier for grouping related traces |
| `user.id`    | String | User identifier for per-user analysis          |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}
```

## Metadata

| Attribute  | Type   | Description                               |
| ---------- | ------ | ----------------------------------------- |
| `metadata` | String | JSON-serialized object of key-value pairs |

**Example:**

```json
{
  "openinference.span.kind": "LLM",
  "metadata": "{\"environment\": \"production\", \"model_version\": \"v2.1\", \"cost_center\": \"engineering\"}"
}
```

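Because `metadata` is stored as a JSON string rather than a nested object, serialize it before attaching it to a span. A minimal sketch using only the standard library (the `attributes` dict here stands in for whatever your instrumentation passes to the span):

```python
import json

metadata = {"environment": "production", "model_version": "v2.1"}

attributes = {
    "openinference.span.kind": "LLM",
    # JSON-serialized string, not a nested object
    "metadata": json.dumps(metadata),
}
```

Readers of the trace (or your own code) recover the object with `json.loads(attributes["metadata"])`.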
@@ -0,0 +1,85 @@
# Phoenix Tracing: Auto-Instrumentation (Python)

**Automatically create spans for LLM calls without code changes.**

## Overview

Auto-instrumentation patches supported libraries at runtime to create spans automatically. Use it for supported frameworks (LangChain, LlamaIndex, OpenAI SDK, etc.). For custom logic, see instrumentation-manual-python.md.

## Supported Frameworks

**Python:**

- LLM SDKs: OpenAI, Anthropic, Bedrock, Mistral, Vertex AI, Groq, Ollama
- Frameworks: LangChain, LlamaIndex, DSPy, CrewAI, Instructor, Haystack
- Install: `pip install openinference-instrumentation-{name}`

## Setup

**Install and enable:**

```bash
pip install arize-phoenix-otel
pip install openinference-instrumentation-openai  # Add others as needed
```

```python
from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)  # Discovers all installed instrumentors
```

**Example:**

```python
from phoenix.otel import register
from openai import OpenAI

register(project_name="my-app", auto_instrument=True)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Traces appear in the Phoenix UI with model, input/output, tokens, and timing captured automatically. See the span kind files for full attribute schemas.

**Selective instrumentation** (explicit control):

```python
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register(project_name="my-app")  # No auto_instrument
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
```

## Limitations

Auto-instrumentation does NOT capture:

- Custom business logic
- Internal function calls

**Example:**

```python
def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)  # Not traced
    response = client.chat.completions.create(...)  # Traced (auto)
    postprocessed = postprocess(response)  # Not traced
    return postprocessed
```

**Solution:** Add manual instrumentation:

```python
@tracer.chain  # tracer from phoenix.otel; see instrumentation-manual-python.md
def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)
    response = client.chat.completions.create(...)
    postprocessed = postprocess(response)
    return postprocessed
```

@@ -0,0 +1,87 @@
# Auto-Instrumentation (TypeScript)

Automatically create spans for LLM calls without code changes.

## Supported Frameworks

- **LLM SDKs:** OpenAI
- **Frameworks:** LangChain
- **Install:** `npm install @arizeai/openinference-instrumentation-{name}`

## Setup

**CommonJS (automatic):**

```javascript
const { register } = require("@arizeai/phoenix-otel");
const OpenAI = require("openai");

register({ projectName: "my-app" });

const client = new OpenAI();
```

**ESM (manual registration required):**

```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```

**Why:** ESM imports are hoisted and evaluated before `register()` runs, so modules must be instrumented manually.

## Limitations

**What auto-instrumentation does NOT capture:**

```typescript
async function myWorkflow(query: string): Promise<string> {
  const preprocessed = await preprocess(query); // Not traced
  const response = await client.chat.completions.create(...); // Traced (auto)
  const postprocessed = await postprocess(response); // Not traced
  return postprocessed;
}
```

**Solution:** Add manual instrumentation for custom logic:

```typescript
import { traceChain } from "@arizeai/openinference-core";

const myWorkflow = traceChain(
  async (query: string): Promise<string> => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...);
    const postprocessed = await postprocess(response);
    return postprocessed;
  },
  { name: "my-workflow" }
);
```

## Combining Auto + Manual

```typescript
import { register } from "@arizeai/phoenix-otel";
import { traceChain } from "@arizeai/openinference-core";

register({ projectName: "my-app" });

const client = new OpenAI();

const workflow = traceChain(
  async (query: string) => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...); // Auto-instrumented
    return postprocess(response);
  },
  { name: "my-workflow" }
);
```
@@ -0,0 +1,182 @@
# Manual Instrumentation (Python)

Add custom spans using decorators or context managers for fine-grained tracing control.

## Setup

```bash
pip install arize-phoenix-otel
```

```python
from phoenix.otel import register

tracer_provider = register(project_name="my-app")
tracer = tracer_provider.get_tracer(__name__)
```

## Quick Reference

| Span Kind | Decorator | Use Case |
|-----------|-----------|----------|
| CHAIN | `@tracer.chain` | Orchestration, workflows, pipelines |
| RETRIEVER | `@tracer.retriever` | Vector search, document retrieval |
| TOOL | `@tracer.tool` | External API calls, function execution |
| AGENT | `@tracer.agent` | Multi-step reasoning, planning |
| LLM | `@tracer.llm` | LLM API calls (manual only) |
| EMBEDDING | `@tracer.embedding` | Embedding generation |
| RERANKER | `@tracer.reranker` | Document re-ranking |
| GUARDRAIL | `@tracer.guardrail` | Safety checks, content moderation |
| EVALUATOR | `@tracer.evaluator` | LLM evaluation, quality checks |

## Decorator Approach (Recommended)

**Use for:** Full function instrumentation, automatic I/O capture

```python
@tracer.chain
def rag_pipeline(query: str) -> str:
    docs = retrieve_documents(query)
    ranked = rerank(docs, query)
    return generate_response(ranked, query)

@tracer.retriever
def retrieve_documents(query: str) -> list[dict]:
    results = vector_db.search(query, top_k=5)
    return [{"content": doc.text, "score": doc.score} for doc in results]

@tracer.tool
def get_weather(city: str) -> str:
    response = requests.get(f"https://api.weather.com/{city}")
    return response.json()["weather"]
```

**Custom span names:**

```python
@tracer.chain(name="rag-pipeline-v2")
def my_workflow(query: str) -> str:
    return process(query)
```

## Context Manager Approach

**Use for:** Partial function instrumentation, custom attributes, dynamic control

```python
from opentelemetry.trace import Status, StatusCode
import json

def retrieve_with_metadata(query: str):
    with tracer.start_as_current_span(
        "vector_search",
        openinference_span_kind="retriever"
    ) as span:
        span.set_attribute("input.value", query)

        results = vector_db.search(query, top_k=5)

        documents = [
            {
                "document.id": doc.id,
                "document.content": doc.text,
                "document.score": doc.score
            }
            for doc in results
        ]
        span.set_attribute("retrieval.documents", json.dumps(documents))
        span.set_status(Status(StatusCode.OK))

        return documents
```

## Capturing Input/Output

**Always capture I/O for evaluation-ready spans.**

### Automatic I/O Capture (Decorators)

Decorators automatically capture input arguments and return values:

```python
@tracer.chain
def handle_query(user_input: str) -> str:
    result = agent.generate(user_input)
    return result.text

# Automatically captures:
# - input.value: user_input
# - output.value: result.text
# - input.mime_type / output.mime_type: auto-detected
```

### Manual I/O Capture (Context Manager)

Use `set_input()` and `set_output()` for simple I/O capture:

```python
from opentelemetry.trace import Status, StatusCode

def handle_query(user_input: str) -> str:
    with tracer.start_as_current_span(
        "query.handler",
        openinference_span_kind="chain"
    ) as span:
        span.set_input(user_input)

        result = agent.generate(user_input)

        span.set_output(result.text)
        span.set_status(Status(StatusCode.OK))

        return result.text
```

**What gets captured:**

```json
{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}
```

**Why this matters:**

- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets

### Custom I/O with Additional Metadata

Use `set_attribute()` for custom attributes alongside I/O:

```python
def process_query(query: str):
    with tracer.start_as_current_span(
        "query.process",
        openinference_span_kind="chain"
    ) as span:
        # Standard I/O
        span.set_input(query)

        # Custom metadata
        span.set_attribute("input.length", len(query))

        result = llm.generate(query)

        # Standard output
        span.set_output(result.text)

        # Custom metadata
        span.set_attribute("output.tokens", result.usage.total_tokens)
        span.set_status(Status(StatusCode.OK))

        return result
```

## See Also

- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, `span-llm.md`, `span-agent.md`, `span-embedding.md`, `span-reranker.md`, `span-guardrail.md`, `span-evaluator.md`
- **Auto-instrumentation:** `instrumentation-auto-python.md` for framework integrations
- **API docs:** https://docs.arize.com/phoenix/tracing/manual-instrumentation
@@ -0,0 +1,172 @@
# Manual Instrumentation (TypeScript)

Add custom spans using convenience wrappers or `withSpan` for fine-grained tracing control.

## Setup

```bash
npm install @arizeai/phoenix-otel @arizeai/openinference-core
```

```typescript
import { register } from "@arizeai/phoenix-otel";

register({ projectName: "my-app" });
```

## Quick Reference

| Span Kind | Method | Use Case |
|-----------|--------|----------|
| CHAIN | `traceChain` | Workflows, pipelines, orchestration |
| AGENT | `traceAgent` | Multi-step reasoning, planning |
| TOOL | `traceTool` | External APIs, function calls |
| RETRIEVER | `withSpan` | Vector search, document retrieval |
| LLM | `withSpan` | LLM API calls (prefer auto-instrumentation) |
| EMBEDDING | `withSpan` | Embedding generation |
| RERANKER | `withSpan` | Document re-ranking |
| GUARDRAIL | `withSpan` | Safety checks, content moderation |
| EVALUATOR | `withSpan` | LLM evaluation |

## Convenience Wrappers

```typescript
import { traceChain, traceAgent, traceTool } from "@arizeai/openinference-core";

// CHAIN - workflows
const pipeline = traceChain(
  async (query: string) => {
    const docs = await retrieve(query);
    return await generate(docs, query);
  },
  { name: "rag-pipeline" }
);

// AGENT - reasoning
const agent = traceAgent(
  async (question: string) => {
    const thought = await llm.generate(`Think: ${question}`);
    return await processThought(thought);
  },
  { name: "my-agent" }
);

// TOOL - function calls
const getWeather = traceTool(
  async (city: string) => fetch(`/api/weather/${city}`).then(r => r.json()),
  { name: "get-weather" }
);
```

## withSpan for Other Kinds

```typescript
import { withSpan, getInputAttributes, getRetrieverAttributes } from "@arizeai/openinference-core";

// RETRIEVER with custom attributes
const retrieve = withSpan(
  async (query: string) => {
    const results = await vectorDb.search(query, { topK: 5 });
    return results.map(doc => ({ content: doc.text, score: doc.score }));
  },
  {
    kind: "RETRIEVER",
    name: "vector-search",
    processInput: (query) => getInputAttributes(query),
    processOutput: (docs) => getRetrieverAttributes({ documents: docs })
  }
);
```

**Options:**

```typescript
withSpan(fn, {
  kind: "RETRIEVER",             // OpenInference span kind
  name: "span-name",             // Span name (defaults to function name)
  processInput: (args) => {},    // Transform input to attributes
  processOutput: (result) => {}, // Transform output to attributes
  attributes: { key: "value" }   // Static attributes
});
```

## Capturing Input/Output

**Always capture I/O for evaluation-ready spans.** Use `getInputAttributes` and `getOutputAttributes` helpers for automatic MIME type detection:

```typescript
import {
  getInputAttributes,
  getOutputAttributes,
  withSpan,
} from "@arizeai/openinference-core";

const handleQuery = withSpan(
  async (userInput: string) => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "query.handler",
    kind: "CHAIN",
    // Use helpers - automatic MIME type detection
    processInput: (input) => getInputAttributes(input),
    processOutput: (result) => getOutputAttributes(result.text),
  }
);

await handleQuery("What is 2+2?");
```

**What gets captured:**

```json
{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}
```

**Helper behavior:**

- Strings → `text/plain`
- Objects/Arrays → `application/json` (automatically serialized)
- `undefined`/`null` → No attributes set

**Why this matters:**

- Phoenix evaluators require `input.value` and `output.value`
- Phoenix UI displays I/O prominently for debugging
- Enables exporting data for fine-tuning datasets

### Custom I/O Processing

Add custom metadata alongside standard I/O attributes:

```typescript
const processWithMetadata = withSpan(
  async (query: string) => {
    const result = await llm.generate(query);
    return result;
  },
  {
    name: "query.process",
    kind: "CHAIN",
    processInput: (query) => ({
      "input.value": query,
      "input.mime_type": "text/plain",
      "input.length": query.length, // Custom attribute
    }),
    processOutput: (result) => ({
      "output.value": result.text,
      "output.mime_type": "text/plain",
      "output.tokens": result.usage?.totalTokens, // Custom attribute
    }),
  }
);
```

## See Also

- **Span attributes:** `span-chain.md`, `span-retriever.md`, `span-tool.md`, etc.
- **Attribute helpers:** https://docs.arize.com/phoenix/tracing/manual-instrumentation-typescript#attribute-helpers
- **Auto-instrumentation:** `instrumentation-auto-typescript.md` for framework integrations
@@ -0,0 +1,87 @@
# Phoenix Tracing: Custom Metadata (Python)

Add custom attributes to spans for richer observability.

## Install

```bash
pip install openinference-instrumentation
```

## Session

```python
from openinference.instrumentation import using_session

with using_session(session_id="my-session-id"):
    # Spans get: "session.id" = "my-session-id"
    ...
```

## User

```python
from openinference.instrumentation import using_user

with using_user("my-user-id"):
    # Spans get: "user.id" = "my-user-id"
    ...
```

## Metadata

```python
from openinference.instrumentation import using_metadata

with using_metadata({"key": "value", "experiment_id": "exp_123"}):
    # Spans get: "metadata" = '{"key": "value", "experiment_id": "exp_123"}'
    ...
```

## Tags

```python
from openinference.instrumentation import using_tags

with using_tags(["tag_1", "tag_2"]):
    # Spans get: "tag.tags" = '["tag_1", "tag_2"]'
    ...
```

## Combined (using_attributes)

```python
from openinference.instrumentation import using_attributes

with using_attributes(
    session_id="my-session-id",
    user_id="my-user-id",
    metadata={"environment": "production"},
    tags=["prod", "v2"],
    prompt_template="Answer: {question}",
    prompt_template_version="v1.0",
    prompt_template_variables={"question": "What is Phoenix?"},
):
    # All attributes applied to spans in this context
    ...
```

## On a Single Span

```python
import json

span.set_attribute("metadata", json.dumps({"key": "value"}))
span.set_attribute("user.id", "user_123")
span.set_attribute("session.id", "session_456")
```

## As Decorators

All context managers can be used as decorators:

```python
@using_session(session_id="my-session-id")
@using_user("my-user-id")
@using_metadata({"env": "prod"})
def my_function():
    ...
```
@@ -0,0 +1,50 @@
# Phoenix Tracing: Custom Metadata (TypeScript)

Add custom attributes to spans for richer observability.

## Using Context (Propagates to All Child Spans)

```typescript
import { context } from "@arizeai/phoenix-otel";
import { setMetadata } from "@arizeai/openinference-core";

context.with(
  setMetadata(context.active(), {
    experiment_id: "exp_123",
    model_version: "gpt-4-1106-preview",
    environment: "production",
  }),
  async () => {
    // All spans created within this block will have:
    // "metadata" = '{"experiment_id": "exp_123", ...}'
    await myApp.run(query);
  }
);
```

## On a Single Span

```typescript
import { traceChain } from "@arizeai/openinference-core";
import { trace } from "@arizeai/phoenix-otel";

const myFunction = traceChain(
  async (input: string) => {
    const span = trace.getActiveSpan();

    span?.setAttribute(
      "metadata",
      JSON.stringify({
        experiment_id: "exp_123",
        model_version: "gpt-4-1106-preview",
        environment: "production",
      })
    );

    const result = await doWork(input); // Your application logic
    return result;
  },
  { name: "my-function" }
);

await myFunction("hello");
```
@@ -0,0 +1,58 @@
# Phoenix Tracing: Production Guide (Python)

**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**

## Metadata

| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |

## Batch Processing

**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually.
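A minimal sketch of the batching setup, assuming the `batch` flag on `phoenix.otel`'s `register` and the standard OpenTelemetry `shutdown()` on the returned tracer provider:

```python
from phoenix.otel import register

# Use a batch span processor: spans are queued and exported in groups
tracer_provider = register(
    project_name="my-app",
    batch=True,
)

# Flush any spans still queued before the process exits
tracer_provider.shutdown()
```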
## Data Masking (PII Protection)

**Environment variables:**

```bash
export OPENINFERENCE_HIDE_INPUTS=true              # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true             # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true      # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true     # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true        # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true          # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
```

**Python TraceConfig:**

```python
from phoenix.otel import register
from openinference.instrumentation import TraceConfig

config = TraceConfig(
    hide_inputs=True,
    hide_outputs=True,
    hide_input_messages=True
)
register(trace_config=config)
```

**Precedence:** Code > Environment variables > Defaults

---

## Span Filtering

**Suppress specific code blocks:**

```python
from phoenix.otel import suppress_tracing

with suppress_tracing():
    internal_logging()  # No spans generated
```
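**Sampling:** sampling is configured through the standard OpenTelemetry SDK environment variables rather than a Phoenix-specific flag; they must be set before the tracer provider is created. A sketch:

```python
import os

# Standard OpenTelemetry sampling configuration; set these
# before the tracer provider is created (e.g., before register()).
os.environ["OTEL_TRACES_SAMPLER"] = "parentbased_traceidratio"
os.environ["OTEL_TRACES_SAMPLER_ARG"] = "0.1"  # Sample 10% of traces
```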
@@ -0,0 +1,148 @@
# Phoenix Tracing: Production Guide (TypeScript)

**CRITICAL: Configure batching, data masking, and span filtering for production deployment.**

## Metadata

| Attribute | Value |
|-----------|-------|
| Priority | Critical - production readiness |
| Impact | Security, Performance |
| Setup Time | 5-15 min |

## Batch Processing

**Enable batch processing for production efficiency.** Batching reduces network overhead by sending spans in groups rather than individually.

```typescript
import { register } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
  batch: true, // Production default
});
```

### Shutdown Handling

**CRITICAL:** Spans may not be exported if they are still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.

```typescript
// Explicit shutdown to flush queued spans
const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown(); // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown(); // Flush on error too
  process.exit(1);
});
```

**Graceful termination signals:**

```typescript
// Graceful shutdown on SIGTERM
const provider = register({
  projectName: "my-server",
  batch: true,
});

process.on("SIGTERM", async () => {
  await provider.shutdown();
  process.exit(0);
});
```

---

## Data Masking (PII Protection)

**Environment variables:**

```bash
export OPENINFERENCE_HIDE_INPUTS=true              # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true             # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true      # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true     # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true        # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true          # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000 # Limit image size
```

**TypeScript TraceConfig:**

```typescript
import { register } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

const traceConfig = {
  hideInputs: true,
  hideOutputs: true,
  hideInputMessages: true
};

const instrumentation = new OpenAIInstrumentation({ traceConfig });
```

**Precedence:** Code > Environment variables > Defaults

---

## Span Filtering

**Suppress specific code blocks:**

```typescript
import { suppressTracing } from "@opentelemetry/core";
import { context } from "@opentelemetry/api";

await context.with(suppressTracing(context.active()), async () => {
  internalLogging(); // No spans generated
});
```

**Sampling:**

```bash
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1" # Sample 10%
```

---

## Error Handling

```typescript
import { SpanStatusCode } from "@opentelemetry/api";

try {
  const result = await riskyOperation();
  span?.setStatus({ code: SpanStatusCode.OK });
} catch (e) {
  span?.recordException(e as Error);
  span?.setStatus({ code: SpanStatusCode.ERROR });
  throw e;
}
```

---

## Production Checklist

- [ ] Batch processing enabled
- [ ] **Shutdown handling:** Call `provider.shutdown()` before exit to flush queued spans
- [ ] **Graceful termination:** Flush spans on SIGTERM/SIGINT signals
- [ ] Data masking configured (`HIDE_INPUTS`/`HIDE_OUTPUTS` if PII)
- [ ] Span filtering for health checks/noisy paths
- [ ] Error handling implemented
- [ ] Graceful degradation if Phoenix unavailable
- [ ] Performance tested
- [ ] Monitoring configured (Phoenix UI checked)
@@ -0,0 +1,73 @@
# Phoenix Tracing: Projects (Python)

**Organize traces by application using projects (Phoenix's top-level grouping).**

## Overview

Projects group traces for a single application or experiment.

**Use for:** Environments (dev/staging/prod), A/B testing, versioning

## Setup

### Environment Variable (Recommended)

```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```

```python
import os
os.environ["PHOENIX_PROJECT_NAME"] = "my-app-prod"

from phoenix.otel import register
register()  # Uses "my-app-prod"
```

### Code

```python
from phoenix.otel import register
register(project_name="my-app-prod")
```

## Use Cases

**Environments:**

```python
# Dev, staging, prod
register(project_name="my-app-dev")
register(project_name="my-app-staging")
register(project_name="my-app-prod")
```

**A/B Testing:**

```python
# Compare models
register(project_name="chatbot-gpt4")
register(project_name="chatbot-claude")
```

**Versioning:**

```python
# Track versions
register(project_name="my-app-v1")
register(project_name="my-app-v2")
```

## Switching Projects (Python Notebooks Only)

```python
from openinference.instrumentation import dangerously_using_project
from phoenix.otel import register

register(project_name="my-app")

# Switch temporarily for evals
with dangerously_using_project("my-eval-project"):
    run_evaluations()
```

**⚠️ Only use in notebooks/scripts, not production.**
@@ -0,0 +1,54 @@
# Phoenix Tracing: Projects (TypeScript)

**Organize traces by application using projects (Phoenix's top-level grouping).**

## Overview

Projects group traces for a single application or experiment.

**Use for:** Environments (dev/staging/prod), A/B testing, versioning

## Setup

### Environment Variable (Recommended)

```bash
export PHOENIX_PROJECT_NAME="my-app-prod"
```

```typescript
process.env.PHOENIX_PROJECT_NAME = "my-app-prod";

import { register } from "@arizeai/phoenix-otel";
register(); // Uses "my-app-prod"
```

### Code

```typescript
import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app-prod" });
```

## Use Cases

**Environments:**

```typescript
// Dev, staging, prod
register({ projectName: "my-app-dev" });
register({ projectName: "my-app-staging" });
register({ projectName: "my-app-prod" });
```

**A/B Testing:**

```typescript
// Compare models
register({ projectName: "chatbot-gpt4" });
register({ projectName: "chatbot-claude" });
```

**Versioning:**

```typescript
// Track versions
register({ projectName: "my-app-v1" });
register({ projectName: "my-app-v2" });
```
@@ -0,0 +1,104 @@
# Sessions (Python)

Track multi-turn conversations by grouping traces with session IDs.

## Setup

```python
from openinference.instrumentation import using_session

with using_session(session_id="user_123_conv_456"):
    response = llm.invoke(prompt)
```

## Best Practices

**Bad: Only parent span gets session ID**

```python
from openinference.semconv.trace import SpanAttributes
from opentelemetry import trace

span = trace.get_current_span()
span.set_attribute(SpanAttributes.SESSION_ID, session_id)
response = client.chat.completions.create(...)
```

**Good: All child spans inherit session ID**

```python
with using_session(session_id):
    response = client.chat.completions.create(...)
    result = my_custom_function()
```

**Why:** `using_session()` propagates the session ID to all nested spans automatically.

## Session ID Patterns

```python
import uuid

session_id = str(uuid.uuid4())
session_id = f"user_{user_id}_conv_{conversation_id}"
session_id = f"debug_{timestamp}"
```

Good: `str(uuid.uuid4())`, `"user_123_conv_456"`
Bad: `"session_1"`, `"test"`, empty string

## Multi-Turn Chatbot Example

```python
import uuid
from openinference.instrumentation import using_session

session_id = str(uuid.uuid4())
messages = []

def send_message(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})

    with using_session(session_id):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages
        )

    assistant_message = response.choices[0].message.content
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message
```

## Additional Attributes

```python
from openinference.instrumentation import using_attributes

with using_attributes(
    user_id="user_123",
    session_id="conv_456",
    metadata={"tier": "premium", "region": "us-west"}
):
    response = llm.invoke(prompt)
```

## LangChain Integration

LangChain threads are automatically recognized as sessions:

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = ChatOpenAI()
response = llm.invoke(
    [HumanMessage(content="Hi!")],
    config={"metadata": {"thread_id": "user_123_thread"}}
)
```

Phoenix recognizes: `thread_id`, `session_id`, `conversation_id`

## See Also

- **TypeScript sessions:** `sessions-typescript.md`
- **Session docs:** https://docs.arize.com/phoenix/tracing/sessions
@@ -0,0 +1,199 @@
# Sessions (TypeScript)

Track multi-turn conversations by grouping traces with session IDs. **Use `withSpan` directly from `@arizeai/openinference-core`** - no wrappers or custom utilities needed.

## Core Concept

**Session Pattern:**

1. Generate a unique `session.id` once at application startup
2. Export `SESSION_ID`, import `withSpan` where needed
3. Use `withSpan` to create a parent CHAIN span with `session.id` for each interaction
4. All child spans (LLM, TOOL, AGENT, etc.) automatically group under the parent
5. Query traces by `session.id` in Phoenix to see all interactions

## Implementation (Best Practice)

### 1. Setup (instrumentation.ts)

```typescript
import { register } from "@arizeai/phoenix-otel";
import { randomUUID } from "node:crypto";

// Initialize Phoenix
register({
  projectName: "your-app",
  url: process.env.PHOENIX_COLLECTOR_ENDPOINT || "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true,
});

// Generate and export session ID
export const SESSION_ID = randomUUID();
```

### 2. Usage (app code)

```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

// Use withSpan directly - no wrapper needed
const handleInteraction = withSpan(
  async () => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

// Call it
const result = await handleInteraction();
```

### With Input Parameters

```typescript
const processQuery = withSpan(
  async (query: string) => {
    return await agent.generate({ prompt: query });
  },
  {
    name: "process.query",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

await processQuery("What is 2+2?");
```

## Key Points

### Session ID Scope

- **CLI/Desktop Apps**: Generate once at process startup
- **Web Servers**: Generate per-user session (e.g., on login, store in session storage)
- **Stateless APIs**: Accept `session.id` as a parameter from the client

### Span Hierarchy

```
cli.interaction (CHAIN) ← session.id here
├── ai.generateText (AGENT)
│   ├── ai.generateText.doGenerate (LLM)
│   └── ai.toolCall (TOOL)
└── ai.generateText.doGenerate (LLM)
```

The `session.id` is only set on the **root span**. Child spans are automatically grouped by the trace hierarchy.

### Querying Sessions

```bash
# Get all traces for a session
npx @arizeai/phoenix-cli traces \
  --endpoint http://localhost:6006 \
  --project your-app \
  --format raw \
  --no-progress | \
  jq '.[] | select(.spans[0].attributes["session.id"] == "YOUR-SESSION-ID")'
```

## Dependencies

```json
{
  "dependencies": {
    "@arizeai/openinference-core": "^2.0.5",
    "@arizeai/phoenix-otel": "^0.4.1"
  }
}
```

**Note:** `@opentelemetry/api` is NOT needed - it's only for manual span management.

## Why This Pattern?

1. **Simple**: Just export `SESSION_ID` and use `withSpan` directly - no wrappers
2. **Built-in**: `withSpan` from `@arizeai/openinference-core` handles everything
3. **Type-safe**: Preserves function signatures and type information
4. **Automatic lifecycle**: Handles span creation, error tracking, and cleanup
5. **Framework-agnostic**: Works with any LLM framework (AI SDK, LangChain, etc.)
6. **No extra deps**: No need for `@opentelemetry/api` or custom utilities

## Adding More Attributes

```typescript
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handleWithContext = withSpan(
  async (userInput: string) => {
    return await agent.generate({ prompt: userInput });
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: {
      "session.id": SESSION_ID,
      "user.id": userId,              // Track user
      "metadata.environment": "prod", // Custom metadata
    },
  }
);
```

## Anti-Pattern: Don't Create Wrappers

❌ **Don't do this:**

```typescript
// Unnecessary wrapper
export function withSessionTracking(fn) {
  return withSpan(fn, { attributes: { "session.id": SESSION_ID } });
}
```

✅ **Do this instead:**

```typescript
// Use withSpan directly
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handler = withSpan(fn, {
  attributes: { "session.id": SESSION_ID }
});
```

## Alternative: Context API Pattern

For web servers or complex async flows where you need to propagate session IDs through middleware, you can use the Context API:

```typescript
import { context } from "@opentelemetry/api";
import { setSession } from "@arizeai/openinference-core";

await context.with(
  setSession(context.active(), { sessionId: "user_123_conv_456" }),
  async () => {
    const response = await llm.invoke(prompt);
  }
);
```

**Use Context API when:**

- Building web servers with middleware chains
- Session ID needs to flow through many async boundaries
- You don't control the call stack (e.g., framework-provided handlers)

**Use withSpan when:**

- Building CLI apps or scripts
- You control the function call points
- Simpler, more explicit code is preferred

## Related

- `fundamentals-universal-attributes.md` - Other universal attributes (user.id, metadata)
- `span-chain.md` - CHAIN span specification
- `sessions-python.md` - Python session tracking patterns
@@ -0,0 +1,131 @@
# Phoenix Tracing: Python Setup

**Set up Phoenix tracing in Python with `arize-phoenix-otel`.**

## Metadata

| Attribute  | Value                               |
| ---------- | ----------------------------------- |
| Priority   | Critical - required for all tracing |
| Setup Time | <5 min                              |

## Quick Start (3 lines)

```python
from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)
```

**Connects to `http://localhost:6006` and auto-instruments all supported libraries.**

## Installation

```bash
pip install arize-phoenix-otel
```

**Supported:** Python 3.10-3.13

## Configuration

### Environment Variables (Recommended)

```bash
export PHOENIX_API_KEY="your-api-key"                      # Required for Phoenix Cloud
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"  # Or Cloud URL
export PHOENIX_PROJECT_NAME="my-app"                       # Optional
```

### Python Code

```python
from phoenix.otel import register

tracer_provider = register(
    project_name="my-app",             # Project name
    endpoint="http://localhost:6006",  # Phoenix endpoint
    auto_instrument=True,              # Auto-instrument supported libs
    batch=True,                        # Batch processing (default: True)
)
```

**Parameters:**

- `project_name`: Project name (overrides `PHOENIX_PROJECT_NAME`)
- `endpoint`: Phoenix URL (overrides `PHOENIX_COLLECTOR_ENDPOINT`)
- `auto_instrument`: Enable auto-instrumentation (default: False)
- `batch`: Use BatchSpanProcessor (default: True, recommended for production)
- `protocol`: `"http/protobuf"` (default) or `"grpc"`

## Auto-Instrumentation

Install instrumentors for your frameworks:

```bash
pip install openinference-instrumentation-openai       # OpenAI SDK
pip install openinference-instrumentation-langchain    # LangChain
pip install openinference-instrumentation-llama-index  # LlamaIndex
# ... install others as needed
```

Then enable auto-instrumentation:

```python
register(project_name="my-app", auto_instrument=True)
```

Phoenix discovers and instruments all installed OpenInference packages automatically.

## Batch Processing (Production)

Enabled by default. Configure via environment variables:

```bash
export OTEL_BSP_SCHEDULE_DELAY=5000        # Batch every 5s
export OTEL_BSP_MAX_QUEUE_SIZE=2048        # Queue up to 2048 spans
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512  # Send 512 spans/batch
```

**Link:** https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/

## Verification

1. Open the Phoenix UI: `http://localhost:6006`
2. Navigate to your project
3. Run your application
4. Check for traces (they appear within the batch delay)

## Troubleshooting

**No traces:**

- Verify `PHOENIX_COLLECTOR_ENDPOINT` matches the Phoenix server
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- Confirm instrumentors are installed

**Missing attributes:**

- Check the span kind (see the rules/ directory)
- Verify attribute names (see the rules/ directory)

## Example

```python
from phoenix.otel import register
from openai import OpenAI

# Enable tracing with auto-instrumentation
register(project_name="my-chatbot", auto_instrument=True)

# OpenAI is automatically instrumented
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

## API Reference

- [Python OTEL API Docs](https://arize-phoenix.readthedocs.io/projects/otel/en/latest/)
- [Python Client API Docs](https://arize-phoenix.readthedocs.io/projects/client/en/latest/)
@@ -0,0 +1,170 @@
# TypeScript Setup

Set up Phoenix tracing in TypeScript/JavaScript with `@arizeai/phoenix-otel`.

## Metadata

| Attribute | Value |
|-----------|-------|
| Priority | Critical - required for all tracing |
| Setup Time | <5 min |

## Quick Start

```bash
npm install @arizeai/phoenix-otel
```

```typescript
import { register } from "@arizeai/phoenix-otel";

register({ projectName: "my-app" });
```

Connects to `http://localhost:6006` by default.

## Configuration

```typescript
import { register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  url: "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true
});
```

**Environment variables:**

```bash
export PHOENIX_API_KEY="your-api-key"
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
export PHOENIX_PROJECT_NAME="my-app"
```

## ESM vs CommonJS

**CommonJS (automatic):**

```javascript
const { register } = require("@arizeai/phoenix-otel");
register({ projectName: "my-app" });

const OpenAI = require("openai");
```

**ESM (manual instrumentation required):**

```typescript
import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });
```

**Why:** ESM imports are hoisted, so modules load before instrumentation can patch them; `manuallyInstrument()` is needed.

## Framework Integration

**Next.js (App Router):**

```typescript
// instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { register } = await import("@arizeai/phoenix-otel");
    register({ projectName: "my-nextjs-app" });
  }
}
```

**Express.js:**

```typescript
import { register } from "@arizeai/phoenix-otel";
import express from "express";

register({ projectName: "my-express-app" });

const app = express();
```

## Flushing Spans Before Exit

**CRITICAL:** Spans may not be exported if they are still queued in the processor when your process exits. Call `provider.shutdown()` to explicitly flush before exit.

**Standard pattern:**

```typescript
const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown(); // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown(); // Flush on error too
  process.exit(1);
});
```

**Alternative:**

```typescript
// Use batch: false for immediate export (no shutdown needed)
register({
  projectName: "my-app",
  batch: false,
});
```

For production patterns including graceful termination, see `production-typescript.md`.

## Verification

1. Open the Phoenix UI: `http://localhost:6006`
2. Run your application
3. Check for traces in your project

**Enable diagnostic logging:**

```typescript
import { DiagLogLevel, register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  diagLogLevel: DiagLogLevel.DEBUG,
});
```

## Troubleshooting

**No traces:**

- Verify `PHOENIX_COLLECTOR_ENDPOINT` is correct
- Set `PHOENIX_API_KEY` for Phoenix Cloud
- For ESM: ensure `manuallyInstrument()` is called
- With `batch: true`: call `await provider.shutdown()` before exit to flush queued spans (see the Flushing Spans section)
- Alternative: set `batch: false` for immediate export (no shutdown needed)

**Missing attributes:**

- Check that instrumentation is registered (ESM requires manual setup)
- See `instrumentation-auto-typescript.md`

## See Also

- **Auto-instrumentation:** `instrumentation-auto-typescript.md`
- **Manual instrumentation:** `instrumentation-manual-typescript.md`
- **API docs:** https://arize-ai.github.io/phoenix/
@@ -0,0 +1,15 @@
# AGENT Spans

AGENT spans represent autonomous reasoning blocks (ReAct agents, planning loops, multi-step decision making).

**Required:** `openinference.span.kind` = "AGENT"

## Example

```json
{
  "openinference.span.kind": "AGENT",
  "input.value": "Book a flight to New York for next Monday",
  "output.value": "I've booked flight AA123 departing Monday at 9:00 AM"
}
```
@@ -0,0 +1,43 @@
# CHAIN Spans

## Purpose

CHAIN spans represent orchestration layers in your application (LangChain chains, custom workflows, application entry points). Often used as root spans.

## Required Attributes

| Attribute | Type | Description | Required |
| ------------------------- | ------ | --------------- | -------- |
| `openinference.span.kind` | String | Must be "CHAIN" | Yes |

## Common Attributes

CHAIN spans typically use [Universal Attributes](fundamentals-universal-attributes.md):

- `input.value` - Input to the chain (user query, request payload)
- `output.value` - Output from the chain (final response)
- `input.mime_type` / `output.mime_type` - Format indicators

## Example: Root Chain

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "{\"question\": \"What is the capital of France?\"}",
  "input.mime_type": "application/json",
  "output.value": "{\"answer\": \"The capital of France is Paris.\", \"sources\": [\"doc_123\"]}",
  "output.mime_type": "application/json",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}
```

## Example: Nested Sub-Chain

```json
{
  "openinference.span.kind": "CHAIN",
  "input.value": "Summarize this document: ...",
  "output.value": "This document discusses..."
}
```
@@ -0,0 +1,91 @@
# EMBEDDING Spans

## Purpose

EMBEDDING spans represent vector generation operations (text-to-vector conversion for semantic search).

## Required Attributes

| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "EMBEDDING" | Yes |
| `embedding.model_name` | String | Embedding model identifier | Recommended |

## Attribute Reference

### Single Embedding

| Attribute | Type | Description |
|-----------|------|-------------|
| `embedding.model_name` | String | Embedding model identifier |
| `embedding.text` | String | Input text to embed |
| `embedding.vector` | String (JSON array) | Generated embedding vector |

**Example:**

```json
{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```

### Batch Embeddings

| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `embedding.embeddings.{i}.embedding.text` | String | Text at index i |
| `embedding.embeddings.{i}.embedding.vector` | String (JSON array) | Vector at index i |

**Example:**

```json
{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3, ..., 0.5]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.6, 0.7, 0.8, ..., 0.9]"
}
```

### Vector Format

Vectors are stored as JSON array strings:

- Dimensions: typically 384, 768, 1536, or 3072
- Format: `"[0.123, -0.456, 0.789, ...]"`
- Precision: usually 3-6 decimal places

**Storage Considerations:**

- Large vectors can significantly increase trace size
- Consider omitting vectors in production (keep `embedding.text` for debugging)
- Use a separate vector database for actual similarity search

## Examples

### Single Embedding

```json
{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]",
  "input.value": "What is machine learning?",
  "output.value": "[0.023, -0.012, 0.045, ..., 0.001]"
}
```

### Batch Embeddings

```json
{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.4, 0.5, 0.6]",
  "embedding.embeddings.2.embedding.text": "Third document",
  "embedding.embeddings.2.embedding.vector": "[0.7, 0.8, 0.9]"
}
```
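The batch flattening convention can be sketched as a small helper. This is hypothetical illustration code, not a Phoenix or OpenInference API: auto-instrumentors emit these attributes for you.

```python
import json

def flatten_embeddings(model_name, pairs):
    """Flatten (text, vector) pairs into OpenInference attribute names.

    Illustrative only: instrumentors produce these attributes
    automatically; vectors are serialized as JSON array strings.
    """
    attrs = {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": model_name,
    }
    for i, (text, vector) in enumerate(pairs):
        prefix = f"embedding.embeddings.{i}.embedding"
        attrs[f"{prefix}.text"] = text
        attrs[f"{prefix}.vector"] = json.dumps(vector)
    return attrs

attrs = flatten_embeddings(
    "text-embedding-ada-002",
    [("First document", [0.1, 0.2, 0.3]),
     ("Second document", [0.4, 0.5, 0.6])],
)
```

The resulting dict matches the batch example above and could be set on a span via `span.set_attributes(attrs)`.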
@@ -0,0 +1,51 @@
# EVALUATOR Spans

## Purpose

EVALUATOR spans represent quality assessment operations (answer relevance, faithfulness, hallucination detection).

## Required Attributes

| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "EVALUATOR" | Yes |

## Common Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Content being evaluated |
| `output.value` | String | Evaluation result (score, label, explanation) |
| `metadata.evaluator_name` | String | Evaluator identifier |
| `metadata.score` | Float | Numeric score (0-1) |
| `metadata.label` | String | Categorical label (relevant/irrelevant) |

## Example: Answer Relevance

```json
{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"question\": \"What is the capital of France?\", \"answer\": \"The capital of France is Paris.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.95",
  "metadata.evaluator_name": "answer_relevance",
  "metadata.score": 0.95,
  "metadata.label": "relevant",
  "metadata.explanation": "Answer directly addresses the question with correct information"
}
```

## Example: Faithfulness Check

```json
{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"context\": \"Paris is in France.\", \"answer\": \"Paris is the capital of France.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.5",
  "metadata.evaluator_name": "faithfulness",
  "metadata.score": 0.5,
  "metadata.label": "partially_faithful",
  "metadata.explanation": "Answer makes an unsupported claim about Paris being the capital"
}
```
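A minimal sketch of assembling these attributes in application code. This is a hypothetical helper: the 0.7 threshold and the relevant/irrelevant labels are illustrative choices, not OpenInference requirements.

```python
def evaluator_attrs(evaluator_name, score, threshold=0.7):
    """Build EVALUATOR span attributes from a numeric score.

    Hypothetical helper: the threshold and label names are
    illustrative assumptions, not part of the spec.
    """
    label = "relevant" if score >= threshold else "irrelevant"
    return {
        "openinference.span.kind": "EVALUATOR",
        "output.value": str(score),
        "metadata.evaluator_name": evaluator_name,
        "metadata.score": score,
        "metadata.label": label,
    }

attrs = evaluator_attrs("answer_relevance", 0.95)
```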
@@ -0,0 +1,49 @@
# GUARDRAIL Spans

## Purpose

GUARDRAIL spans represent safety and policy checks (content moderation, PII detection, toxicity scoring).

## Required Attributes

| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "GUARDRAIL" | Yes |

## Common Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Content being checked |
| `output.value` | String | Guardrail result (allowed/blocked/flagged) |
| `metadata.guardrail_type` | String | Type of check (toxicity, pii, bias) |
| `metadata.score` | Float | Safety score (0-1) |
| `metadata.threshold` | Float | Threshold for blocking |

## Example: Content Moderation

```json
{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "User message: I want to build a bomb",
  "output.value": "BLOCKED",
  "metadata.guardrail_type": "content_moderation",
  "metadata.score": 0.95,
  "metadata.threshold": 0.7,
  "metadata.categories": "[\"violence\", \"weapons\"]",
  "metadata.action": "block_and_log"
}
```

## Example: PII Detection

```json
{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "My SSN is 123-45-6789",
  "output.value": "FLAGGED",
  "metadata.guardrail_type": "pii_detection",
  "metadata.detected_pii": "[\"ssn\"]",
  "metadata.redacted_output": "My SSN is [REDACTED]"
}
```
@@ -0,0 +1,79 @@
# LLM Spans

LLM spans represent calls to language models (OpenAI, Anthropic, local models, etc.).

## Required Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| `openinference.span.kind` | String | Must be "LLM" |
| `llm.model_name` | String | Model identifier (e.g., "gpt-4", "claude-3-5-sonnet-20241022") |

## Key Attributes

| Category | Attributes | Example |
|----------|------------|---------|
| **Model** | `llm.model_name`, `llm.provider` | "gpt-4-turbo", "openai" |
| **Tokens** | `llm.token_count.prompt`, `llm.token_count.completion`, `llm.token_count.total` | 25, 8, 33 |
| **Cost** | `llm.cost.prompt`, `llm.cost.completion`, `llm.cost.total` | 0.0021, 0.0045, 0.0066 |
| **Parameters** | `llm.invocation_parameters` (JSON) | `{"temperature": 0.7, "max_tokens": 1024}` |
| **Messages** | `llm.input_messages.{i}.*`, `llm.output_messages.{i}.*` | See examples below |
| **Tools** | `llm.tools.{i}.tool.json_schema` | Function definitions |

## Cost Tracking

**Core attributes:**

- `llm.cost.prompt` - Total input cost (USD)
- `llm.cost.completion` - Total output cost (USD)
- `llm.cost.total` - Total cost (USD)

**Detailed cost breakdown:**

- `llm.cost.prompt_details.{input,cache_read,cache_write,audio}` - Input cost components
- `llm.cost.completion_details.{output,reasoning,audio}` - Output cost components

## Messages

**Input messages:**

- `llm.input_messages.{i}.message.role` - "user", "assistant", "system", "tool"
- `llm.input_messages.{i}.message.content` - Text content
- `llm.input_messages.{i}.message.contents.{j}` - Multimodal (text + images)
- `llm.input_messages.{i}.message.tool_calls` - Tool invocations

**Output messages:** Same structure as input messages.
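The message flattening above can be sketched as follows. This is a hypothetical helper for illustration; instrumentors set these attributes automatically when they record an LLM call.

```python
def flatten_messages(messages, direction="input"):
    """Flatten chat messages into llm.{input,output}_messages.* attributes.

    Illustrative only: `direction` is "input" or "output", and each
    message dict is assumed to carry "role" and "content" keys.
    """
    attrs = {}
    for i, msg in enumerate(messages):
        prefix = f"llm.{direction}_messages.{i}.message"
        attrs[f"{prefix}.role"] = msg["role"]
        attrs[f"{prefix}.content"] = msg["content"]
    return attrs

attrs = flatten_messages([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
```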
## Example: Basic LLM Call

```json
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1024}",
  "llm.input_messages.0.message.role": "system",
  "llm.input_messages.0.message.content": "You are a helpful assistant.",
  "llm.input_messages.1.message.role": "user",
  "llm.input_messages.1.message.content": "What is the capital of France?",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "The capital of France is Paris.",
  "llm.token_count.prompt": 25,
  "llm.token_count.completion": 8,
  "llm.token_count.total": 33
}
```

## Example: LLM with Tool Calls

```json
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "gpt-4-turbo",
  "llm.input_messages.0.message.content": "What's the weather in SF?",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_weather",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": "{\"location\": \"San Francisco\"}",
  "llm.tools.0.tool.json_schema": "{\"type\": \"function\", \"function\": {\"name\": \"get_weather\"}}"
}
```

## See Also

- **Instrumentation:** `instrumentation-auto-python.md`, `instrumentation-manual-python.md`
- **Full spec:** https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md
@@ -0,0 +1,86 @@
# RERANKER Spans

## Purpose

RERANKER spans represent reordering of retrieved documents (Cohere Rerank, cross-encoder models).

## Required Attributes

| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "RERANKER" | Yes |

## Attribute Reference

### Reranker Parameters

| Attribute | Type | Description |
|-----------|------|-------------|
| `reranker.model_name` | String | Reranker model identifier |
| `reranker.query` | String | Query used for reranking |
| `reranker.top_k` | Integer | Number of documents to return |

### Input Documents

| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `reranker.input_documents.{i}.document.id` | String | Input document ID |
| `reranker.input_documents.{i}.document.content` | String | Input document content |
| `reranker.input_documents.{i}.document.score` | Float | Original retrieval score |
| `reranker.input_documents.{i}.document.metadata` | String (JSON) | Document metadata |

### Output Documents

| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `reranker.output_documents.{i}.document.id` | String | Output document ID (reordered) |
| `reranker.output_documents.{i}.document.content` | String | Output document content |
| `reranker.output_documents.{i}.document.score` | Float | New reranker score |
| `reranker.output_documents.{i}.document.metadata` | String (JSON) | Document metadata |

### Score Comparison

Input scores (from the retriever) vs. output scores (from the reranker):

```json
{
  "reranker.input_documents.0.document.id": "doc_A",
  "reranker.input_documents.0.document.score": 0.7,
  "reranker.input_documents.1.document.id": "doc_B",
  "reranker.input_documents.1.document.score": 0.9,
  "reranker.output_documents.0.document.id": "doc_B",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_A",
  "reranker.output_documents.1.document.score": 0.85
}
```

In this example:

- Input: doc_B (0.9) ranked higher than doc_A (0.7)
- Output: doc_B is still highest, but both scores increased
- The reranker confirmed the retriever's ordering but refined the scores

## Examples

### Complete Reranking Example

```json
{
  "openinference.span.kind": "RERANKER",
  "reranker.model_name": "cohere-rerank-v2",
  "reranker.query": "What is machine learning?",
  "reranker.top_k": 2,
  "reranker.input_documents.0.document.id": "doc_123",
  "reranker.input_documents.0.document.content": "Machine learning is a subset...",
  "reranker.input_documents.1.document.id": "doc_456",
  "reranker.input_documents.1.document.content": "Supervised learning algorithms...",
  "reranker.input_documents.2.document.id": "doc_789",
  "reranker.input_documents.2.document.content": "Neural networks are...",
  "reranker.output_documents.0.document.id": "doc_456",
  "reranker.output_documents.0.document.content": "Supervised learning algorithms...",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_123",
  "reranker.output_documents.1.document.content": "Machine learning is a subset...",
  "reranker.output_documents.1.document.score": 0.88
}
```
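The reordering and attribute layout can be sketched as a helper. This is hypothetical illustration code: a real reranker model computes the new scores, which are passed in here as precomputed values.

```python
def rerank_attrs(query, model_name, docs, top_k):
    """Build RERANKER span attributes from input docs and new scores.

    Hypothetical helper: each doc dict carries its original retrieval
    score ("score") and a precomputed reranker score ("new_score").
    """
    attrs = {
        "openinference.span.kind": "RERANKER",
        "reranker.model_name": model_name,
        "reranker.query": query,
        "reranker.top_k": top_k,
    }
    for i, doc in enumerate(docs):
        p = f"reranker.input_documents.{i}.document"
        attrs[f"{p}.id"] = doc["id"]
        attrs[f"{p}.score"] = doc["score"]
    # Reorder by the reranker's scores and keep only the top_k documents
    for i, doc in enumerate(
        sorted(docs, key=lambda d: d["new_score"], reverse=True)[:top_k]
    ):
        p = f"reranker.output_documents.{i}.document"
        attrs[f"{p}.id"] = doc["id"]
        attrs[f"{p}.score"] = doc["new_score"]
    return attrs

attrs = rerank_attrs(
    "What is machine learning?", "cohere-rerank-v2",
    [{"id": "doc_A", "score": 0.7, "new_score": 0.85},
     {"id": "doc_B", "score": 0.9, "new_score": 0.95}],
    top_k=2,
)
```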
@@ -0,0 +1,110 @@
|
||||
# RETRIEVER Spans

## Purpose

RETRIEVER spans represent document/context retrieval operations (vector DB queries, semantic search, keyword search).

## Required Attributes

| Attribute | Type | Description | Required |
|-----------|------|-------------|----------|
| `openinference.span.kind` | String | Must be "RETRIEVER" | Yes |

## Attribute Reference

### Query

| Attribute | Type | Description |
|-----------|------|-------------|
| `input.value` | String | Search query text |

### Document Schema

| Attribute Pattern | Type | Description |
|-------------------|------|-------------|
| `retrieval.documents.{i}.document.id` | String | Unique document identifier |
| `retrieval.documents.{i}.document.content` | String | Document text content |
| `retrieval.documents.{i}.document.score` | Float | Relevance score (0-1 or distance) |
| `retrieval.documents.{i}.document.metadata` | String (JSON) | Document metadata |

### Flattening Pattern for Documents

Documents are flattened using zero-indexed notation:

```
retrieval.documents.0.document.id
retrieval.documents.0.document.content
retrieval.documents.0.document.score
retrieval.documents.1.document.id
retrieval.documents.1.document.content
retrieval.documents.1.document.score
...
```
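In instrumentation code, this flattening is typically a small loop. A hedged sketch in plain Python (the helper name `flatten_documents` is illustrative; real instrumentation would pass the resulting keys to `span.set_attribute`):

```python
import json


def flatten_documents(documents):
    """Flatten retrieved documents into OpenInference span attributes.

    Each document is a dict with "id", "content", "score", and an
    optional "metadata" dict, which is serialized to a JSON string.
    """
    attrs = {}
    for i, doc in enumerate(documents):
        prefix = f"retrieval.documents.{i}.document"
        attrs[f"{prefix}.id"] = doc["id"]
        attrs[f"{prefix}.content"] = doc["content"]
        attrs[f"{prefix}.score"] = doc["score"]
        if doc.get("metadata"):
            # Metadata is stored as a JSON string, per the schema above
            attrs[f"{prefix}.metadata"] = json.dumps(doc["metadata"])
    return attrs
```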

### Document Metadata

Common metadata fields (stored as a JSON string):

```json
{
  "source": "knowledge_base.pdf",
  "page": 42,
  "section": "Introduction",
  "author": "Jane Doe",
  "created_at": "2024-01-15",
  "url": "https://example.com/doc",
  "chunk_id": "chunk_123"
}
```

**Example with metadata:**

```json
{
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a method of data analysis...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"ml_textbook.pdf\", \"page\": 15, \"chapter\": \"Introduction\"}"
}
```

### Ordering

Documents are ordered by index (0, 1, 2, ...). Typically:

- Index 0 = highest-scoring document
- Index 1 = second highest
- etc.

Preserve the retrieval order in your flattened attributes.

### Large Document Handling

For very long documents:

- Consider truncating `document.content` to the first N characters
- Store full content in a separate document store
- Use `document.id` to reference the full content
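The truncation advice above can be as simple as the following sketch (the character limit is an arbitrary choice, not an OpenInference convention):

```python
MAX_CONTENT_CHARS = 1000  # arbitrary limit; tune for your backend


def truncate_content(content, limit=MAX_CONTENT_CHARS):
    """Truncate document content for span attributes, marking the cut."""
    if len(content) <= limit:
        return content
    return content[:limit] + "...[truncated]"
```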

## Examples

### Basic Vector Search

```json
{
  "openinference.span.kind": "RETRIEVER",
  "input.value": "What is machine learning?",
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a subset of artificial intelligence...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"textbook.pdf\", \"page\": 42}",
  "retrieval.documents.1.document.id": "doc_456",
  "retrieval.documents.1.document.content": "Machine learning algorithms learn patterns from data...",
  "retrieval.documents.1.document.score": 0.87,
  "retrieval.documents.1.document.metadata": "{\"source\": \"article.html\", \"author\": \"Jane Doe\"}",
  "retrieval.documents.2.document.id": "doc_789",
  "retrieval.documents.2.document.content": "Supervised learning is a type of machine learning...",
  "retrieval.documents.2.document.score": 0.81,
  "retrieval.documents.2.document.metadata": "{\"source\": \"wiki.org\"}",
  "metadata.retriever_type": "vector_search",
  "metadata.vector_db": "pinecone",
  "metadata.top_k": 3
}
```
# TOOL Spans

## Purpose

TOOL spans represent external tool or function invocations (API calls, database queries, calculators, custom functions).

## Required Attributes

| Attribute                 | Type   | Description        | Required    |
| ------------------------- | ------ | ------------------ | ----------- |
| `openinference.span.kind` | String | Must be "TOOL"     | Yes         |
| `tool.name`               | String | Tool/function name | Recommended |

## Attribute Reference

### Tool Execution Attributes

| Attribute          | Type          | Description                                    |
| ------------------ | ------------- | ---------------------------------------------- |
| `tool.name`        | String        | Tool/function name                             |
| `tool.description` | String        | Tool purpose/description                       |
| `tool.parameters`  | String (JSON) | JSON schema defining the tool's parameters     |
| `input.value`      | String (JSON) | Actual input values passed to the tool         |
| `output.value`     | String        | Tool output/result                             |
| `output.mime_type` | String        | Result content type (e.g., "application/json") |

## Examples

### API Call Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "get_weather",
  "tool.description": "Fetches current weather for a location",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"location\": {\"type\": \"string\"}, \"units\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]}}, \"required\": [\"location\"]}",
  "input.value": "{\"location\": \"San Francisco\", \"units\": \"celsius\"}",
  "output.value": "{\"temperature\": 18, \"conditions\": \"partly cloudy\"}"
}
```

### Calculator Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "calculator",
  "tool.description": "Performs mathematical calculations",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"expression\": {\"type\": \"string\", \"description\": \"Math expression to evaluate\"}}, \"required\": [\"expression\"]}",
  "input.value": "{\"expression\": \"2 + 2\"}",
  "output.value": "4"
}
```

### Database Query Tool

```json
{
  "openinference.span.kind": "TOOL",
  "tool.name": "sql_query",
  "tool.description": "Executes SQL query on user database",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"query\": {\"type\": \"string\", \"description\": \"SQL query to execute\"}}, \"required\": [\"query\"]}",
  "input.value": "{\"query\": \"SELECT * FROM users WHERE id = 123\"}",
  "output.value": "[{\"id\": 123, \"name\": \"Alice\", \"email\": \"alice@example.com\"}]",
  "output.mime_type": "application/json"
}
```
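The escaped JSON strings in these examples are normally produced with `json.dumps` rather than written by hand. A minimal sketch in plain Python (the helper name `tool_attributes` is illustrative, not part of the OpenInference API):

```python
import json


def tool_attributes(name, description, parameters, arguments, result,
                    mime_type=None):
    """Build flattened TOOL span attributes.

    `parameters` is the tool's JSON schema (dict); `arguments` is the dict
    of input values actually passed; `result` is the raw output string.
    """
    attrs = {
        "openinference.span.kind": "TOOL",
        "tool.name": name,
        "tool.description": description,
        "tool.parameters": json.dumps(parameters),
        "input.value": json.dumps(arguments),
        "output.value": result,
    }
    if mime_type:
        attrs["output.mime_type"] = mime_type
    return attrs
```

As with the other span kinds, each entry of the returned dict would be set on the span individually.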