| name | description | license | compatibility | metadata |
|---|---|---|---|---|
| phoenix-cli | Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, inspect datasets, and query the GraphQL API. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues. | Apache-2.0 | Requires Node.js (for npx) or global install of @arizeai/phoenix-cli. Optionally requires jq for JSON processing. | |

Phoenix CLI
Invocation
px <resource> <action> # if installed globally
npx @arizeai/phoenix-cli <resource> <action> # no install required
The CLI uses singular resource commands with subcommands like list and get:
px trace list
px trace get <trace-id>
px span list
px dataset list
px dataset get <name>
Setup
export PHOENIX_HOST=http://localhost:6006
export PHOENIX_PROJECT=my-project
export PHOENIX_API_KEY=your-api-key # if auth is enabled
Always use --format raw --no-progress when piping to jq.
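The rule above can be wrapped in a small shell function so the flags are never forgotten. This is a minimal sketch: `px_raw` is a hypothetical helper name, not part of the Phoenix CLI, and it assumes `px` is installed globally and on PATH.

```shell
# Hypothetical helper (not part of the Phoenix CLI): run any px subcommand
# with the two flags recommended above for piping into jq.
px_raw() {
  px "$@" --format raw --no-progress
}

# Usage (requires px and jq on PATH):
#   px_raw trace list --limit 20 | jq .
#   px_raw span list --status-code ERROR | jq '.[].name'
```

The flags are appended after the caller's arguments, matching the order used in the examples throughout this document.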
Traces
px trace list --limit 20 --format raw --no-progress | jq .
px trace list --last-n-minutes 60 --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")'
px trace list --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]'
px trace get <trace-id> --format raw | jq .
px trace get <trace-id> --format raw | jq '.spans[] | select(.status_code != "OK")'
Spans
px span list --limit 20 # recent spans (table view)
px span list --last-n-minutes 60 --limit 50 # spans from last hour
px span list --span-kind LLM --limit 10 # only LLM spans
px span list --status-code ERROR --limit 20 # only errored spans
px span list --name chat_completion --limit 10 # filter by span name
px span list --trace-id <id> --format raw --no-progress | jq . # all spans for a trace
px span list --include-annotations --limit 10 # include annotation scores
px span list output.json --limit 100 # save to JSON file
px span list --format raw --no-progress | jq '.[] | select(.status_code == "ERROR")'
Span JSON shape
Span
  name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT"|"RERANKER"|"GUARDRAIL"|"EVALUATOR"|"UNKNOWN")
  status_code ("OK"|"ERROR"|"UNSET"), status_message
  context.span_id, context.trace_id, parent_id
  start_time, end_time
  attributes (same as trace span attributes below)
  annotations[] (with --include-annotations)
    name, result { score, label, explanation }
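Given the span shape above, a jq filter can summarize which span names fail most often. The sketch below assumes that `px span list --format raw --no-progress` prints a JSON array of spans (as the `.[]` filters in this section imply). `error_spans_by_name` is a hypothetical helper name.

```shell
# Hypothetical helper: read `px span list --format raw --no-progress` output
# on stdin and count errored spans per span name, most frequent first.
error_spans_by_name() {
  jq -c 'map(select(.status_code == "ERROR"))
         | group_by(.name)
         | map({name: .[0].name, count: length})
         | sort_by(-.count)'
}

# Usage (requires px and jq on PATH):
#   px span list --last-n-minutes 60 --format raw --no-progress | error_spans_by_name
```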
Trace JSON shape
Trace
  traceId, status ("OK"|"ERROR"), duration (ms), startTime, endTime
  rootSpan: top-level span (parent_id: null)
  spans[]
    name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT")
    status_code ("OK"|"ERROR"), parent_id, context.span_id
    attributes
      input.value, output.value: raw input/output
      llm.model_name, llm.provider
      llm.token_count.prompt/completion/total
      llm.token_count.prompt_details.cache_read
      llm.token_count.completion_details.reasoning
      llm.input_messages.{N}.message.role/content
      llm.output_messages.{N}.message.role/content
      llm.invocation_parameters: JSON string (temperature, etc.)
      exception.message: set if span errored
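The trace-level fields in the shape above (`traceId`, `status`, `duration`) are enough to build a simple latency filter. This is a sketch that assumes `px trace list --format raw --no-progress` prints a JSON array of traces. `slow_traces` is a hypothetical helper name and the 1000 ms default is an arbitrary choice.

```shell
# Hypothetical helper: read `px trace list --format raw --no-progress` output
# on stdin and print traces slower than a threshold (milliseconds, default
# 1000), slowest first, one compact JSON object per line.
slow_traces() {
  jq -c --argjson ms "${1:-1000}" \
    '[.[] | select(.duration > $ms)]
     | sort_by(-.duration)
     | .[]
     | {traceId, status, duration}'
}

# Usage (requires px and jq on PATH):
#   px trace list --last-n-minutes 60 --format raw --no-progress | slow_traces 2000
```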
Sessions
px session list --limit 10 --format raw --no-progress | jq .
px session list --order asc --format raw --no-progress | jq '.[].session_id'
px session get <session-id> --format raw | jq .
px session get <session-id> --include-annotations --format raw | jq '.annotations'
Session JSON shape
SessionData
  id, session_id, project_id
  start_time, end_time
  traces[]
    id, trace_id, start_time, end_time
SessionAnnotation (with --include-annotations)
  id, name, annotator_kind ("LLM"|"CODE"|"HUMAN"), session_id
  result { label, score, explanation }
  metadata, identifier, source, created_at, updated_at
Datasets / Experiments / Prompts
px dataset list --format raw --no-progress | jq '.[].name'
px dataset get <name> --format raw | jq '.examples[] | {input, output: .expected_output}'
px experiment list --dataset <name> --format raw --no-progress | jq '.[] | {id, name, failed_run_count}'
px experiment get <id> --format raw --no-progress | jq '.[] | select(.error != null) | {input, error}'
px prompt list --format raw --no-progress | jq '.[].name'
px prompt get <name> --format text --no-progress # plain text, ideal for piping to AI
GraphQL
Use px api graphql for ad-hoc queries that the commands above do not cover. The response is a JSON object of the form {"data": {...}}.
px api graphql '{ projectCount datasetCount promptCount evaluatorCount }'
px api graphql '{ projects { edges { node { name traceCount tokenCountTotal } } } }' | jq '.data.projects.edges[].node'
px api graphql '{ datasets { edges { node { name exampleCount experimentCount } } } }' | jq '.data.datasets.edges[].node'
px api graphql '{ evaluators { edges { node { name kind } } } }' | jq '.data.evaluators.edges[].node'
# Introspect any type
px api graphql '{ __type(name: "Project") { fields { name type { name } } } }' | jq '.data.__type.fields[]'
Key root fields: projects, datasets, prompts, evaluators, projectCount, datasetCount, promptCount, evaluatorCount, viewer.
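The connection queries above all follow the same `edges { node { ... } }` pattern, so the unwrapping step can be factored out. The following is a sketch under that assumption. `gql_nodes` is a hypothetical helper name, and it only works for root fields that return a connection (for example projects or datasets), not for scalar fields such as projectCount.

```shell
# Hypothetical helper: query a connection-style root field and print each
# node as one compact JSON object per line.
#   $1 = root connection field (e.g. projects)
#   $2 = space-separated node fields to select (e.g. 'name traceCount')
gql_nodes() {
  px api graphql "{ $1 { edges { node { $2 } } } }" \
    | jq -c --arg field "$1" '.data[$field].edges[].node'
}

# Usage (requires px and jq on PATH):
#   gql_nodes projects 'name traceCount tokenCountTotal'
#   gql_nodes datasets 'name exampleCount experimentCount'
```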
Docs
Download the Phoenix documentation as markdown for local use by coding agents.
px docs fetch # fetch default workflow docs to .px/docs
px docs fetch --workflow tracing # fetch only tracing docs
px docs fetch --workflow tracing --workflow evaluation
px docs fetch --dry-run # preview what would be downloaded
px docs fetch --refresh # clear .px/docs and re-download
px docs fetch --output-dir ./my-docs # custom output directory
Key options: --workflow (repeatable, values: tracing, evaluation, datasets, prompts, integrations, sdk, self-hosting, all), --dry-run, --refresh, --output-dir (default .px/docs), --workers (default 10).