The Python SDK monkey-patches the OpenAI / Anthropic client instance you pass in. It does not proxy requests: each call still goes directly from your app to the provider.
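
To make the monkey-patch idea concrete, here is a minimal sketch of wrapping one method on a single client instance. It is not the SDK's actual implementation: `FakeCompletions`, `instrument`, and the `events` list are hypothetical names used only for illustration.

```python
import functools
import time


class FakeCompletions:
    """Hypothetical stand-in for a provider client's completions resource."""

    def create(self, **kwargs):
        return {"model": kwargs.get("model"), "content": "Hello"}


def instrument(resource, events):
    """Replace resource.create on this one instance with a timing wrapper."""
    original = resource.create  # bound method of the untouched client

    @functools.wraps(original)
    def wrapper(**kwargs):
        start = time.perf_counter()
        try:
            # The real call runs unchanged; nothing is routed through a proxy.
            return original(**kwargs)
        finally:
            latency_ms = (time.perf_counter() - start) * 1000
            events.append({"model": kwargs.get("model"), "latency_ms": latency_ms})

    resource.create = wrapper  # instance attribute shadows the class method
    return resource


events = []
completions = instrument(FakeCompletions(), events)
response = completions.create(model="gpt-4o-mini")
```

Only the instance that was passed in is affected; other instances of the same class keep the original method.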

Install

pip install scopecall-py
# Or with provider extras (recommended):
pip install "scopecall-py[openai]"
pip install "scopecall-py[anthropic]"

Requires Python 3.10 or later (we use PEP 604 union syntax at module scope).

Initialise

import os
import scopecall

sdk = scopecall.init(
    api_key=os.environ["SCOPECALL_API_KEY"],
    endpoint="http://localhost:8080/v1/ingest",
)

Call init() exactly once at app startup. It returns the SDK instance you use to instrument clients and create traces.

`init` options

Option	Type	Required	Notes
`api_key`	`str`	✅	`sc_live_...` from Settings → API Keys
`endpoint`	`str`	✅	Full ingest URL
`default_prompt_version`	`str`	optional	Default prompt version label
`environment`	`str`	optional	Defaults to `os.environ.get("ENVIRONMENT", "production")`
`flush_interval_seconds`	`float`	optional	Defaults to `5.0`
`capture_content`	`bool`	optional	Defaults to `True`. Set `False` to omit prompt / response bodies.
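
One way to keep these options out of application code is to build them from environment variables. The sketch below is only a suggestion: `build_init_kwargs`, the `SCOPECALL_ENDPOINT` and `SCOPECALL_FLUSH_INTERVAL` variable names, and the policy of disabling content capture in production are all assumptions, not SDK behaviour.

```python
import os
from collections.abc import Mapping
from typing import Any


def build_init_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Collect keyword arguments for scopecall.init() from environment variables."""
    environment = env.get("ENVIRONMENT", "production")
    return {
        "api_key": env["SCOPECALL_API_KEY"],    # required
        "endpoint": env["SCOPECALL_ENDPOINT"],  # required; variable name is an assumption
        "environment": environment,
        "flush_interval_seconds": float(env.get("SCOPECALL_FLUSH_INTERVAL", "5.0")),
        # Assumed policy: keep prompt / response bodies out of production events.
        "capture_content": environment != "production",
    }


kwargs = build_init_kwargs({
    "SCOPECALL_API_KEY": "sc_live_example",
    "SCOPECALL_ENDPOINT": "http://localhost:8080/v1/ingest",
    "ENVIRONMENT": "staging",
})
```

The resulting dictionary uses only option names from the table above, so it could be passed on as `scopecall.init(**kwargs)`.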

Instrument a provider

OpenAI

from openai import OpenAI, AsyncOpenAI

client = sdk.instrument(OpenAI())            # sync
async_client = sdk.instrument(AsyncOpenAI()) # async — auto-detected

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

Streaming works out of the box. The SDK automatically adds stream_options={"include_usage": True} to the request so that the final chunk carries token counts.
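
For readers consuming the stream themselves, the sketch below shows where the usage data appears when `include_usage` is on: the last chunk has an empty `choices` list and a populated `usage` field. The chunks here are hand-built stand-ins, and `collect_stream` is a hypothetical helper rather than an SDK function.

```python
from types import SimpleNamespace as NS


def collect_stream(chunks):
    """Join streamed text deltas and pick up usage from the final chunk."""
    parts, usage = [], None
    for chunk in chunks:
        # The usage-only chunk has no choices, so this inner loop skips it.
        for choice in chunk.choices:
            if choice.delta.content:
                parts.append(choice.delta.content)
        if chunk.usage is not None:
            usage = chunk.usage
    return "".join(parts), usage


# Hand-built stand-ins shaped like OpenAI streaming chunks.
fake_stream = [
    NS(choices=[NS(delta=NS(content="Hel"))], usage=None),
    NS(choices=[NS(delta=NS(content="lo"))], usage=None),
    NS(choices=[], usage=NS(prompt_tokens=8, completion_tokens=2, total_tokens=10)),
]

text, usage = collect_stream(fake_stream)
```

Code that indexes `chunk.choices[0]` without checking for an empty list would fail on that final chunk, which is worth keeping in mind once the option is added for you.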

Anthropic

from anthropic import Anthropic, AsyncAnthropic

client = sdk.instrument(Anthropic())
async_client = sdk.instrument(AsyncAnthropic())

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

Cost-attribution hierarchy (workflow → agent → step)

Wrap a sequence of LLM calls in sdk.workflow() to surface them as a single workflow in the trace tree and on the dashboard's Workflow Treemap. Optionally nest sdk.agent() and sdk.step() so the workflow detail page shows per-agent and per-step cost breakdowns.

with sdk.workflow("daily-summary", user_id="u_42", customer_id="customer_acme", feature_name="summary"):
    with sdk.step("draft"):
        draft = client.chat.completions.create(...)
    with sdk.step("polish"):
        polished = client.chat.completions.create(...)

Every LLM call inside the block becomes a child of the enclosing span. Identity (user, session, customer, feature, prompt version) propagates to every child automatically. sdk.trace() remains supported as a backward-compatible alias for sdk.workflow().

Workflow / agent / step options

Option	Type	Notes
`user_id`	`str`	End-user identity
`session_id`	`str`	Groups workflows into a conversation
`customer_id`	`str`	v0.3 — B2B tenant attribution (Customers page)
`feature_name`	`str`	High-level feature label
`prompt_version`	`str`	Per-trace prompt version override

Async + `contextvars` propagation

sdk.workflow(), sdk.agent(), and sdk.step() all use contextvars, so the trace context propagates correctly across await, asyncio.create_task(), and asyncio.gather() with no manual context-passing.

import asyncio

async def main():
    with sdk.workflow("parallel-fanout", user_id="u_7"):
        results = await asyncio.gather(
            async_client.chat.completions.create(model="gpt-4o-mini", messages=[...]),
            async_client.chat.completions.create(model="gpt-4o-mini", messages=[...]),
        )
        # Both LLM calls show as children of the "parallel-fanout" workflow.

asyncio.run(main())
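
The propagation this section relies on is standard `contextvars` behaviour and can be checked without the SDK. In the sketch below, `current_workflow` and `fake_llm_call` are hypothetical stand-ins for the SDK's trace context and an instrumented client call.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the SDK's trace context.
current_workflow = contextvars.ContextVar("current_workflow", default=None)


async def fake_llm_call(label):
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return label, current_workflow.get()


async def main():
    token = current_workflow.set("parallel-fanout")
    try:
        # gather() wraps each coroutine in a task, and each task runs in a
        # copy of the context that was current when the task was created.
        gathered = await asyncio.gather(fake_llm_call("a"), fake_llm_call("b"))
        task_result = await asyncio.create_task(fake_llm_call("c"))
    finally:
        current_workflow.reset(token)
    return [*gathered, task_result], current_workflow.get()


results, after = asyncio.run(main())
```

Because each task receives a copy of the context, values set inside a child task do not leak back to the parent. Plain threads started with `threading.Thread` begin with an empty context, so this guarantee covers asyncio tasks rather than arbitrary thread pools.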

Manual `record_llm_call`

For frameworks ScopeCall doesn't auto-instrument (LangChain, LlamaIndex, custom wrappers), record events manually:

sdk.record_llm_call(
    model="gpt-4o-mini",
    provider="openai",
    input_tokens=240,
    output_tokens=180,
    latency_ms=820,
    input_text="...",
    output_text="...",
)

The recorded call inherits the current trace context (user / session / feature / prompt version) automatically. If PII redaction is configured, it is applied to input_text and output_text.
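
When wrapping a framework call, the latency and token fields have to be measured by hand. The sketch below shows one way to collect them; `call_with_metrics` and `fake_chain` are hypothetical, and it assumes the wrapped function can report its own token counts.

```python
import time


def call_with_metrics(fn, *, model, provider, prompt):
    """Run fn(prompt) and return (output, fields for a manual LLM event)."""
    start = time.perf_counter()
    output, input_tokens, output_tokens = fn(prompt)
    latency_ms = round((time.perf_counter() - start) * 1000)
    fields = {
        "model": model,
        "provider": provider,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "latency_ms": latency_ms,
        "input_text": prompt,
        "output_text": output,
    }
    return output, fields


def fake_chain(prompt):
    # Stand-in for a LangChain / LlamaIndex call that reports token usage.
    return prompt.upper(), len(prompt.split()), 3


output, fields = call_with_metrics(
    fake_chain, model="gpt-4o-mini", provider="openai", prompt="hello there"
)
```

The keys in `fields` match the keyword arguments shown above, so the dictionary could be forwarded as `sdk.record_llm_call(**fields)`.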

FastAPI

from contextlib import asynccontextmanager
from fastapi import FastAPI
import scopecall

sdk: scopecall.ScopeCallSDK

@asynccontextmanager
async def lifespan(app: FastAPI):
    global sdk
    sdk = scopecall.init(api_key="...", endpoint="...")
    yield
    await sdk.close(timeout=5.0)  # graceful flush on shutdown

app = FastAPI(lifespan=lifespan)

A complete FastAPI example (including streaming SSE) is at sdks/python/examples/fastapi/ in the public repo.

Graceful shutdown

sdk.flush(timeout=5.0)   # blocking — ships buffered events
sdk.close(timeout=5.0)   # flush + close HTTP client

In the FastAPI lifespan above, sdk.close() performs this flush on exit. Without a flush, the last few seconds of traces can be lost on container restart.
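
Outside FastAPI (scripts, workers, CLI tools), one option is to register the shutdown call with `atexit`. This is a sketch under assumptions: `StubSDK` only imitates the flush() / close() shape shown above, and `install_shutdown_hook` is a hypothetical helper.

```python
import atexit


class StubSDK:
    """Stand-in with the flush() / close() shape shown above."""

    def __init__(self):
        self.buffer, self.shipped, self.closed = ["event-1", "event-2"], [], False

    def flush(self, timeout=5.0):
        self.shipped.extend(self.buffer)
        self.buffer.clear()

    def close(self, timeout=5.0):
        self.flush(timeout=timeout)
        self.closed = True


def install_shutdown_hook(sdk, timeout=5.0):
    """Register a run-once close() to execute at interpreter exit."""
    done = False

    def shutdown():
        nonlocal done
        if done:  # make repeated calls harmless
            return
        done = True
        sdk.close(timeout=timeout)

    atexit.register(shutdown)
    return shutdown


stub = StubSDK()
shutdown = install_shutdown_hook(stub)
shutdown()  # called directly here; the atexit call later is then a no-op
```

`atexit` handlers do not run when the process is killed by a signal that Python does not handle, so a container that stops with SIGTERM would also need a signal handler that exits cleanly.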

PII redaction

import re

sdk.add_redaction_pattern("EMAIL", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]")

Redaction runs on both input_text and output_text before events leave the process. The defaults include patterns for emails, credit card numbers, SSNs, IP addresses, and phone numbers.
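
To show what pattern-based redaction does to a string, here is a standalone sketch. It reuses the e-mail regex from above; the SSN regex, the `PATTERNS` list, and the `redact` function are illustrative assumptions and may differ from the SDK's built-in defaults.

```python
import re

# Ordered (label, pattern, replacement) entries.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    # Illustrative US SSN pattern, not taken from the SDK defaults.
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]


def redact(text, patterns=PATTERNS):
    """Apply each pattern in order and return the scrubbed text."""
    for _label, pattern, replacement in patterns:
        text = pattern.sub(replacement, text)
    return text


clean = redact("Mail jane.doe@example.com about SSN 123-45-6789 today")
# clean == "Mail [EMAIL] about SSN [SSN] today"
```

Order matters when patterns can overlap, since a later pattern only sees text that earlier replacements left behind.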

Source

The SDK source is open at github.com/scopecall/scopecall/tree/main/sdks/python.