Cassette docs
Record your LLM/agent API calls once, replay them in tests — fast, free, deterministic. The Team tier adds a shared registry and a GitHub PR gate that flags real behaviour regressions.
What it is
Tests that call real models are slow, cost tokens on every run, and flake because output varies. Cassette records each response to a local file on the first run, then replays it on every run after — no network, no API key, no flakiness. The recorder is free and open source (MIT).
Install
pip install cassette-sdk # Python
npm install cassette-sdk # Node / TypeScript
Python
from cassette.recorder import http_client
from openai import OpenAI
client = OpenAI(http_client=http_client(project="demo")) # records → replays locally
TypeScript (vitest/jest)
import OpenAI from "openai";
import { recordingFetch } from "cassette-sdk/recorder";
const client = new OpenAI({ fetch: recordingFetch({ project: "demo" }) });
Record / replay / auto
Set CASSETTE_MODE:
| Mode | Behaviour |
|---|---|
| record | always call the real API and save the response |
| replay | only use saved cassettes; error on a miss |
| auto (default) | replay if a cassette exists, else record; fails safe |
Cassettes live in ./.cassettes as plain JSON and diff cleanly in PRs.
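The mode table can be read as a small decision function. The sketch below is illustrative only, not the SDK's implementation: `resolve_action`, `current_mode`, and `CassetteMiss` are hypothetical names, and the only things taken from these docs are the `CASSETTE_MODE` variable, its three values, and the `auto` default.

```python
import os

VALID_MODES = {"record", "replay", "auto"}


class CassetteMiss(LookupError):
    """Hypothetical error for a replay-mode miss; not an SDK class."""


def current_mode() -> str:
    """Read CASSETTE_MODE from the environment, defaulting to auto."""
    return os.environ.get("CASSETTE_MODE", "auto")


def resolve_action(mode: str, cassette_exists: bool) -> str:
    """Return "record" or "replay" for one request, following the mode table."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown CASSETTE_MODE: {mode!r}")
    if mode == "record":
        # Always hit the real API, even if a cassette is already saved.
        return "record"
    if mode == "replay":
        # Never touch the network; a missing cassette is an error.
        if not cassette_exists:
            raise CassetteMiss("no saved cassette for this request")
        return "replay"
    # auto: replay when possible, otherwise record.
    return "replay" if cassette_exists else "record"
```

Under this reading, pinning `CASSETTE_MODE=replay` turns an unrecorded call into a hard error instead of a silent live request.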
Behaviour drift
The gate doesn't byte-diff. It separates wording changes from behaviour changes (tool calls, structured-output shape, stop reason). Verdicts:
| Verdict | Meaning |
|---|---|
| identical | byte-for-byte same |
| benign | only free-text wording changed (non-determinism); fine |
| regression | tool calls, structured-output shape, or stop reason changed; blocks merge |
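To make the three verdicts concrete, here is a rough sketch of how such a classification could work. It is not the gate's actual algorithm. The response layout (`content`, `tool_calls`, `finish_reason`, `structured`) is an assumed shape for illustration, not the cassette schema, and `shape` and `classify` are hypothetical helpers.

```python
from typing import Any


def shape(value: Any) -> Any:
    """Reduce a JSON value to its structure: keys and value types, no content."""
    if isinstance(value, dict):
        return {key: shape(val) for key, val in sorted(value.items())}
    if isinstance(value, list):
        return [shape(item) for item in value]
    return type(value).__name__


def classify(baseline: dict, candidate: dict) -> str:
    """Return "identical", "benign", or "regression" for two recorded responses."""
    # Equality of the parsed data stands in for a byte-for-byte file comparison.
    if baseline == candidate:
        return "identical"
    behaviour_changed = (
        baseline.get("tool_calls") != candidate.get("tool_calls")
        or baseline.get("finish_reason") != candidate.get("finish_reason")
        or shape(baseline.get("structured")) != shape(candidate.get("structured"))
    )
    # Anything else (for example reworded free text) is treated as benign here.
    return "regression" if behaviour_changed else "benign"


base = {"content": "Paris is the capital.", "tool_calls": [], "finish_reason": "stop"}
reworded = {**base, "content": "The capital is Paris."}
truncated = {**base, "finish_reason": "length"}
print(classify(base, reworded), classify(base, truncated))
```

Note that this sketch compares structured output by shape only, so a changed value with the same type counts as benign.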
Cassette format
A portable JSON file per interaction (the ".har of agent test traffic"). See SPEC.md. Importers from vcrpy / nock are on the roadmap.
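Because cassettes are plain JSON, they can be inspected with the standard library alone. The sketch below lists each file's top-level keys without assuming anything about the schema (SPEC.md is the authority on field names). The `.json` extension and the `summarize_cassettes` helper are assumptions made for illustration.

```python
import json
from pathlib import Path


def summarize_cassettes(directory: str = ".cassettes") -> dict[str, list[str]]:
    """Map each *.json file under `directory` to its sorted top-level keys."""
    summary: dict[str, list[str]] = {}
    for path in sorted(Path(directory).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Non-object documents (lists, scalars) have no keys to report.
        keys = sorted(data) if isinstance(data, dict) else []
        summary[str(path.relative_to(directory))] = keys
    return summary


if __name__ == "__main__":
    for name, keys in summarize_cassettes().items():
        print(name, keys)
```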
Set up the team gate
- Subscribe on the pricing page → you'll land on a welcome page with your CASSETTE_TOKEN.
- Add CASSETTE_TOKEN as a repo secret (Settings → Secrets and variables → Actions).
- Install the GitHub App on your repo (link is on the welcome page).
- Add the CI workflow below.
CI workflow
Copy examples/ci/cassette.yml to .github/workflows/cassette.yml:
name: cassette
on: [pull_request]
jobs:
  agent-tests:
    runs-on: ubuntu-latest
    env:
      CASSETTE_MODE: auto
      CASSETTE_PROJECT: ${{ github.repository }}
      CASSETTE_TOKEN: ${{ secrets.CASSETTE_TOKEN }}
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - run: pip install cassette-sdk pytest
      - run: pytest
      - if: always()
        run: cassette push --ref "pr-${{ github.event.number }}"
CLI reference
cassette push --ref pr-42 # upload recorded cassettes for a PR
cassette push --ref blessed # promote to the baseline (run on main)
Reads CASSETTE_PROJECT, CASSETTE_TOKEN, CASSETTE_DIR from env.
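If you drive the CLI from a script, a pre-flight check can catch a missing variable before the upload starts. This is a sketch under assumptions: `missing_push_vars` and `push_command` are hypothetical helpers, and these docs do not say which of the three variables are required or have defaults, so the check only reports which are unset.

```python
import os

# The three variables the CLI is documented to read.
PUSH_ENV_VARS = ("CASSETTE_PROJECT", "CASSETTE_TOKEN", "CASSETTE_DIR")


def missing_push_vars(env=None) -> list[str]:
    """Return the documented variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in PUSH_ENV_VARS if not env.get(name)]


def push_command(ref: str) -> list[str]:
    """Build the argv for `cassette push --ref <ref>` (e.g. for subprocess.run)."""
    return ["cassette", "push", "--ref", ref]


if __name__ == "__main__":
    unset = missing_push_vars()
    if unset:
        print("unset:", ", ".join(unset))
    print(" ".join(push_command("pr-42")))
```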
FAQ
Does replay need an API key?
No. Only recording new cassettes calls the real model.
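One way to check this locally is to run the suite in a stripped environment: force replay mode and remove the real key, so any attempt to record fails loudly. A minimal sketch, where `replay_only_env` is a hypothetical helper and only `CASSETTE_MODE` comes from these docs:

```python
import os


def replay_only_env(base=None) -> dict:
    """Copy an environment, force replay mode, and drop the real OpenAI key."""
    env = dict(os.environ if base is None else base)
    env["CASSETTE_MODE"] = "replay"   # documented mode: error on a miss
    env.pop("OPENAI_API_KEY", None)   # a replayed run should not need it
    return env


# Typical use (not executed here):
#   import subprocess
#   subprocess.run(["pytest"], env=replay_only_env(), check=True)
```

The OpenAI Python client may still insist on a non-empty `api_key` when it is constructed, so a placeholder value passed explicitly may be needed even though no request ever leaves the machine.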
What if the model reworded its answer?
That's benign — the gate ignores wording and only blocks real behaviour changes.
Is my data used to train anything?
No. Cassettes are never used for cross-customer training unless you explicitly opt in. See the privacy policy.
Can I self-host?
The recorder is fully offline OSS. The shared registry + gate are the hosted Team tier.