Cassette docs

Record your LLM/agent API calls once, replay them in tests — fast, free, deterministic. The Team tier adds a shared registry and a GitHub PR gate that flags real behaviour regressions.

What it is

Tests that call real models are slow, cost tokens on every run, and flake because output varies. Cassette records each response to a local file on the first run, then replays it on every run after — no network, no API key, no flakiness. The recorder is free and open source (MIT).

Install

pip install cassette-sdk      # Python
npm install cassette-sdk      # Node / TypeScript

Python

from cassette.recorder import http_client
from openai import OpenAI

client = OpenAI(http_client=http_client(project="demo"))  # records → replays locally

TypeScript (vitest/jest)

import OpenAI from "openai";
import { recordingFetch } from "cassette-sdk/recorder";

const client = new OpenAI({ fetch: recordingFetch({ project: "demo" }) });

Record / replay / auto

Set CASSETTE_MODE:

Mode	Behaviour
`record`	always call the real API and save the response
`replay`	only use saved cassettes; error on a miss
`auto` (default)	replay if a cassette exists, else record — fails safe

Cassettes live in ./.cassettes as plain JSON and diff cleanly in PRs.
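
The decision logic in the table above can be sketched in plain Python. This is an illustration only, not Cassette's implementation: `call_with_cassette`, the `key` argument, and the one-file-per-key layout are hypothetical names chosen for the example.

```python
import json
import tempfile
from pathlib import Path


def call_with_cassette(key, real_call, mode="auto", cassette_dir=".cassettes"):
    """Toy record/replay wrapper (hypothetical; not the cassette-sdk API)."""
    if mode not in ("record", "replay", "auto"):
        raise ValueError(f"unknown mode: {mode!r}")

    path = Path(cassette_dir) / f"{key}.json"

    # replay: only saved cassettes are allowed, so a miss is an error.
    # auto: replay when a cassette exists, otherwise fall through to record.
    if mode == "replay" or (mode == "auto" and path.exists()):
        if not path.exists():
            raise FileNotFoundError(f"no cassette for {key!r}")
        return json.loads(path.read_text(encoding="utf-8"))

    # record (or auto with no cassette yet): call the real API and save it.
    response = real_call()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(response, indent=2, sort_keys=True), encoding="utf-8")
    return response


calls = []


def fake_model():
    # Stand-in for a real API call; counts how often it is invoked.
    calls.append(1)
    return {"text": "hello"}


with tempfile.TemporaryDirectory() as tmp:
    first = call_with_cassette("greet", fake_model, "auto", tmp)   # records
    second = call_with_cassette("greet", fake_model, "auto", tmp)  # replays
    real_calls = len(calls)  # the stand-in model was only called once
```

Writing the JSON with `sort_keys=True` and a fixed indent is one way to keep files stable between recordings so that they diff cleanly.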

Behaviour drift

The gate does not compare cassettes byte for byte. It classifies each change by which part of the LLM response changed and assigns one of three verdicts:

Verdict	Meaning
`identical`	byte-for-byte same
`benign`	only free-text wording changed (non-determinism) — fine
`regression`	tool calls, structured-output shape, or stop reason changed — blocks merge
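
As a rough sketch of how such a classification could work (the gate's actual logic is not described here), the function below compares a baseline response with a new one. The keys `text`, `tool_calls`, `stop_reason`, and `structured` are hypothetical field names chosen for the example, not the Cassette schema.

```python
import json


def shape(value):
    """Reduce a JSON value to its structure: keys and types, not contents."""
    if isinstance(value, dict):
        return {k: shape(v) for k, v in sorted(value.items())}
    if isinstance(value, list):
        return [shape(v) for v in value]
    return type(value).__name__


def verdict(baseline, candidate):
    """Classify the change between two responses (hypothetical field names)."""
    # Equal canonical serialisations stand in for a byte-for-byte match.
    if json.dumps(baseline, sort_keys=True) == json.dumps(candidate, sort_keys=True):
        return "identical"

    # Behaviour-relevant parts: any difference here counts as a regression.
    if baseline.get("tool_calls") != candidate.get("tool_calls"):
        return "regression"
    if baseline.get("stop_reason") != candidate.get("stop_reason"):
        return "regression"
    if shape(baseline.get("structured")) != shape(candidate.get("structured")):
        return "regression"

    # Everything else, such as free-text wording, is benign in this sketch.
    return "benign"


base = {
    "text": "Paris is the capital.",
    "stop_reason": "stop",
    "tool_calls": [{"name": "lookup", "arguments": {"q": "France"}}],
}
reworded = dict(base, text="The capital is Paris.")
retooled = dict(base, tool_calls=[])

print(verdict(base, base))      # identical
print(verdict(base, reworded))  # benign
print(verdict(base, retooled))  # regression
```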

Cassette format

Each interaction is stored as one portable JSON file (the ".har of agent test traffic"). See SPEC.md for the format. Importers for vcrpy and nock recordings are on the roadmap.
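
Because cassettes are plain JSON, they can be inspected with the standard library alone. The helper below is a sketch: `load_cassettes` is a hypothetical name, the `.json` file extension is an assumption, and nothing in it depends on the fields defined in SPEC.md.

```python
import json
from pathlib import Path


def load_cassettes(cassette_dir=".cassettes"):
    """Return {relative path: parsed JSON} for every .json file under the directory."""
    root = Path(cassette_dir)
    cassettes = {}
    # sorted() keeps the order stable across platforms.
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            cassettes[path.relative_to(root).as_posix()] = json.load(fh)
    return cassettes


if __name__ == "__main__":
    for name, data in load_cassettes().items():
        # Show only top-level keys (or the value type), whatever the schema is.
        summary = sorted(data) if isinstance(data, dict) else type(data).__name__
        print(name, summary)
```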

Set up the team gate

1. Subscribe on the pricing page. You will land on a welcome page that shows your CASSETTE_TOKEN.
2. Add CASSETTE_TOKEN as a repo secret (Settings → Secrets and variables → Actions).
3. Install the GitHub App on your repo (the link is on the welcome page).
4. Add the CI workflow below.

The gate posts a ✓/✗ check on each PR. A human on your side approves or rejects every behaviour change — Cassette never merges for you.

CI workflow

Copy examples/ci/cassette.yml to .github/workflows/cassette.yml:

name: cassette
on: [pull_request]
jobs:
  agent-tests:
    runs-on: ubuntu-latest
    env:
      CASSETTE_MODE: auto
      CASSETTE_PROJECT: ${{ github.repository }}
      CASSETTE_TOKEN: ${{ secrets.CASSETTE_TOKEN }}
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - run: pip install cassette-sdk pytest
      - run: pytest
      - if: always()
        run: cassette push --ref "pr-${{ github.event.number }}"

CLI reference

cassette push --ref pr-42         # upload recorded cassettes for a PR
cassette push --ref blessed       # promote to the baseline (run on main)

The CLI reads CASSETTE_PROJECT, CASSETTE_TOKEN, and CASSETTE_DIR from the environment.

FAQ

Does replay need an API key?

No. Only recording new cassettes calls the real model.

What if the model reworded its answer?

That's benign. The gate ignores wording changes and only blocks changes to tool calls, structured-output shape, or stop reason.

Is my data used to train anything?

No. Cassettes are never used for cross-customer training unless you explicitly opt in. See the privacy policy.

Can I self-host?

The recorder is open source and runs fully offline. The shared registry and the gate are part of the hosted Team tier.
