Understanding snapshot testing
Capture the output, diff on every run.
What snapshots catch that other tests don't, the false-positive trap, and where visual regression testing fits.
The idea.
A snapshot test runs the code, captures the output (rendered HTML, serialised JSON, stringified error message), and writes it to disk on first run. On every subsequent run the captured output is compared byte-for-byte against the saved version. Any divergence is a test failure — you either accept the new output ("update the snapshot") or fix the code. The technique scales to outputs too large to write assertions about by hand.
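The write-then-compare cycle described above can be sketched in a few lines. This is a minimal illustration, not any test runner's implementation: the names SnapshotStore, createMemoryStore and matchSnapshot are hypothetical, and an in-memory Map stands in for the snapshot file a real runner would keep on disk.

```typescript
// Hypothetical names throughout; this models the idea, not a real runner's API.
type SnapshotResult =
  | { status: "written" } // nothing saved yet (or update requested): output is stored
  | { status: "passed" } // saved output is identical to the new output
  | { status: "failed"; expected: string; actual: string };

interface SnapshotStore {
  read(name: string): string | undefined;
  write(name: string, value: string): void;
}

// In-memory stand-in for the snapshot file a real runner keeps on disk.
function createMemoryStore(): SnapshotStore {
  const saved = new Map<string, string>();
  return {
    read: (name) => saved.get(name),
    write: (name, value) => {
      saved.set(name, value);
    },
  };
}

function matchSnapshot(
  store: SnapshotStore,
  name: string,
  actual: string,
  update = false, // the "update the snapshot" escape hatch
): SnapshotResult {
  const expected = store.read(name);
  if (expected === undefined || update) {
    store.write(name, actual);
    return { status: "written" };
  }
  // Exact string equality: any divergence at all is a failure.
  return expected === actual
    ? { status: "passed" }
    : { status: "failed", expected, actual };
}
```

Calling matchSnapshot twice with the same output yields "written" and then "passed"; a third call with different output yields "failed" until the call is repeated with update set to true.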
Three common kinds.
Text snapshots: error messages, formatted output, generated code. The file diff is easy to review.
JSON/object snapshots: serialised data structures. A serialiser such as JSON.stringify with indentation (or, in Jest, the built-in pretty-format) handles the formatting, so reviewers see structural changes in their diff tool.
Visual snapshots: rendered screenshots of UI components, taken with tools such as Playwright, Chromatic, or Percy. They catch the bugs that text-based snapshots can't: pixel shifts, colour changes, layout breaks.
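For JSON/object snapshots, the serialised form should be deterministic, otherwise a reordering of keys shows up as a spurious diff. The sketch below is one way to get that with plain JSON.stringify; the helper names sortKeys and serialiseForSnapshot are hypothetical, and it assumes the data is plain JSON (no dates, functions, or cycles).

```typescript
// Hypothetical helpers: serialise JSON-like data with sorted keys and
// two-space indentation so the snapshot diff shows only real changes.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function sortKeys(value: Json): Json {
  if (Array.isArray(value)) {
    return value.map((item) => sortKeys(item)); // array order is meaningful, keep it
  }
  if (value !== null && typeof value === "object") {
    const sorted: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = sortKeys(value[key]);
    }
    return sorted;
  }
  return value; // primitives pass through unchanged
}

function serialiseForSnapshot(value: Json): string {
  return JSON.stringify(sortKeys(value), null, 2);
}
```

Two objects with the same content but different key order serialise to the same string, so only a genuine structural change reaches the reviewer.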
The false-positive trap.
A team that snapshots everything ends up with hundreds of failures on every PR — a colour change, a copy edit, a date-format tweak — and the discipline of reviewing snapshots collapses into "press u to update them all". Bad snapshots are worse than no snapshots, because they erode trust. Reserve snapshots for outputs that should change rarely and where unexpected change is meaningful: error messages, public API response shapes, complex rendered components in canonical states.
A worked snapshot.
A React component renders a price label: $1,234.56. The test: expect(render(<Price value={1234.56} />)).toMatchSnapshot(). First run: saves __snapshots__/Price.test.tsx.snap with the rendered HTML. Second run: passes. A developer then changes the formatter so it outputs $1234.56 (no comma): the test fails, the snapshot diff shows the change, and the reviewer rejects it. The snapshot protects the locale-aware formatting, a detail the developer would likely have forgotten to cover with a hand-written assertion.
Figure: Price component, rendered → saved → diffed. One assertion; the diff does the work. $1,234.56 is saved; $1234.56 fails. Result: formatting drift caught.
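The formatting drift in the worked example can be reproduced without a test runner. The sketch below is an assumption-laden stand-in: formatPrice and renderPrice are hypothetical, the en-US locale and USD currency are chosen only to reproduce the $1,234.56 label, and a plain string comparison plays the role of toMatchSnapshot().

```typescript
// Hypothetical stand-in for the Price component's formatter. The en-US locale
// and USD currency are assumptions chosen to reproduce the $1,234.56 label.
function formatPrice(value: number, useGrouping = true): string {
  return new Intl.NumberFormat("en-US", {
    style: "currency",
    currency: "USD",
    useGrouping, // false simulates the change that drops the thousands separator
  }).format(value);
}

// Rendered output as a plain string; a real test would snapshot the component's HTML.
function renderPrice(value: number, useGrouping = true): string {
  return `<span class="price">${formatPrice(value, useGrouping)}</span>`;
}

// First run: this output becomes the saved snapshot.
const savedPriceSnapshot = renderPrice(1234.56);

// Later run, after the formatter change: the comma is gone.
const priceAfterChange = renderPrice(1234.56, false);

// The comparison a snapshot test performs; false here means the test fails.
const priceSnapshotMatches = savedPriceSnapshot === priceAfterChange;
```

Nothing in the test names the comma explicitly. The saved output carries that expectation, which is why a single assertion is enough.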
Visual regression is the strict cousin.
Playwright's toHaveScreenshot() takes a real PNG screenshot and compares it pixel by pixel against the baseline. Tolerance for sub-pixel rendering differences is configurable, for example through the maxDiffPixels, maxDiffPixelRatio, and threshold options. Tools like Chromatic, Percy, and Argos run the comparison on cloud browsers and host a side-by-side diff view for review. Visual regression catches things HTML snapshots can't, such as CSS specificity bugs, font swaps, and third-party stylesheet changes, at the cost of being flakier and more platform-dependent.
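To make the idea of a configurable tolerance concrete, here is a naive comparison of two raw RGBA buffers. It is a sketch of the concept only and not Playwright's algorithm, which is more sophisticated. The function compareScreenshots and the channelTolerance option are hypothetical; maxDiffPixelRatio borrows its name from the Playwright option but is reimplemented here in the simplest possible way.

```typescript
// Illustrative only: a naive per-channel comparison of two RGBA buffers.
// Names and semantics are hypothetical, not Playwright's implementation.
interface DiffOptions {
  channelTolerance: number; // max per-channel difference (0 to 255) still treated as equal
  maxDiffPixelRatio: number; // fraction of pixels allowed to differ before failing
}

interface DiffResult {
  diffPixels: number;
  totalPixels: number;
  pass: boolean;
}

function compareScreenshots(
  baseline: Uint8Array,
  actual: Uint8Array,
  options: DiffOptions,
): DiffResult {
  if (baseline.length !== actual.length || baseline.length % 4 !== 0) {
    throw new Error("Buffers must hold RGBA data of identical dimensions");
  }
  const totalPixels = baseline.length / 4;
  let diffPixels = 0;
  for (let pixel = 0; pixel < totalPixels; pixel++) {
    const offset = pixel * 4;
    for (let channel = 0; channel < 4; channel++) {
      const delta = Math.abs(baseline[offset + channel] - actual[offset + channel]);
      if (delta > options.channelTolerance) {
        diffPixels++;
        break; // count each pixel at most once
      }
    }
  }
  const ratio = totalPixels === 0 ? 0 : diffPixels / totalPixels;
  return { diffPixels, totalPixels, pass: ratio <= options.maxDiffPixelRatio };
}
```

Raising either tolerance trades sensitivity for stability: small anti-aliasing differences stop failing the build, but so do small real regressions.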
Storybook is the natural home.
A story file is already a curated set of canonical component states. Pointing snapshot testing at the Storybook stories (via the now-deprecated Storyshots addon, the Storybook test-runner, or Chromatic) gives you a per-state snapshot for every component variant with no additional test code to maintain. The story is the fixture; the snapshot is the assertion; the developer maintains one thing. That's the ergonomic peak of snapshot testing.
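The "story is the fixture" idea can be modelled without Storybook. In the sketch below every name is hypothetical: a real story exports component args rather than a render-to-string function, and priceStories, snapshotStories and changedStories only illustrate how one loop over the stories replaces per-state test code.

```typescript
// Hypothetical model of the idea; this is not Storybook's API.
type Story = () => string; // returns the rendered markup for one canonical state

const priceStories: Record<string, Story> = {
  Default: () => `<span class="price">$1,234.56</span>`,
  Zero: () => `<span class="price">$0.00</span>`,
  Discounted: () => `<span class="price price--sale">$999.00</span>`,
};

// One snapshot per story, keyed by story name; no per-story test code.
function snapshotStories(stories: Record<string, Story>): Record<string, string> {
  const snapshots: Record<string, string> = {};
  for (const name of Object.keys(stories)) {
    snapshots[name] = stories[name]();
  }
  return snapshots;
}

// Names of stories that are new or whose output no longer matches the baseline.
function changedStories(
  baseline: Record<string, string>,
  current: Record<string, string>,
): string[] {
  return Object.keys(current).filter((name) => baseline[name] !== current[name]);
}
```

Adding a new state means adding one story; the loop picks it up and reports it as changed until its snapshot is accepted.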