Visual unit tests for software agents.

Write a spec. Record the browser. Gemini judges the visual experience. Your agent gets a verdict and knows exactly what to fix.

Demo: the spec homepage.yaml (id: homepage-motion-graphic, url: http://localhost:3000) is recorded with Playwright, Gemini parses the video frames, and the visual unit test verdict is PASS. All visual checks passed. No regressions detected.

01 Spec: Steps + checks in YAML
02 Record: Playwright captures video
03 Analyze: Gemini watches the session
04 Verdict: Pass/fail per check with fixes

How it works

Three systems, three jobs. Your agent orchestrates. Playwright records. OpenLook judges.

1. Write the spec
Define steps and checks in a YAML file. Steps tell the agent what to do in the browser. Checks tell Gemini what to evaluate from the recording.

2. Record the session
The agent calls openlook_prepare_run, then drives the browser through Playwright MCP while recording every interaction as video.

3. Get the verdict
The agent calls openlook_review. Gemini watches the recording, evaluates each check, and returns pass/fail with reasoning and fixes.

FAIL: 1 of 2 checks passed

pass  value-prop-clear
      Hero explains the product and who it is for.
fail  cta-visible
      Two buttons compete for attention. No dominant CTA.
      Fix: Increase contrast on primary CTA. Remove or demote the secondary action.
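
To make the verdict concrete, here is a minimal Python sketch of how an agent might roll up a result like the one above. The actual output format of openlook_review is not documented here, so the CheckResult fields (id, status, reason, fix) and the helper name summarize_verdict are assumptions for illustration only, not OpenLook's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    """One evaluated check. Field names are assumed, not OpenLook's schema."""
    id: str
    status: str  # "pass" or "fail"
    reason: str
    fix: Optional[str] = None


def summarize_verdict(results: list[CheckResult]) -> dict:
    """Roll per-check results up into an overall verdict and a fix list."""
    passed = [r for r in results if r.status == "pass"]
    failed = [r for r in results if r.status != "pass"]
    return {
        # The run fails as soon as any single check fails.
        "verdict": "PASS" if not failed else "FAIL",
        "summary": f"{len(passed)} of {len(results)} checks passed",
        # Only failed checks that carry a suggested fix are listed.
        "fixes": {r.id: r.fix for r in failed if r.fix},
    }


results = [
    CheckResult("value-prop-clear", "pass",
                "Hero explains the product and who it is for."),
    CheckResult("cta-visible", "fail",
                "Two buttons compete for attention. No dominant CTA.",
                fix="Increase contrast on primary CTA. "
                    "Remove or demote the secondary action."),
]

report = summarize_verdict(results)
print(report["verdict"], "-", report["summary"])
for check_id, fix in report["fixes"].items():
    print(f"{check_id}: {fix}")
```

Keying fixes by check id lets the agent map each suggested fix back to the check that produced it.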

Spec format

Where to go. What to do. What to check. Specs live in .openlook/ in your project root.

id: homepage-first-impression
url: http://localhost:3000

steps:
  - Open the homepage
  - Observe the first viewport without scrolling
  - Scroll once to see supporting content

checks:
  - id: value-prop-clear
    question: Can the user understand the product value?
    pass: The hero explains the product, who it is for, and why it matters.
    fail: The hero is vague, generic, or does not explain the product.

  - id: primary-action-visible
    question: Is the primary next action visually obvious?
    pass: One visually dominant CTA is easy to find near the hero.
    fail: No clear CTA, or multiple competing actions with equal weight.

id      Unique test name
url     Page to test
steps   Browser actions to perform
checks  Visual assertions for Gemini
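
As a rough illustration of the format, the Python sketch below checks that a spec, already parsed from YAML into a dictionary, carries the four top-level fields and that each check has an id, question, pass, and fail entry. Treating all four check fields as required is an assumption drawn from the example above; validate_spec is a hypothetical helper and not part of OpenLook.

```python
REQUIRED_SPEC_FIELDS = ("id", "url", "steps", "checks")
# Assumed from the example spec; OpenLook may accept fewer fields.
REQUIRED_CHECK_FIELDS = ("id", "question", "pass", "fail")


def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec looks complete."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_SPEC_FIELDS if not spec.get(name)]
    seen_ids = set()
    for index, check in enumerate(spec.get("checks") or []):
        for name in REQUIRED_CHECK_FIELDS:
            if not check.get(name):
                problems.append(f"checks[{index}] missing field: {name}")
        check_id = check.get("id")
        # Check ids are reported back in the verdict, so they should be unique.
        if check_id and check_id in seen_ids:
            problems.append(f"duplicate check id: {check_id}")
        seen_ids.add(check_id)
    return problems


spec = {
    "id": "homepage-first-impression",
    "url": "http://localhost:3000",
    "steps": [
        "Open the homepage",
        "Observe the first viewport without scrolling",
        "Scroll once to see supporting content",
    ],
    "checks": [
        {
            "id": "value-prop-clear",
            "question": "Can the user understand the product value?",
            "pass": "The hero explains the product, who it is for, and why it matters.",
            "fail": "The hero is vague, generic, or does not explain the product.",
        },
    ],
}

print(validate_spec(spec))
```

Running a check like this before recording can save a browser session on a spec that Gemini could not evaluate anyway.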

Install

Choose your path: bootstrap in one click using a coding agent, or configure each step manually.

Option A — Recommended

Agent Bootstrap

Paste a single, comprehensive setup prompt into your agent. It will install the OpenLook skill, register both MCP servers (OpenLook and Playwright), and write visual testing rules to AGENTS.md in your project root.

Install OpenLook skill: Integrates browser visual testing workflows directly into your agent.
Configure MCP servers: Adds Playwright and OpenLook review tools to the agent environment.
Write AGENTS.md rules: Establishes visual unit testing as a standard for all future agent sessions.

Option B — Traditional

Manual Setup

Run the installation commands, define the required MCP servers in your environment, and write your first visual spec.

1. Add OpenLook skill
bunx skills add system1970/openlook-web
2. Configure MCP servers
{
  "mcpServers": {
    "openlook": {
      "command": "bunx",
      "args": ["-y", "openlook"],
      "env": { "GEMINI_API_KEY": "your_key" }
    },
    "playwright": {
      "command": "bunx",
      "args": ["-y", "@playwright/mcp@latest", "--caps=devtools"]
    }
  }
}
Visual specs live in .openlook/.
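
Before starting an agent session, it can help to sanity-check the MCP configuration. The Python sketch below parses the JSON shown in step 2 and reports a missing server entry or a GEMINI_API_KEY that is still the placeholder. Where the config file lives depends on your agent, so the example embeds the JSON as a string; check_mcp_config is a hypothetical helper, not an OpenLook command.

```python
import json

# The config from step 2, embedded as text so the sketch is self-contained.
CONFIG_TEXT = """
{
  "mcpServers": {
    "openlook": {
      "command": "bunx",
      "args": ["-y", "openlook"],
      "env": { "GEMINI_API_KEY": "your_key" }
    },
    "playwright": {
      "command": "bunx",
      "args": ["-y", "@playwright/mcp@latest", "--caps=devtools"]
    }
  }
}
"""


def check_mcp_config(config: dict) -> list[str]:
    """Return setup problems found in an MCP config dict (empty list = OK)."""
    servers = config.get("mcpServers", {})
    problems = [f"missing MCP server: {name}"
                for name in ("openlook", "playwright") if name not in servers]
    key = servers.get("openlook", {}).get("env", {}).get("GEMINI_API_KEY", "")
    # "your_key" is the placeholder from the sample config above.
    if "openlook" in servers and key in ("", "your_key"):
        problems.append("GEMINI_API_KEY is unset or still the placeholder")
    return problems


problems = check_mcp_config(json.loads(CONFIG_TEXT))
print(problems)
```

With the sample config unchanged, the only reported problem is the placeholder key, which is the one value you must supply yourself.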

What OpenLook catches

Problems that pass every unit test but fail every user.

The CTA renders but is visually invisible
The page loads but overwhelms the user
The form works but nobody knows what to do next
The layout is correct but feels broken
The onboarding flow completes but confuses at every step
The hero exists but communicates nothing