Visual unit tests for software agents.

Write a spec. Record the browser. Gemini judges the visual experience. Your agent gets a verdict and knows exactly what to fix.

Input Spec: homepage.yaml (lines 1-7)

id: homepage-motion-graphic
url: http://localhost:3000
steps:
- Observe continuous loop animation
- Assert verdict fade-in easing
checks:
- id: verdict-aesthetic-correct

PLAYWRIGHT RECORDING

div.cta-btn

Simulating interactive user actions...

Gemini parsing video sequence frames...

Visual Unit Test Verdict

PASS

All visual checks passed. No regressions detected.

Spec

Steps + checks in YAML

Record

Playwright captures video

Analyze

Gemini watches the session

Verdict

Pass/fail per check with fixes

How it works

Three systems, three jobs. Your agent orchestrates. Playwright records. OpenLook judges.

Write the spec

Define steps and checks in a YAML file. Steps tell the agent what to do in the browser. Checks tell Gemini what to evaluate from the recording.

Record the session

The agent calls openlook_prepare_run, then drives the browser through Playwright MCP while recording every interaction as video.

Get the verdict

The agent calls openlook_review. Gemini watches the recording, evaluates each check, and returns pass/fail with reasoning and fixes.

FAIL: 1 of 2 checks passed

pass: value-prop-clear

Hero explains the product and who it is for.

fail: cta-visible

Two buttons compete for attention. No dominant CTA.

Fix: Increase contrast on primary CTA. Remove or demote the secondary action.
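An agent that receives a verdict like the one above needs to turn it into a headline and a to-do list. The following is a minimal sketch of that step, not OpenLook's actual output schema: the CheckResult class, its field names, and the summarize helper are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    """One check outcome. Field names are hypothetical, not OpenLook's schema."""
    check_id: str
    passed: bool
    reasoning: str
    fix: Optional[str] = None  # suggested fix, expected mainly on failures


def summarize(results: list[CheckResult]) -> str:
    """Return an overall headline plus one line per failed check."""
    passed = sum(1 for r in results if r.passed)
    # An empty result list is treated as a failure: nothing was verified.
    overall = "PASS" if results and passed == len(results) else "FAIL"
    lines = [f"{overall}: {passed} of {len(results)} checks passed"]
    for r in results:
        if not r.passed:
            lines.append(f"fix {r.check_id}: {r.fix or 'no fix suggested'}")
    return "\n".join(lines)


# The two checks from the example verdict, re-entered by hand.
results = [
    CheckResult("value-prop-clear", True,
                "Hero explains the product and who it is for."),
    CheckResult("cta-visible", False,
                "Two buttons compete for attention. No dominant CTA.",
                fix="Increase contrast on primary CTA."),
]
print(summarize(results))
```

Keeping the fix next to the failed check id lets the agent map each failure to a concrete change without re-reading the full reasoning.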

Spec format

Where to go. What to do. What to check. Specs live in .openlook/ in your project root.

id: homepage-first-impression
url: http://localhost:3000

steps:
  - Open the homepage
  - Observe the first viewport without scrolling
  - Scroll once to see supporting content

checks:
  - id: value-prop-clear
    question: Can the user understand the product value?
    pass: The hero explains the product, who it is for, and why it matters.
    fail: The hero is vague, generic, or does not explain the product.

  - id: primary-action-visible
    question: Is the primary next action visually obvious?
    pass: One visually dominant CTA is easy to find near the hero.
    fail: No clear CTA, or multiple competing actions with equal weight.

id

Unique test name

url

Page to test

steps

Browser actions to perform

checks

Visual assertions for Gemini
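Before handing a spec to an agent, it can help to confirm that the four top-level fields are present. The sketch below works on a spec that has already been parsed into a Python dict (a YAML parser such as PyYAML could do the parsing, which is an assumption here and not part of the sketch). It also assumes every check carries the id, question, pass, and fail keys shown in the example; OpenLook itself may accept less. The validate_spec helper is hypothetical.

```python
REQUIRED_TOP_LEVEL = ("id", "url", "steps", "checks")
# Assumption: mirrors the keys used in the example spec, not a documented schema.
REQUIRED_CHECK_KEYS = ("id", "question", "pass", "fail")


def validate_spec(spec: dict) -> list[str]:
    """Return human-readable problems; an empty list means no problems found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if not spec.get(key):
            problems.append(f"missing or empty field: {key}")

    seen_ids = set()
    for index, check in enumerate(spec.get("checks") or [], start=1):
        if not isinstance(check, dict):
            problems.append(f"check #{index} is not a mapping")
            continue
        for key in REQUIRED_CHECK_KEYS:
            if not check.get(key):
                problems.append(f"check #{index} is missing: {key}")
        check_id = check.get("id")
        if check_id is not None and check_id in seen_ids:
            problems.append(f"duplicate check id: {check_id}")
        seen_ids.add(check_id)
    return problems


# The homepage example, written out as the dict a YAML parser would produce.
spec = {
    "id": "homepage-first-impression",
    "url": "http://localhost:3000",
    "steps": [
        "Open the homepage",
        "Observe the first viewport without scrolling",
        "Scroll once to see supporting content",
    ],
    "checks": [
        {
            "id": "value-prop-clear",
            "question": "Can the user understand the product value?",
            "pass": "The hero explains the product, who it is for, and why it matters.",
            "fail": "The hero is vague, generic, or does not explain the product.",
        },
    ],
}
print(validate_spec(spec))
```

Returning a list of problems instead of raising on the first one lets an agent fix a malformed spec in a single pass.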

Install

Choose your path: bootstrap in one click with a coding agent, or configure each step manually.

Option A — Recommended

Agent Bootstrap

Paste a single, comprehensive setup prompt into your agent. It will install the openlook skill, register both MCP servers, and write visual testing rules for agents to AGENTS.md in your project root.

Install OpenLook skill

Integrates browser visual testing workflows directly into your agent.

Configure MCP servers

Adds Playwright and OpenLook review tools to the agent environment.

Write AGENTS.md rules

Establishes visual unit testing as a standard for all future agent sessions.

Option B — Traditional

Manual Setup

Run the installation commands, define the required MCP servers in your environment, and write your first visual spec.

1. Add OpenLook skill

bunx skills add system1970/openlook-web

2. Configure MCP servers

{
  "mcpServers": {
    "openlook": {
      "command": "bunx",
      "args": ["-y", "openlook"],
      "env": { "GEMINI_API_KEY": "your_key" }
    },
    "playwright": {
      "command": "bunx",
      "args": ["-y", "@playwright/mcp@latest", "--caps=devtools"]
    }
  }
}
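A common slip with a config like this is leaving the placeholder API key in place or omitting one of the two servers. The following is a small sanity check under stated assumptions: it reads the JSON from a string because the location of the config file depends on the agent and is not specified here, and the check_mcp_config helper is hypothetical, not an OpenLook command.

```python
import json

# The configuration from this section, inlined for the example.
CONFIG_TEXT = """
{
  "mcpServers": {
    "openlook": {
      "command": "bunx",
      "args": ["-y", "openlook"],
      "env": { "GEMINI_API_KEY": "your_key" }
    },
    "playwright": {
      "command": "bunx",
      "args": ["-y", "@playwright/mcp@latest", "--caps=devtools"]
    }
  }
}
"""


def check_mcp_config(text: str) -> list[str]:
    """Return a list of problems found in an MCP server configuration."""
    servers = json.loads(text).get("mcpServers", {})
    problems = []
    for name in ("openlook", "playwright"):
        if name not in servers:
            problems.append(f"missing server: {name}")
    key = servers.get("openlook", {}).get("env", {}).get("GEMINI_API_KEY", "")
    if not key or key == "your_key":
        problems.append("GEMINI_API_KEY is empty or still the placeholder")
    return problems


print(check_mcp_config(CONFIG_TEXT))
```

As written, the check flags the placeholder key; once a real key is substituted, the list comes back empty.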

Visual specs live in .openlook/

What OpenLook catches

Problems that pass every unit test but fail every user.

—The CTA renders but is visually invisible

—The page loads but overwhelms the user

—The form works but nobody knows what to do next

—The layout is correct but feels broken

—The onboarding flow completes but confuses at every step

—The hero exists but communicates nothing