E2E Testing with SpecSoloist
This guide explains how to write specs that generate Playwright end-to-end tests, and how to configure the arrangement for browser-based testing.
When to Use E2E Tests
Unit tests (pytest + Starlette TestClient, vitest) run in-process and cover logic, API routes, and component rendering. They cannot test:
- Page navigation and route transitions
- HTMX interactions in a browser context
- Form submission with real DOM events
- Hydration and JavaScript interactivity
- Visual regressions
Playwright tests cover these gaps by driving a real browser against a running server.
Separate Arrangements for E2E
Unit tests and E2E tests should be separate sp conduct targets with separate
arrangements. This keeps the slow browser-based tests out of the fast unit-test loop.
| Arrangement | Tests | Command | Speed |
|---|---|---|---|
| arrangement.yaml | Unit tests (pytest / vitest) | sp conduct specs/ | Fast (~seconds) |
| arrangement.e2e.yaml | E2E tests (pytest-playwright / Playwright) | sp conduct specs/e2e/ | Slow (~minutes) |
Keep E2E specs in a separate directory (e.g., specs/e2e/) or mark them clearly with
a naming convention (e.g., e2e_*.spec.md).
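As an illustration of the two conventions above, the following sketch classifies spec paths as unit or E2E. The helper name is_e2e_spec is hypothetical and not part of SpecSoloist; it only encodes the directory and filename rules described in this section.

```python
from pathlib import PurePosixPath


def is_e2e_spec(path: str) -> bool:
    """Return True if a spec path follows either E2E convention from this guide."""
    p = PurePosixPath(path)
    in_e2e_dir = "e2e" in p.parts[:-1]  # e.g. specs/e2e/todos.spec.md
    has_prefix = p.name.startswith("e2e_") and p.name.endswith(".spec.md")
    return in_e2e_dir or has_prefix


specs = [
    "specs/layout.spec.md",
    "specs/e2e/todos.spec.md",
    "specs/e2e_todos.spec.md",
]
unit_specs = [s for s in specs if not is_e2e_spec(s)]
e2e_specs = [s for s in specs if is_e2e_spec(s)]
```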
Selector Contract: data-testid
Playwright tests should use stable, semantic selectors rather than CSS classes or
XPath. The convention is data-testid attributes, selected as [data-testid="..."].
When a spec describes a UI component, include the data-testid for every interactive
element in the spec body. This creates an explicit contract between the implementation
spec and the E2E test spec:
In the component spec (layout.spec.md):
## `todo_item(text: str, index: int) -> Li`
Returns a single `<li data-testid="todo-item">` containing:
- The todo text in a `<span data-testid="todo-text">`
- A delete button `<button data-testid="delete-btn" hx_delete="/todos/{index}">`
The E2E test spec (e2e_todos.spec.md) then refers to the same data-testid values.
The implementation picks up data-testid from the component spec; the E2E test picks
up the selector from the E2E spec. The contract is explicit and shared.
Test Scenario Format
E2E specs use a # Test Scenarios section with a table of user journeys. Each row
describes a complete interaction from the user's perspective — not a function call.
# Test Scenarios
| Scenario | Steps | Expected |
|----------|-------|----------|
| Page loads | Navigate to / | Page title is "Todos", list is empty |
| Add todo | Fill "Buy milk" in input, click Add | "Buy milk" appears in #todo-list |
| Delete todo | Add "Buy milk", click delete button | Item disappears from list |
| Empty add | Click Add with empty input | List unchanged |
Soloists translate each row into a test('...', async ({ page }) => { ... }) block
(TypeScript/Playwright) or a def test_...(page) function (pytest-playwright).
Rules for good test scenarios:
- One user action per scenario — don't combine unrelated interactions.
- Describe the user's intent, not the HTTP call ("Click Add", not "POST /todos").
- Include setup in Steps — if a scenario needs existing data, describe creating it.
- Expected is observable — visible text, URL change, element presence/absence.
Dev Server Setup
Playwright tests require a running server. Two approaches:
Option A: Subprocess fixture (recommended for FastHTML / Python)
The test file starts the server as a subprocess, waits for it to be ready, then tears it down after all tests complete. This is fully self-contained:
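import subprocess
import time

import pytest
import requests


@pytest.fixture(scope="session")
def live_server():
    proc = subprocess.Popen(
        ["python", "src/routes.py"],
        cwd=...,  # elided in the original; set to the app's project directory
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Poll until the server answers (up to 20 attempts, 0.5 s apart)
    for _ in range(20):
        try:
            requests.get("http://localhost:5001/", timeout=1)
            break
        except requests.RequestException:
            time.sleep(0.5)
    else:
        # The loop never hit `break`: the server did not come up
        proc.terminate()
        raise RuntimeError("server did not start after 20 attempts")
    yield "http://localhost:5001"
    # Teardown: stop the server after the last test in the session
    proc.terminate()
    proc.wait()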
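If adding requests as a dependency is undesirable, the readiness check can be sketched with the standard library alone. The helper wait_for_port below is a hypothetical name; it only confirms that something accepts TCP connections on the port, which is a weaker signal than a successful HTTP response.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not up yet
            time.sleep(interval)
    return False
```

Inside a fixture this could be used as `if not wait_for_port("localhost", 5001): raise RuntimeError(...)` before yielding the base URL.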
Option B: playwright.config.ts webServer (Next.js)
For Next.js, use Playwright's built-in webServer option in playwright.config.ts:
// playwright.config.ts (generated by soloist)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    command: 'npm run dev',
    port: 3000,
    reuseExistingServer: !process.env.CI,
  },
  testDir: './e2e',
  use: {
    baseURL: 'http://localhost:3000',
  },
});
Include playwright.config.ts generation in your arrangement's setup_commands or
as a generated file via config_files.
Arrangement Configuration
FastHTML (Python) + pytest-playwright
# arrangement.e2e.yaml
target_language: python
output_paths:
  implementation: src/{name}.py
  tests: tests/test_{name}.py
environment:
  tools:
    - uv
    - pytest
  setup_commands:
    - uv sync
    - uv run playwright install --with-deps
  dependencies:
    pytest-playwright: ">=0.5"
    pytest: ">=7.0"
    requests: ">=2.28"
build_commands:
  compile: ""
  lint: ""
  test: uv run pytest tests/test_e2e_*.py -v
constraints:
  - Tests use pytest-playwright; the `page` fixture is provided by the plugin (no import needed)
  - Start the FastHTML server as a subprocess in a session-scoped fixture
  - Wait for the server to be ready before running tests (poll /health or /)
  - Use data-testid selectors via page.locator('[data-testid="..."]')
  - Run headed=False by default (headless mode for CI)
Next.js + Playwright
Use the nextjs-playwright template: sp init --template nextjs-playwright.
It includes playwright.config.ts generation and an npx playwright test
build command.
Mocking External APIs
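E2E tests should mock calls to external services (OpenAI, databases, payment
providers) using page.route(), which is available in both the TypeScript Playwright
API and pytest-playwright. page.route() intercepts requests made by the browser, so
the mock sits at the endpoint the page calls (here, the app's /api/chat route):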
def test_ai_response(page, live_server):
    # Mock the app's chat endpoint so no real OpenAI call is made
    page.route("**/api/chat", lambda route: route.fulfill(
        status=200,
        content_type="text/plain",
        body="Mocked AI response",
    ))
    page.goto(f"{live_server}/chat")
    page.fill('[data-testid="message-input"]', "Hello")
    page.click('[data-testid="send-btn"]')
    # Wait for the reply to render before asserting on its text
    page.wait_for_selector('[data-testid="ai-message"]')
    assert "Mocked AI response" in page.text_content('[data-testid="ai-message"]')
Spec the mock boundary in the E2E test spec's # Test Scenarios table:
| AI chat | Fill "Hello", click Send (mock returns "Mocked AI response") | "Mocked AI response" appears in chat |
The parenthetical (mock returns "...") tells the soloist to set up a route mock for
that scenario.
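When several scenarios carry a (mock returns "...") note, the route handler can be factored into a small factory. This is a sketch only: mock_response is a hypothetical helper name, and it relies on nothing beyond Playwright's route.fulfill(status=..., content_type=..., body=...).

```python
def mock_response(body: str, content_type: str = "text/plain", status: int = 200):
    """Return a route handler that fulfills every matched request with a fixed body."""
    def handler(route):
        route.fulfill(status=status, content_type=content_type, body=body)
    return handler
```

A test would then register it with `page.route("**/api/chat", mock_response("Mocked AI response"))` before navigating to the page.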
Headless Mode and CI
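Playwright runs headless by default. Explicitly confirm this in the arrangement's
constraints, as the "Run headed=False by default" constraint in the FastHTML
arrangement above does.
For CI pipelines, the arrangement's setup_commands should install Playwright's
browser binaries, as uv run playwright install --with-deps does in the same arrangement.
Use chromium only (not all browsers) to keep CI fast, for example by passing the
browser name: uv run playwright install --with-deps chromium. Full cross-browser
testing is a separate concern.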
Complete Example: FastHTML Todo App
See examples/fasthtml_app/specs/e2e_todos.spec.md for a worked example covering:
- Subprocess server fixture pattern
- data-testid selector contract
- User-journey scenario table
- Dependency declaration
The generated test lives in examples/fasthtml_app/tests/test_e2e_todos.py.