E2E Testing with SpecSoloist

This guide explains how to write specs that generate Playwright end-to-end tests, and how to configure the arrangement for browser-based testing.

When to Use E2E Tests

Unit tests (pytest + Starlette TestClient, vitest) run in-process and cover logic, API routes, and component rendering. They cannot test:

Page navigation and route transitions
HTMX interactions in a browser context
Form submission with real DOM events
Hydration and JavaScript interactivity
Visual regressions

Playwright tests cover these gaps by driving a real browser against a running server.

Separate Arrangements for E2E

Unit tests and E2E tests should be separate sp conduct targets with separate arrangements. This keeps the slow browser-based tests out of the fast unit-test loop.

| Arrangement | Tests | Command | Speed |
|-------------|-------|---------|-------|
| `arrangement.yaml` | Unit tests (pytest / vitest) | `sp conduct specs/` | Fast (~seconds) |
| `arrangement.e2e.yaml` | E2E tests (pytest-playwright / Playwright) | `sp conduct specs/e2e/` | Slow (~minutes) |

Keep E2E specs in a separate directory (e.g., specs/e2e/) or mark them clearly with a naming convention (e.g., e2e_*.spec.md).
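
The two conventions above can be checked mechanically. The following is a minimal sketch, assuming spec paths use forward slashes; `is_e2e_spec` is a hypothetical helper for illustration and is not part of SpecSoloist.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_e2e_spec(path: str) -> bool:
    """Return True if a spec path follows either E2E convention from this guide."""
    p = PurePosixPath(path)
    in_e2e_dir = "e2e" in p.parts[:-1]               # e.g. specs/e2e/todos.spec.md
    has_e2e_name = fnmatch(p.name, "e2e_*.spec.md")  # e.g. specs/e2e_todos.spec.md
    return in_e2e_dir or has_e2e_name


# Example spec paths (hypothetical) split into the two groups.
specs = ["specs/layout.spec.md", "specs/e2e/todos.spec.md", "specs/e2e_todos.spec.md"]
e2e_specs = [s for s in specs if is_e2e_spec(s)]
unit_specs = [s for s in specs if not is_e2e_spec(s)]
```

A check like this could guard against an E2E spec accidentally landing in the fast unit-test run.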

Selector Contract: `data-testid`

Playwright tests should use stable, semantic selectors rather than CSS classes or XPath. The convention is data-testid attributes:

page.locator('[data-testid="add-btn"]')

When a spec describes a UI component, include the data-testid for every interactive element in the spec body. This creates an explicit contract between the implementation spec and the E2E test spec:

In the component spec (layout.spec.md):

## `todo_item(text: str, index: int) -> Li`

Returns a single `<li data-testid="todo-item">` containing:
- The todo text in a `<span data-testid="todo-text">`
- A delete button `<button data-testid="delete-btn" hx_delete="/todos/{index}">`

In the E2E test spec (e2e_todos.spec.md):

| Delete todo | Click `[data-testid="delete-btn"]` on first item | Item disappears |

The implementation picks up data-testid from the component spec; the E2E test picks up the selector from the E2E spec. The contract is explicit and shared.

Test Scenario Format

E2E specs use a # Test Scenarios section with a table of user journeys. Each row describes a complete interaction from the user's perspective — not a function call.

# Test Scenarios

| Scenario | Steps | Expected |
|----------|-------|----------|
| Page loads | Navigate to / | Page title is "Todos", list is empty |
| Add todo | Fill "Buy milk" in input, click Add | "Buy milk" appears in #todo-list |
| Delete todo | Add "Buy milk", click delete button | Item disappears from list |
| Empty add | Click Add with empty input | List unchanged |

Soloists translate each row into a test('...', async ({ page }) => { ... }) block (TypeScript/Playwright) or a def test_...(page) function (pytest-playwright).

Rules for good test scenarios:

- One user action per scenario: don't combine unrelated interactions.
- Describe the user's intent, not the HTTP call ("Click Add", not "POST /todos").
- Include setup in Steps: if a scenario needs existing data, describe creating it.
- Expected is observable: visible text, URL change, element presence/absence.
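
To make the row-to-test mapping concrete, here is a rough sketch of reading a scenario table and deriving pytest function names from it. It is not how SpecSoloist soloists actually work (they are not described at this level in this guide); `parse_scenarios` and `function_name_for` are hypothetical, and the parser assumes no cell contains a literal `|`.

```python
import re


def parse_scenarios(markdown: str) -> list[dict[str, str]]:
    """Parse a '| Scenario | Steps | Expected |' table into one dict per row."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the header row and the |---|---| separator row.
        if cells[0] == "Scenario" or set(cells[0]) <= set("-: "):
            continue
        if len(cells) == 3:
            rows.append(dict(zip(("scenario", "steps", "expected"), cells)))
    return rows


def function_name_for(scenario: str) -> str:
    """Turn a scenario title such as 'Add todo' into 'test_add_todo'."""
    slug = re.sub(r"[^a-z0-9]+", "_", scenario.lower()).strip("_")
    return f"test_{slug}"


TABLE = """
| Scenario | Steps | Expected |
|----------|-------|----------|
| Page loads | Navigate to / | Page title is "Todos", list is empty |
| Add todo | Fill "Buy milk" in input, click Add | "Buy milk" appears in #todo-list |
"""

rows = parse_scenarios(TABLE)
names = [function_name_for(r["scenario"]) for r in rows]
```

Each resulting name corresponds to one `def test_...(page)` function, with the Steps and Expected cells supplying the body and the assertion.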

Dev Server Setup

Playwright tests require a running server. Two approaches:

Option A: Subprocess fixture (recommended for FastHTML / Python)

The test file starts the server as a subprocess, waits for it to be ready, then tears it down after all tests complete. This is fully self-contained:

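import subprocess
import time

import pytest
import requests

@pytest.fixture(scope="session")
def live_server():
    proc = subprocess.Popen(
        ["python", "src/routes.py"],
        cwd=...,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Poll until the server accepts connections (20 attempts, 0.5 s apart)
    for _ in range(20):
        try:
            requests.get("http://localhost:5001/", timeout=1)
            break
        except requests.RequestException:
            time.sleep(0.5)
    else:
        # The server never came up: stop the process and fail fast
        proc.terminate()
        proc.wait()
        raise RuntimeError("Server did not start on http://localhost:5001")
    yield "http://localhost:5001"
    # Teardown: runs once after the last test in the session
    proc.terminate()
    proc.wait()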
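
The readiness check can also be factored into a standalone helper so that a failed startup is reported explicitly. The sketch below swaps `requests` for the standard library's `urllib` to stay dependency-free; `wait_for_server` and its parameters are hypothetical names, not part of SpecSoloist or pytest-playwright.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, attempts: int = 20, delay: float = 0.5) -> bool:
    """Poll `url` until the server answers; return False if it never does."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status; it is up.
            return True
        except OSError:
            # Connection refused or timed out: not listening yet.
            time.sleep(delay)
    return False
```

A fixture could call `wait_for_server("http://localhost:5001/")` after starting the subprocess and raise if it returns `False`, so tests never run against a server that failed to start.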

Option B: `playwright.config.ts` webServer (Next.js)

For Next.js, use Playwright's built-in webServer option in playwright.config.ts:

// playwright.config.ts (generated by soloist)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    command: 'npm run dev',
    port: 3000,
    reuseExistingServer: !process.env.CI,
  },
  testDir: './e2e',
  use: {
    baseURL: 'http://localhost:3000',
  },
});

Include playwright.config.ts generation in your arrangement's setup_commands or as a generated file via config_files.

Arrangement Configuration

FastHTML (Python) + pytest-playwright

# arrangement.e2e.yaml
target_language: python

output_paths:
  implementation: src/{name}.py
  tests: tests/test_{name}.py

environment:
  tools:
    - uv
    - pytest
  setup_commands:
    - uv sync
    - uv run playwright install --with-deps chromium
  dependencies:
    pytest-playwright: ">=0.5"
    pytest: ">=7.0"
    requests: ">=2.28"

build_commands:
  compile: ""
  lint: ""
  test: uv run pytest tests/test_e2e_*.py -v

constraints:
  - Tests use pytest-playwright; the `page` fixture is provided by the plugin (no import needed)
  - Start the FastHTML server as a subprocess in a session-scoped fixture
  - Wait for the server to be ready before running tests (poll /health or /)
  - Use data-testid selectors: page.locator('[data-testid="..."]')
  - Run headless (the pytest-playwright default); do not pass --headed

Next.js + Playwright

Use the nextjs-playwright template from sp init --template nextjs-playwright. It includes playwright.config.ts generation and the right npx playwright test build command.

Mocking External APIs

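E2E tests should not call real external services (OpenAI, databases, payment providers). Use Playwright's page.route() API, available in both TypeScript and pytest-playwright, to intercept requests and return canned responses. page.route() only sees requests made by the browser, so mock the app endpoint that fronts the external service:

def test_ai_response(page, live_server):
    # Intercept the browser's request to the chat endpoint and return a canned reply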
    page.route("**/api/chat", lambda route: route.fulfill(
        status=200,
        content_type="text/plain",
        body="Mocked AI response"
    ))
    page.goto(f"{live_server}/chat")
    page.fill('[data-testid="message-input"]', "Hello")
    page.click('[data-testid="send-btn"]')
    page.wait_for_selector('[data-testid="ai-message"]')
    assert "Mocked AI response" in page.text_content('[data-testid="ai-message"]')

Specify the mock boundary in the E2E test spec's # Test Scenarios table:

| AI chat | Fill "Hello", click Send (mock returns "Mocked AI response") | "Mocked AI response" appears in chat |

The parenthetical (mock returns "...") tells the soloist to set up a route mock for that scenario.
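
When several scenarios need a route mock, the setup can be wrapped in a small helper. This is a sketch under assumptions: `mock_plain_text` is a hypothetical name, and `page` is expected to be the pytest-playwright `page` fixture. It only wraps the `page.route()` and `route.fulfill()` calls already used in the example above.

```python
def mock_plain_text(page, url_pattern: str, body: str, status: int = 200) -> None:
    """Answer every browser request matching `url_pattern` with a canned body."""

    def handler(route):
        # Short-circuit the request; the real endpoint is never reached.
        route.fulfill(status=status, content_type="text/plain", body=body)

    page.route(url_pattern, handler)
```

A test for the "AI chat" row could then start with `mock_plain_text(page, "**/api/chat", "Mocked AI response")` before navigating to the page.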

Headless Mode and CI

Playwright runs headless by default. Explicitly confirm this in the arrangement's constraints:

constraints:
  - Run in headless mode (no --headed flag)

For CI pipelines, the arrangement's setup_commands should install Playwright's browser binaries:

setup_commands:
  - npm ci
  - npx playwright install --with-deps chromium

Use chromium only (not all browsers) to keep CI fast. Full cross-browser testing is a separate concern.

Complete Example: FastHTML Todo App

See examples/fasthtml_app/specs/e2e_todos.spec.md for a worked example covering:

- Subprocess server fixture pattern
- data-testid selector contract
- User-journey scenario table
- Dependency declaration

The generated test lives in examples/fasthtml_app/tests/test_e2e_todos.py.