Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,48 +1,17 @@
# FXA Testing Guidelines

Shared reference for `fxa-test-draft`, `fxa-test-repair`, `fxa-test-independence`, and the FXA review skills (`fxa-review`, `fxa-review-quick`). This document is the single source of truth for FXA test rules; the bullets in `CLAUDE.md` Section 8 mirror the rule set and ordering here. If the two diverge, treat this file as authoritative and update `CLAUDE.md`.

For React/TSX tests in `fxa-settings`, also apply the React-specific testing patterns in `.claude/skills/fxa-check-react/SKILL.md` (querying by role, `userEvent` over `fireEvent`, avoiding refs/instances in assertions). The rules below apply to all FXA tests; the React skill covers the additional concerns specific to component tests.

## Test layers and shift-left

FXA has three test layers, each costlier than the last:

- **Unit** (`*.spec.ts` / `*.test.ts` / `*.test.tsx`, `nx test-unit`) — pure functions, services, hooks, components in isolation. Milliseconds per test, no real I/O. The default place to test business rules.
- **Integration** (`nx test-integration`) — wiring across units, often hitting real infrastructure (MySQL, Redis). Reserve for verifying the integration itself, not for branching business logic.
- **Functional / E2E** (Playwright in `packages/functional-tests`) — user-facing flows end-to-end. Slowest and most brittle; reserve for critical happy paths and regressions.

**Shift left.** When code has nuanced business rules — validation, calculation, normalization, decision logic — extract it into a pure function or service and test it directly. Route handlers and React components should be thin shells; their tests cover wiring (auth, request/response shape, error propagation), not every branch of the underlying rule. Each branch tested at the route/component level multiplies the cost (Hapi harness, mocked DB/Redis, React provider tree); the same branch as a unit test runs in milliseconds with no harness.

```ts
// Rule extracted into a pure function; cheap and exhaustive to unit-test
export function calculateDiscount(plan: Plan, account: Account): number {
if (account.delinquentSince) return 0;
return plan.tier === 'enterprise' ? 0.2 : 0.1;
}

it.each([
{ tier: 'free', expected: 0.1 },
{ tier: 'enterprise', expected: 0.2 },
])('returns a $expected discount for the $tier tier', ({ tier, expected }) => {...});
it('returns 0 when the account is delinquent', () => {...});
---
paths:
- "**/*.test.ts"
- "**/*.test.tsx"
- "**/*.spec.ts"
---

// Route handler is a thin shell; route tests cover wiring only
async function getDiscountHandler(request) {
const account = await db.getAccount(request.auth.credentials.uid);
return { discount: calculateDiscount(account.plan, account) };
}
# FXA Testing Rules

it('returns 401 without a session token', () => {...});
it('returns the discount in the response body', () => {...});
```
Applies whenever you read or edit a Jest test file in FXA. These are the same rules the testing skills (`/fxa-test-draft`, `/fxa-test-repair`, `/fxa-test-independence`) and the review skills (`/fxa-review`, `/fxa-review-quick`) enforce — load them on every test-file touch so they apply to ad-hoc edits, not only skill-invoked workflows.

**Anti-patterns.**
- Hapi route tests covering every business-logic branch via per-fixture `server.inject()` calls — extract the rule and unit-test it.
- Component tests building provider trees to verify validation or formatters that could be pure-function / hook unit tests.
- Playwright tests asserting on copy or branch outcomes outside a critical user flow.
For React component tests (`*.test.tsx`), additional React-specific rules load from `.claude/rules/testing/react.md`.

**Signal to extract.** When a single route or component has more than ~3 tests differing only in input shape, the underlying logic should likely be extracted and unit-tested directly.
**Shift-left is a golden goal**, not a hard rule. The full guidance — which layer to test at, when to extract business logic into a pure function or hook — lives in `CLAUDE.md` Section 8 and loads at session start. While editing a test file, the practical heuristic to keep in mind: when a route or component has more than ~3 tests differing only in input shape, flag the opportunity to extract the rule and unit-test it directly.

---

Expand Down Expand Up @@ -128,7 +97,7 @@ it('throws NotFound when the account does not exist', async () => {
it('wraps a DB connection failure in AppError.backendServiceFailure', async () => {
db.getAccountByEmail.mockRejectedValue(new Error('ECONNREFUSED'));
await expect(getAccount('user@example.com')).rejects.toThrow(
AppError.backendServiceFailure() // Leverage the actual AppError thrown instead of recreating the error object
AppError.backendServiceFailure()
);
});
```
Expand Down Expand Up @@ -251,7 +220,7 @@ beforeEach(() => { account = { ...MOCK_ACCOUNT }; });

// Violation — band-aid mid-test reset hides state leaking in from another test
it('sends one email per signup', () => {
jest.clearAllMocks(); // why is this needed here? root cause is elsewhere
jest.clearAllMocks(); // why is this needed here? root cause is elsewhere
signup(MOCK_ACCOUNT);
expect(mailer.send).toHaveBeenCalledTimes(1);
});
Expand Down Expand Up @@ -375,7 +344,7 @@ When adding tests to an existing file, write new tests to these standards. Don't

**Focused tests must never be committed.** `it.only`, `describe.only`, `fit`, and `fdescribe` cause Jest to silently skip everything else in the file or suite, shrinking the safety net to a single test. A passing CI run on one focused test gives false confidence. These are debugging tools — strip them before commit.

**Skipped tests are acceptable when justified.** `it.skip`, `xit`, `xdescribe`, and `it.todo` are fine in committed code *only* when accompanied by a comment explaining **why** the test is skipped. Prefer a follow-up reference (Jira ticket, GitHub issue) so the skip is trackable; at minimum, a short rationale that lets a future reader decide whether the skip is still warranted. An unjustified `.skip` rots into permanent dead weight — if there's no plan to re-enable it and no reason worth recording, delete the test instead.
**Skipped tests are acceptable when justified.** `it.skip`, `xit`, `xdescribe`, and `it.todo` are fine in committed code *only* when accompanied by a comment explaining **why** the test is skipped. Prefer a follow-up reference (Jira ticket, GitHub issue) so the skip is trackable; at minimum, a short rationale that lets a future reader decide whether the skip is still warranted. An unjustified `.skip` rots into permanent dead weight — if there's no plan to re-enable it and no reason worth recording, delete the test instead. Suggest creating a ticket to track the work if there is no existing one.

```ts
// Violation — focused tests, never commit
Expand Down
123 changes: 123 additions & 0 deletions .claude/rules/testing/react.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
---
paths:
- "**/*.test.tsx"
---

# FXA React Test Rules

Applies whenever you read or edit a React component test (`*.test.tsx`) anywhere in FXA. Stacks on top of `.claude/rules/testing/base.md`, which covers the general FXA testing rules (naming, mocking, isolation, async, assertions). This file adds the rules specific to component tests.

---

## 1. Query like a user, not by implementation detail

Tests should select elements the way a user finds them. Reach for Testing Library queries that mirror screen-reader / keyboard / visual cues, in this order of preference:

1. `getByRole(name)` — matches accessible role and accessible name
2. `getByLabelText` — form inputs
3. `getByText` / `getByPlaceholderText` / `getByDisplayValue`
4. `getByTestId` — last resort, only when no semantic query fits

**Avoid:** `className` lookups, raw CSS selectors, refs, component instances. These bind tests to the DOM structure rather than the user-facing contract.

```tsx
// Violation — couples test to class name; brittle to CSS refactors
const button = container.querySelector('.submit-btn');

// Violation — testid used when a semantic query fits
const button = screen.getByTestId('submit-button');

// Correct — accessible role + name; mirrors how a user (or AT) finds the button
const button = screen.getByRole('button', { name: /sign in/i });
```

---

## 2. Use `userEvent`, not `fireEvent`

`userEvent` dispatches the full event sequence a real user produces (focus → keydown → input → change → blur). `fireEvent` fires a single synthetic event and misses bugs that depend on the surrounding sequence — focus traps, controlled-input handlers that read from `event.target`, debounced inputs, etc.

```tsx
// Violation — single synthetic event; misses focus / keydown / blur side effects
fireEvent.change(input, { target: { value: 'user@example.com' } });
fireEvent.click(submitButton);

// Correct — full interaction sequence
const user = userEvent.setup();
await user.type(input, 'user@example.com');
await user.click(submitButton);
```

---

## 3. Assert on rendered output, not internals

Component tests should verify what the user sees and can interact with — DOM content, accessible names, ARIA states, presence/absence of elements. Don't reach into `instance()`, `state()`, refs, or private hooks.

```tsx
// Violation — reads private internal state via an Enzyme-style escape hatch
expect(wrapper.instance().state.isSubmitting).toBe(true);

// Correct — asserts the user-observable contract
expect(screen.getByRole('button', { name: /submit/i })).toBeDisabled();
expect(screen.getByText(/submitting…/i)).toBeInTheDocument();
```

---

## 4. Prefer `findBy*` / `waitFor` over manual `act()`

Most async Testing Library helpers wrap `act()` internally. Reach for them before wrapping state updates in `act` yourself. Manual `act()` is appropriate only when no async helper fits the flow.

```tsx
// Violation — manual act() where findBy would do
act(() => {
fireEvent.click(button);
});
const message = screen.getByText(/saved/i); // may not be there yet

// Correct — findBy waits and acts internally
await user.click(button);
const message = await screen.findByText(/saved/i);
```

Un-`act`'d state updates produce console warnings and flake — treat the warning as a real failure signal, don't silence it.

---

## 5. Avoid whole-tree snapshots for non-trivial components

Large `toMatchSnapshot()` outputs get regenerated without scrutiny on every change and stop catching regressions. Assert on specific user-observable properties instead: text content, ARIA attributes, presence of specific elements.

Snapshots are still fine for **small, stable, hard-to-articulate output** — e.g., a serialized error message, a generated query string, an SVG path. Pick that case deliberately.

```tsx
// Violation — broad snapshot; reviewers rubber-stamp regenerations
expect(container).toMatchSnapshot();

// Correct — assert what the test actually cares about
expect(screen.getByRole('heading', { name: /welcome/i })).toBeInTheDocument();
expect(screen.getByRole('button', { name: /continue/i })).toBeEnabled();
```

---

## 6. Extract provider boilerplate

If every test in the suite wraps the component in the same chain (`MockedProvider`, `IntlProvider`, Router, theme), extract a `renderWithProviders(ui, options)` helper in a test util and reuse it. Inline duplication invites drift — one test gets a tweaked provider config, then assertions diverge for reasons unrelated to the component.

```tsx
// Violation — same provider tree, copy-pasted across every test
render(
<MockedProvider mocks={mocks}>
<IntlProvider locale="en">
<Router>
<MyComponent {...props} />
</Router>
</IntlProvider>
</MockedProvider>
);

// Correct — one helper, used everywhere
renderWithProviders(<MyComponent {...props} />, { mocks });
```
27 changes: 1 addition & 26 deletions .claude/skills/fxa-check-react/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -135,32 +135,7 @@ Focus on `.tsx`, `.ts`, `.jsx`, `.js` files. Read additional context from relate

### 5. Testing Patterns (React/TSX)

These are the React-specific test concerns to apply on top of the general FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md`. Apply both: the shared rules cover assertions, mocking, isolation, and async correctness; this section covers what's unique to component tests.

**Querying by implementation detail**
- Tests selecting by `className`, CSS selectors, `data-testid` (when a semantic query exists), refs, or component instances
- Recommendation: prefer Testing Library queries that mirror how users find elements — `getByRole` (with `name` option), `getByLabelText`, `getByText`. `data-testid` is a last resort when no semantic query fits.

**`fireEvent` instead of `userEvent`**
- `fireEvent.click(...)`, `fireEvent.change(...)` for user interactions
- Recommendation: use `userEvent` (`@testing-library/user-event`); it dispatches the full event sequence (focus → keydown → input → change → blur) that real interactions produce, surfacing bugs `fireEvent` misses

**Asserting on component internals**
- Reading `instance()`, `state()`, refs, or private hooks via Enzyme-style or test-only escape hatches
- Recommendation: assert on rendered output (DOM, accessible names, ARIA states) — the user-visible contract, not the implementation

**Missing or misused `act()`**
- State updates in tests not wrapped in `act` produce console warnings and can cause flakiness
- Most async Testing Library helpers (`findBy*`, `waitFor`) wrap `act` internally — reach for them before manually wrapping
- Recommendation: prefer `findBy*` / `waitFor` for async UI changes; only manually `act(() => ...)` when no helper fits

**Snapshot tests for non-trivial components**
- Large `toMatchSnapshot()` outputs that get regenerated without scrutiny on every change
- Recommendation: assert on specific user-observable properties (text content, ARIA attributes, presence of specific elements) instead of whole-tree snapshots; reserve snapshots for small, stable, hard-to-articulate output (e.g., serialized error messages)

**Provider boilerplate duplicated per test**
- Each test re-wrapping the component in the same chain of `MockedProvider` / `IntlProvider` / Router providers
- Recommendation: extract a `renderWithProviders(ui, options)` helper in a test util; reuse across the suite
When reviewing `*.test.tsx` changes, apply the React testing rules in `.claude/rules/testing/react.md` in addition to the general FXA rules in `.claude/rules/testing/base.md`. Both rules auto-load when Claude reads a matching test file; this skill's pre-flight should explicitly `Read` them too so reviews cover test changes consistently. Headline triggers to flag: implementation-detail queries (className/CSS/data-testid where semantic queries fit), `fireEvent` where `userEvent` belongs, asserting on instances/refs/private state, missing or misused `act()`, whole-tree snapshots for non-trivial components, and duplicated provider boilerplate.

---

Expand Down
4 changes: 2 additions & 2 deletions .claude/skills/fxa-review-quick/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ Evaluate the diff through these lenses, in order of priority:

**4. Tests**

Apply the FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md` (read it before reviewing test changes). Common quick-review triggers:
Apply the FXA testing rules in `.claude/rules/testing/base.md` and (for `*.test.tsx` changes) `.claude/rules/testing/react.md` — read them before reviewing test changes. Common quick-review triggers:

- New auth-server source files without co-located `*.spec.ts`; fxa-settings uses `*.test.tsx` convention
- `jest.clearAllMocks()` in `beforeEach` — redundant only when the package's `jest.config.*` sets `clearMocks: true` (currently `fxa-auth-server` and `fxa-profile-server`); do not flag in other packages without checking their config
Expand All @@ -80,7 +80,7 @@ Apply the FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md`
- Flag patterns likely to cause open handle warnings (unclosed connections, uncleared timers)
- Flag missing `act()` wrapping in React test state updates
- Committed `it.only`, `describe.only`, `fit`, `fdescribe` — silently skips the rest of the suite, never acceptable. `it.skip`, `xit`, `xdescribe`, `it.todo` — flag only when there is no comment explaining the skip; a justified skip (ideally with a Jira/issue reference) is acceptable
- For `*.test.tsx` in `fxa-settings`: also apply `.claude/skills/fxa-check-react/SKILL.md` Section 5 (querying by role, `userEvent` over `fireEvent`, no asserting on refs/instances)
- For `*.test.tsx` in `fxa-settings`: also apply `.claude/rules/testing/react.md` (querying by role, `userEvent` over `fireEvent`, no asserting on refs/instances)
- Route/component tests with many fixtures-per-branch (>~3 tests differing only in input shape) — heavy branching at the route/component level signals the rule should be extracted and unit-tested directly; reserve route/component tests for wiring

**5. Database Migrations**
Expand Down
6 changes: 3 additions & 3 deletions .claude/skills/fxa-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ Output JSON array with fields: severity, category ("Logic/Bugs"), subcategory, f
- model: opus
- description: FXA test quality review

Tell this agent it is a QA engineer who understands FXA's testing patterns and common flakiness causes. Tell it to read `.claude/skills/fxa-testing-shared/GUIDELINES.md` before reviewing — that file is the authoritative rule set; the bullets below are the FXA-specific triggers to apply on top of it.
Tell this agent it is a QA engineer who understands FXA's testing patterns and common flakiness causes. Tell it to read `.claude/rules/testing/base.md` before reviewing (and `.claude/rules/testing/react.md` if reviewing any `*.test.tsx`) — those rules are the authoritative rule set; the bullets below are the FXA-specific triggers to apply on top of them.

- Check new auth-server source files have co-located `*.spec.ts`; fxa-settings uses `*.test.tsx` convention
- Flag new Mocha tests — all new tests must be Jest
Expand All @@ -125,8 +125,8 @@ Tell this agent it is a QA engineer who understands FXA's testing patterns and c
- Flag `forEach` / `for` loops with `expect` calls inside a single `it()` body — use `it.each` so each case is named and independently reported
- Flag committed `it.only`, `describe.only`, `fit`, `fdescribe` (silently skips the rest of the suite — CRITICAL/HIGH if it lands in CI). For `it.skip`, `xit`, `xdescribe`, `it.todo`: only flag when there is no adjacent comment explaining the skip — a justified skip with a Jira/issue reference is acceptable per the guidelines
- Flag when there is a jira ticket number in a test name. We do not put tickets in test names.
- For `*.test.tsx` files in `fxa-settings` and other React packages: also apply the React testing patterns in `.claude/skills/fxa-check-react/SKILL.md` Section 5 (queries by role, `userEvent` over `fireEvent`, no asserting on instances/refs, snapshot scope).
- **Layer / shift-left:** flag route handler tests or React component tests that branch-cover business logic via repeated fixtures (more than ~3 tests differing only in input shape). The right fix is to extract the rule into a pure function/service and unit-test it directly; route/component tests should cover wiring (auth, request/response shape, error propagation, rendering), not internal business branching. See "Test layers and shift-left" in `.claude/skills/fxa-testing-shared/GUIDELINES.md`.
- For `*.test.tsx` files in `fxa-settings` and other React packages: also apply the React testing rules in `.claude/rules/testing/react.md` (queries by role, `userEvent` over `fireEvent`, no asserting on instances/refs, snapshot scope).
- **Layer / shift-left (golden goal):** flag route handler tests or React component tests that branch-cover business logic via repeated fixtures (more than ~3 tests differing only in input shape). The preferred fix is to extract the rule into a pure function/service and unit-test it directly; route/component tests should cover wiring (auth, request/response shape, error propagation, rendering), not internal business branching. Shifting left isn't required for every change, but always flag the opportunity. See "Shift-left is a golden goal" in `CLAUDE.md` Section 8.

Output JSON array with fields: severity, category ("Test Quality"), subcategory, file, line, issue, recommendation.

Expand Down
Loading