mozilla · nshirley · May 21, 2026
@@ -1,48 +1,17 @@
-# FXA Testing Guidelines
-
-Shared reference for `fxa-test-draft`, `fxa-test-repair`, `fxa-test-independence`, and the FXA review skills (`fxa-review`, `fxa-review-quick`). This document is the single source of truth for FXA test rules; the bullets in `CLAUDE.md` Section 8 mirror the rule set and ordering here. If the two diverge, treat this file as authoritative and update `CLAUDE.md`.
-
-For React/TSX tests in `fxa-settings`, also apply the React-specific testing patterns in `.claude/skills/fxa-check-react/SKILL.md` (querying by role, `userEvent` over `fireEvent`, avoiding refs/instances in assertions). The rules below apply to all FXA tests; the React skill covers the additional concerns specific to component tests.
-
-## Test layers and shift-left
-
-FXA has three test layers, each costlier than the last:
-
-- **Unit** (`*.spec.ts` / `*.test.ts` / `*.test.tsx`, `nx test-unit`) — pure functions, services, hooks, components in isolation. Milliseconds per test, no real I/O. The default place to test business rules.
-- **Integration** (`nx test-integration`) — wiring across units, often hitting real infrastructure (MySQL, Redis). Reserve for verifying the integration itself, not for branching business logic.
-- **Functional / E2E** (Playwright in `packages/functional-tests`) — user-facing flows end-to-end. Slowest and most brittle; reserve for critical happy paths and regressions.
-
-**Shift left.** When code has nuanced business rules — validation, calculation, normalization, decision logic — extract it into a pure function or service and test it directly. Route handlers and React components should be thin shells; their tests cover wiring (auth, request/response shape, error propagation), not every branch of the underlying rule. Each branch tested at the route/component level multiplies the cost (Hapi harness, mocked DB/Redis, React provider tree); the same branch as a unit test runs in milliseconds with no harness.
-
-```ts
-// Rule extracted into a pure function; cheap and exhaustive to unit-test
-export function calculateDiscount(plan: Plan, account: Account): number {
-  if (account.delinquentSince) return 0;
-  return plan.tier === 'enterprise' ? 0.2 : 0.1;
-}
-
-it.each([
-  { tier: 'free', expected: 0.1 },
-  { tier: 'enterprise', expected: 0.2 },
-])('returns a $expected discount for the $tier tier', ({ tier, expected }) => {...});
-it('returns 0 when the account is delinquent', () => {...});
+---
+paths:
+  - "**/*.test.ts"
+  - "**/*.test.tsx"
+  - "**/*.spec.ts"
+---
 
-// Route handler is a thin shell; route tests cover wiring only
-async function getDiscountHandler(request) {
-  const account = await db.getAccount(request.auth.credentials.uid);
-  return { discount: calculateDiscount(account.plan, account) };
-}
+# FXA Testing Rules
 
-it('returns 401 without a session token', () => {...});
-it('returns the discount in the response body', () => {...});
-```
+Applies whenever you read or edit a Jest test file in FXA. These are the same rules the testing skills (`/fxa-test-draft`, `/fxa-test-repair`, `/fxa-test-independence`) and the review skills (`/fxa-review`, `/fxa-review-quick`) enforce — load them on every test-file touch so they apply to ad-hoc edits, not only skill-invoked workflows.
 
-**Anti-patterns.**
-- Hapi route tests covering every business-logic branch via per-fixture `server.inject()` calls — extract the rule and unit-test it.
-- Component tests building provider trees to verify validation or formatters that could be pure-function / hook unit tests.
-- Playwright tests asserting on copy or branch outcomes outside a critical user flow.
+For React component tests (`*.test.tsx`), additional React-specific rules load from `.claude/rules/testing/react.md`.
 
-**Signal to extract.** When a single route or component has more than ~3 tests differing only in input shape, the underlying logic should likely be extracted and unit-tested directly.
+**Shift-left is a golden goal**, not a hard rule. The full guidance — which layer to test at, when to extract business logic into a pure function or hook — lives in `CLAUDE.md` Section 8 and loads at session start. While editing a test file, the practical heuristic to keep in mind: when a route or component has more than ~3 tests differing only in input shape, flag the opportunity to extract the rule and unit-test it directly.
 
 ---
 
@@ -128,7 +97,7 @@ it('throws NotFound when the account does not exist', async () => {
 it('wraps a DB connection failure in AppError.backendServiceFailure', async () => {
   db.getAccountByEmail.mockRejectedValue(new Error('ECONNREFUSED'));
   await expect(getAccount('user@example.com')).rejects.toThrow(
-    AppError.backendServiceFailure() // Leverage the actual AppError thrown instead of recreating the error object
+    AppError.backendServiceFailure()
   );
 });
 ```
@@ -251,7 +220,7 @@ beforeEach(() => { account = { ...MOCK_ACCOUNT }; });
 
 // Violation — band-aid mid-test reset hides state leaking in from another test
 it('sends one email per signup', () => {
-  jest.clearAllMocks(); // ⚠ why is this needed here? root cause is elsewhere
+  jest.clearAllMocks(); // why is this needed here? root cause is elsewhere
   signup(MOCK_ACCOUNT);
   expect(mailer.send).toHaveBeenCalledTimes(1);
 });
@@ -375,7 +344,7 @@ When adding tests to an existing file, write new tests to these standards. Don't
 
 **Focused tests must never be committed.** `it.only`, `describe.only`, `fit`, and `fdescribe` cause Jest to silently skip everything else in the file or suite, shrinking the safety net to a single test. A passing CI run on one focused test gives false confidence. These are debugging tools — strip them before commit.
 
-**Skipped tests are acceptable when justified.** `it.skip`, `xit`, `xdescribe`, and `it.todo` are fine in committed code *only* when accompanied by a comment explaining **why** the test is skipped. Prefer a follow-up reference (Jira ticket, GitHub issue) so the skip is trackable; at minimum, a short rationale that lets a future reader decide whether the skip is still warranted. An unjustified `.skip` rots into permanent dead weight — if there's no plan to re-enable it and no reason worth recording, delete the test instead.
+**Skipped tests are acceptable when justified.** `it.skip`, `xit`, `xdescribe`, and `it.todo` are fine in committed code *only* when accompanied by a comment explaining **why** the test is skipped. Prefer a follow-up reference (Jira ticket, GitHub issue) so the skip is trackable; at minimum, a short rationale that lets a future reader decide whether the skip is still warranted. An unjustified `.skip` rots into permanent dead weight — if there's no plan to re-enable it and no reason worth recording, delete the test instead. Suggest creating a ticket to track the work if there is no existing one.
 
 ```ts
 // Violation — focused tests, never commit

@@ -0,0 +1,123 @@
+---
+paths:
+  - "**/*.test.tsx"
+---
+
+# FXA React Test Rules
+
+Applies whenever you read or edit a React component test (`*.test.tsx`) anywhere in FXA. Stacks on top of `.claude/rules/testing/base.md`, which covers the general FXA testing rules (naming, mocking, isolation, async, assertions). This file adds the rules specific to component tests.
+
+---
+
+## 1. Query like a user, not by implementation detail
+
+Tests should select elements the way a user finds them. Reach for Testing Library queries that mirror screen-reader / keyboard / visual cues, in this order of preference:
+
+1. `getByRole(name)` — matches accessible role and accessible name
+2. `getByLabelText` — form inputs
+3. `getByText` / `getByPlaceholderText` / `getByDisplayValue`
+4. `getByTestId` — last resort, only when no semantic query fits
+
+**Avoid:** `className` lookups, raw CSS selectors, refs, component instances. These bind tests to the DOM structure rather than the user-facing contract.
+
+```tsx
+// Violation — couples test to class name; brittle to CSS refactors
+const button = container.querySelector('.submit-btn');
+
+// Violation — testid used when a semantic query fits
+const button = screen.getByTestId('submit-button');
+
+// Correct — accessible role + name; mirrors how a user (or AT) finds the button
+const button = screen.getByRole('button', { name: /sign in/i });
+```
+
+---
+
+## 2. Use `userEvent`, not `fireEvent`
+
+`userEvent` dispatches the full event sequence a real user produces (focus → keydown → input → change → blur). `fireEvent` fires a single synthetic event and misses bugs that depend on the surrounding sequence — focus traps, controlled-input handlers that read from `event.target`, debounced inputs, etc.
+
+```tsx
+// Violation — single synthetic event; misses focus / keydown / blur side effects
+fireEvent.change(input, { target: { value: 'user@example.com' } });
+fireEvent.click(submitButton);
+
+// Correct — full interaction sequence
+const user = userEvent.setup();
+await user.type(input, 'user@example.com');
+await user.click(submitButton);
+```
+
+---
+
+## 3. Assert on rendered output, not internals
+
+Component tests should verify what the user sees and can interact with — DOM content, accessible names, ARIA states, presence/absence of elements. Don't reach into `instance()`, `state()`, refs, or private hooks.
+
+```tsx
+// Violation — reads private internal state via an Enzyme-style escape hatch
+expect(wrapper.instance().state.isSubmitting).toBe(true);
+
+// Correct — asserts the user-observable contract
+expect(screen.getByRole('button', { name: /submit/i })).toBeDisabled();
+expect(screen.getByText(/submitting…/i)).toBeInTheDocument();
+```
+
+---
+
+## 4. Prefer `findBy*` / `waitFor` over manual `act()`
+
+Most async Testing Library helpers wrap `act()` internally. Reach for them before wrapping state updates in `act` yourself. Manual `act()` is appropriate only when no async helper fits the flow.
+
+```tsx
+// Violation — manual act() where findBy would do
+act(() => {
+  fireEvent.click(button);
+});
+const message = screen.getByText(/saved/i); // may not be there yet
+
+// Correct — findBy waits and acts internally
+await user.click(button);
+const message = await screen.findByText(/saved/i);
+```
+
+Un-`act`'d state updates produce console warnings and flake — treat the warning as a real failure signal, don't silence it.
+
+---
+
+## 5. Avoid whole-tree snapshots for non-trivial components
+
+Large `toMatchSnapshot()` outputs get regenerated without scrutiny on every change and stop catching regressions. Assert on specific user-observable properties instead: text content, ARIA attributes, presence of specific elements.
+
+Snapshots are still fine for **small, stable, hard-to-articulate output** — e.g., a serialized error message, a generated query string, an SVG path. Pick that case deliberately.
+
+```tsx
+// Violation — broad snapshot; reviewers rubber-stamp regenerations
+expect(container).toMatchSnapshot();
+
+// Correct — assert what the test actually cares about
+expect(screen.getByRole('heading', { name: /welcome/i })).toBeInTheDocument();
+expect(screen.getByRole('button', { name: /continue/i })).toBeEnabled();
+```
+
+---
+
+## 6. Extract provider boilerplate
+
+If every test in the suite wraps the component in the same chain (`MockedProvider`, `IntlProvider`, Router, theme), extract a `renderWithProviders(ui, options)` helper in a test util and reuse it. Inline duplication invites drift — one test gets a tweaked provider config, then assertions diverge for reasons unrelated to the component.
+
+```tsx
+// Violation — same provider tree, copy-pasted across every test
+render(
+  <MockedProvider mocks={mocks}>
+    <IntlProvider locale="en">
+      <Router>
+        <MyComponent {...props} />
+      </Router>
+    </IntlProvider>
+  </MockedProvider>
+);
+
+// Correct — one helper, used everywhere
+renderWithProviders(<MyComponent {...props} />, { mocks });
+```
@@ -135,32 +135,7 @@ Focus on `.tsx`, `.ts`, `.jsx`, `.js` files. Read additional context from relate
 
 ### 5. Testing Patterns (React/TSX)
 
-These are the React-specific test concerns to apply on top of the general FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md`. Apply both: the shared rules cover assertions, mocking, isolation, and async correctness; this section covers what's unique to component tests.
-
-**Querying by implementation detail**
-- Tests selecting by `className`, CSS selectors, `data-testid` (when a semantic query exists), refs, or component instances
-- Recommendation: prefer Testing Library queries that mirror how users find elements — `getByRole` (with `name` option), `getByLabelText`, `getByText`. `data-testid` is a last resort when no semantic query fits.
-
-**`fireEvent` instead of `userEvent`**
-- `fireEvent.click(...)`, `fireEvent.change(...)` for user interactions
-- Recommendation: use `userEvent` (`@testing-library/user-event`); it dispatches the full event sequence (focus → keydown → input → change → blur) that real interactions produce, surfacing bugs `fireEvent` misses
-
-**Asserting on component internals**
-- Reading `instance()`, `state()`, refs, or private hooks via Enzyme-style or test-only escape hatches
-- Recommendation: assert on rendered output (DOM, accessible names, ARIA states) — the user-visible contract, not the implementation
-
-**Missing or misused `act()`**
-- State updates in tests not wrapped in `act` produce console warnings and can cause flakiness
-- Most async Testing Library helpers (`findBy*`, `waitFor`) wrap `act` internally — reach for them before manually wrapping
-- Recommendation: prefer `findBy*` / `waitFor` for async UI changes; only manually `act(() => ...)` when no helper fits
-
-**Snapshot tests for non-trivial components**
-- Large `toMatchSnapshot()` outputs that get regenerated without scrutiny on every change
-- Recommendation: assert on specific user-observable properties (text content, ARIA attributes, presence of specific elements) instead of whole-tree snapshots; reserve snapshots for small, stable, hard-to-articulate output (e.g., serialized error messages)
-
-**Provider boilerplate duplicated per test**
-- Each test re-wrapping the component in the same chain of `MockedProvider` / `IntlProvider` / Router providers
-- Recommendation: extract a `renderWithProviders(ui, options)` helper in a test util; reuse across the suite
+When reviewing `*.test.tsx` changes, apply the React testing rules in `.claude/rules/testing/react.md` in addition to the general FXA rules in `.claude/rules/testing/base.md`. Both rules auto-load when Claude reads a matching test file; this skill's pre-flight should explicitly `Read` them too so reviews cover test changes consistently. Headline triggers to flag: implementation-detail queries (className/CSS/data-testid where semantic queries fit), `fireEvent` where `userEvent` belongs, asserting on instances/refs/private state, missing or misused `act()`, whole-tree snapshots for non-trivial components, and duplicated provider boilerplate.
 
 ---
 

@@ -63,7 +63,7 @@ Evaluate the diff through these lenses, in order of priority:
 
 **4. Tests**
 
-Apply the FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md` (read it before reviewing test changes). Common quick-review triggers:
+Apply the FXA testing rules in `.claude/rules/testing/base.md` and (for `*.test.tsx` changes) `.claude/rules/testing/react.md` — read them before reviewing test changes. Common quick-review triggers:
 
 - New auth-server source files without co-located `*.spec.ts`; fxa-settings uses `*.test.tsx` convention
 - `jest.clearAllMocks()` in `beforeEach` — redundant only when the package's `jest.config.*` sets `clearMocks: true` (currently `fxa-auth-server` and `fxa-profile-server`); do not flag in other packages without checking their config
@@ -80,7 +80,7 @@ Apply the FXA testing rules in `.claude/skills/fxa-testing-shared/GUIDELINES.md`
 - Flag patterns likely to cause open handle warnings (unclosed connections, uncleared timers)
 - Flag missing `act()` wrapping in React test state updates
 - Committed `it.only`, `describe.only`, `fit`, `fdescribe` — silently skips the rest of the suite, never acceptable. `it.skip`, `xit`, `xdescribe`, `it.todo` — flag only when there is no comment explaining the skip; a justified skip (ideally with a Jira/issue reference) is acceptable
-- For `*.test.tsx` in `fxa-settings`: also apply `.claude/skills/fxa-check-react/SKILL.md` Section 5 (querying by role, `userEvent` over `fireEvent`, no asserting on refs/instances)
+- For `*.test.tsx` in `fxa-settings`: also apply `.claude/rules/testing/react.md` (querying by role, `userEvent` over `fireEvent`, no asserting on refs/instances)
 - Route/component tests with many fixtures-per-branch (>~3 tests differing only in input shape) — heavy branching at the route/component level signals the rule should be extracted and unit-tested directly; reserve route/component tests for wiring
 
 **5. Database Migrations**

@@ -105,7 +105,7 @@ Output JSON array with fields: severity, category ("Logic/Bugs"), subcategory, f
 - model: opus
 - description: FXA test quality review
 
-Tell this agent it is a QA engineer who understands FXA's testing patterns and common flakiness causes. Tell it to read `.claude/skills/fxa-testing-shared/GUIDELINES.md` before reviewing — that file is the authoritative rule set; the bullets below are the FXA-specific triggers to apply on top of it.
+Tell this agent it is a QA engineer who understands FXA's testing patterns and common flakiness causes. Tell it to read `.claude/rules/testing/base.md` before reviewing (and `.claude/rules/testing/react.md` if reviewing any `*.test.tsx`) — those rules are the authoritative rule set; the bullets below are the FXA-specific triggers to apply on top of them.
 
 - Check new auth-server source files have co-located `*.spec.ts`; fxa-settings uses `*.test.tsx` convention
 - Flag new Mocha tests — all new tests must be Jest
@@ -125,8 +125,8 @@ Tell this agent it is a QA engineer who understands FXA's testing patterns and c
 - Flag `forEach` / `for` loops with `expect` calls inside a single `it()` body — use `it.each` so each case is named and independently reported
 - Flag committed `it.only`, `describe.only`, `fit`, `fdescribe` (silently skips the rest of the suite — CRITICAL/HIGH if it lands in CI). For `it.skip`, `xit`, `xdescribe`, `it.todo`: only flag when there is no adjacent comment explaining the skip — a justified skip with a Jira/issue reference is acceptable per the guidelines
 - Flag when there is a jira ticket number in a test name. We do not put tickets in test names.
-- For `*.test.tsx` files in `fxa-settings` and other React packages: also apply the React testing patterns in `.claude/skills/fxa-check-react/SKILL.md` Section 5 (queries by role, `userEvent` over `fireEvent`, no asserting on instances/refs, snapshot scope).
-- **Layer / shift-left:** flag route handler tests or React component tests that branch-cover business logic via repeated fixtures (more than ~3 tests differing only in input shape). The right fix is to extract the rule into a pure function/service and unit-test it directly; route/component tests should cover wiring (auth, request/response shape, error propagation, rendering), not internal business branching. See "Test layers and shift-left" in `.claude/skills/fxa-testing-shared/GUIDELINES.md`.
+- For `*.test.tsx` files in `fxa-settings` and other React packages: also apply the React testing rules in `.claude/rules/testing/react.md` (queries by role, `userEvent` over `fireEvent`, no asserting on instances/refs, snapshot scope).
+- **Layer / shift-left (golden goal):** flag route handler tests or React component tests that branch-cover business logic via repeated fixtures (more than ~3 tests differing only in input shape). The preferred fix is to extract the rule into a pure function/service and unit-test it directly; route/component tests should cover wiring (auth, request/response shape, error propagation, rendering), not internal business branching. Shifting left isn't required for every change, but always flag the opportunity. See "Shift-left is a golden goal" in `CLAUDE.md` Section 8.
 
 Output JSON array with fields: severity, category ("Test Quality"), subcategory, file, line, issue, recommendation.