Skip to content

Conversation

@mldangelo
Copy link
Member

Summary

Follow-up improvements to PR #7157 (policy plugin display IDs):

Before (alphabetical sorting)

policy #1: "One"
policy #10: "Ten"
policy #12: "Twelve"
policy #2: "Two"

After (numeric sorting)

policy #1: "One"
policy #2: "Two"
policy #10: "Ten"
policy #12: "Twelve"

Test plan

  • All 28 new unit tests pass
  • Existing redteam/index.test.ts tests pass (108 tests)
  • Lint and format checks pass

🤖 Generated with Claude Code

mldangelo and others added 2 commits January 20, 2026 22:31
- Add extractSortKey() to parse language prefix/suffix and policy number
- Add sortReportIds() for numeric policy sorting (1,2,3... not 1,10,11,12,2...)
- Sort by: policy number (numeric) → base ID → language
- Add test verifying numeric sort order with 12 policies

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move report generation functions (getStatus, extractSortKey,
  sortReportIds, generateReport) to new src/redteam/report.ts
- Move plugin display functions (getPluginSeverity,
  getPluginBaseDisplayId) for better separation of concerns
- Add comprehensive unit tests for all extracted functions
- Improves testability and code organization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@use-tusk
Copy link
Contributor

use-tusk bot commented Jan 21, 2026

⏩ No test execution environment matched (d2e0323) View output ↗


View check history

Commit Status Output Created (UTC)
ece40e6 ⏩ No test execution environment matched Output Jan 21, 2026 3:34AM
d2e0323 ⏩ No test execution environment matched Output Jan 21, 2026 3:48AM

View output in GitHub ↗

@github-actions
Copy link
Contributor

github-actions bot commented Jan 21, 2026

🔄 Security Review ✅

No issues in new changes.

The incremental changes add afterEach(() => vi.resetAllMocks()) to two test files, improving test isolation and preventing mock leakage between tests. This is a good practice that addresses previous reviewer feedback.

PR Convention Check

✅ PR touches src/redteam/ and correctly uses (redteam) scope in title per THE REDTEAM RULE.


Last updated: 2026-01-21T03:50:00Z | Reviewing: d2e0323 (incremental from ece40e6)

Copy link
Contributor

@promptfoo-scanner promptfoo-scanner bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 All Clear

This PR refactors report generation functions into a dedicated module and adds numeric sorting for policy numbers. No LLM security vulnerabilities were identified in these changes.

Minimum severity threshold for this scan: 🟡 Medium | Learn more


Was this helpful?  👍 Yes  |  👎 No 

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

This pull request refactors report-generation utilities by extracting internal helper functions from src/redteam/index.ts into a new src/redteam/report.ts module. The extracted functions include getPluginSeverity, getPluginBaseDisplayId, getStatus, and generateReport. New sorting utilities (extractSortKey and sortReportIds) are introduced to enable deterministic report ordering, particularly for numeric policy number sorting. The index.ts file is updated to import these utilities from the new report module. Test coverage is added to verify the report generation, sorting, and status formatting logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the two main changes: extracting report functions to a dedicated module and implementing numeric sorting for policy numbers.
Description check ✅ Passed The description is directly related to the changeset, providing clear context about the refactoring, numeric sorting improvements, and test coverage with before/after examples.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@test/redteam/index.test.ts`:
- Around line 3068-3107: Add a top-level afterEach cleanup inside the
describe('synthesize') block to reset mocks after each test: locate the
describe('synthesize') containing the it(...) tests (including the 'should sort
report rows...' test) and insert afterEach(() => vi.resetAllMocks()) so all vi
mocks/spies are cleared between tests (ensuring Plugins.find, logger.info, and
other mocks are reset).

In `@test/redteam/report.test.ts`:
- Around line 1-13: Add explicit vitest imports and a teardown to reset mocks:
update the import line that currently pulls describe, expect, it from 'vitest'
to also import afterEach and vi, then add a top-level afterEach(() => {
vi.resetAllMocks(); }) to the test file so that mocks are reset between tests;
this change affects the test helpers (stripAnsi) and tests that call functions
like extractSortKey, generateReport, getPluginBaseDisplayId, getPluginSeverity,
getStatus, and sortReportIds.
🧹 Nitpick comments (1)
src/redteam/report.ts (1)

100-123: Verify ordering between policy and non‑policy IDs.

Current logic pushes all policy rows ahead of non-policy rows (because 0 is treated as “non-policy”). If you want to keep overall alphabetical ordering while still sorting policies numerically, consider only applying numeric comparison when both IDs are policy-numbered.

♻️ Optional adjustment to keep non-policy alphabetical order
 export function sortReportIds(a: string, b: string): number {
   const [aNum, aLang, aBase] = extractSortKey(a);
   const [bNum, bLang, bBase] = extractSortKey(b);

-  // First sort by policy number (0 means no policy number, goes last among policies)
-  if (aNum !== bNum) {
-    if (aNum === 0) {
-      return 1;
-    }
-    if (bNum === 0) {
-      return -1;
-    }
-    return aNum - bNum;
-  }
+  // Only apply numeric ordering when both are policy-numbered
+  if (aNum > 0 && bNum > 0 && aNum !== bNum) {
+    return aNum - bNum;
+  }

Comment on lines +3068 to +3107
it('should sort report rows numerically by policy number, not alphabetically', async () => {
const mockPluginAction = vi.fn().mockResolvedValue([{ vars: { query: 'test' } }]);
vi.spyOn(Plugins, 'find').mockReturnValue({
action: mockPluginAction,
key: 'policy',
});

// Create 12 policy plugins to test numeric sorting (1, 2, 3... vs 1, 10, 11, 12, 2...)
const plugins = Array.from({ length: 12 }, (_, i) => ({
id: 'policy',
numTests: 1,
config: { policy: `Policy ${i + 1}` },
}));

await synthesize({
numTests: 1,
plugins,
prompts: ['Test prompt'],
strategies: [],
targetIds: ['test-provider'],
});

const reportMessage = vi
.mocked(logger.info)
.mock.calls.map(([arg]) => arg)
.find(
(arg): arg is string => typeof arg === 'string' && arg.includes('Test Generation Report'),
);

expect(reportMessage).toBeDefined();
const cleanReport = stripAnsi(reportMessage || '');

// Extract the order of policy numbers from the report
const policyMatches = cleanReport.match(/policy #(\d+)/g) || [];
const policyNumbers = policyMatches.map((m) => parseInt(m.replace('policy #', ''), 10));

// Should be sorted numerically: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
// NOT alphabetically: 1, 10, 11, 12, 2, 3, 4, 5, 6, 7, 8, 9
expect(policyNumbers).toEqual([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check the overall structure of the test file and search for existing afterEach calls
rg -n "afterEach|describe\(" test/redteam/index.test.ts | head -20

Repository: promptfoo/promptfoo

Length of output: 1020


🏁 Script executed:

# Also check specifically around the lines mentioned in the review (3068-3107) and nearby describe blocks
sed -n '3000,3150p' test/redteam/index.test.ts | grep -n "describe\|afterEach\|it(" | head -30

Repository: promptfoo/promptfoo

Length of output: 256


🏁 Script executed:

# Check if the top-level describe('synthesize') has an afterEach cleanup
# First, find where the top-level 'synthesize' describe starts and where nested describes are
sed -n '58,100p' test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 1383


🏁 Script executed:

# Check for afterEach within the synthesize describe block (which spans many lines)
# Let's search for afterEach specifically in the context of the synthesize describe
rg -B2 -A2 "afterEach" test/redteam/index.test.ts | head -40

Repository: promptfoo/promptfoo

Length of output: 374


🏁 Script executed:

# Let's see the full scope and location of the existing afterEach
# Find the afterEach and check its nesting level and which describe block it belongs to
rg -n "afterEach" test/redteam/index.test.ts -B5 -A5

Repository: promptfoo/promptfoo

Length of output: 937


🏁 Script executed:

# Also check the line numbers around the test mentioned (3068-3107) and see which describe it's in
sed -n '3050,3120p' test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 2451


🏁 Script executed:

# Let's confirm which describe block contains the test at lines 3068-3107
# by checking what's nested at that location
sed -n '2460,2500p' test/redteam/index.test.ts | grep -E "describe|^\s*\};"

Repository: promptfoo/promptfoo

Length of output: 113


🏁 Script executed:

# Count the indentation to understand nesting around line 3068
sed -n '3040,3110p' test/redteam/index.test.ts | head -70

Repository: promptfoo/promptfoo

Length of output: 2775


🏁 Script executed:

# Find the exact line numbers and content of afterEach in the top-level synthesize describe
sed -n '58,130p' test/redteam/index.test.ts | grep -A2 -B2 "afterEach"

Repository: promptfoo/promptfoo

Length of output: 45


🏁 Script executed:

# Verify the overall nesting - check where describe('synthesize' ends
# by looking for the closing brace at the appropriate indentation level
tail -50 test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 2007


🏁 Script executed:

# Check the entire synthesize describe block structure more carefully
# Look for afterEach between line 58 and where nested describes start
awk '/^describe\(\x27synthesize\x27/,/^describe\(|^})\)/ {print NR": "$0}' test/redteam/index.test.ts | head -100

Repository: promptfoo/promptfoo

Length of output: 97


🏁 Script executed:

# Simpler approach: get line numbers for all describe blocks and afterEach in the file
grep -n "^describe\|^  afterEach\|^  beforeEach\|^  afterAll" test/redteam/index.test.ts | head -30

Repository: promptfoo/promptfoo

Length of output: 395


🏁 Script executed:

# Let's verify the exact structure: look at what happens after line 69 to see if there's an afterEach in the top-level synthesize
sed -n '58,120p' test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 2062


🏁 Script executed:

# Continue from line 120 to find if there's an afterEach in the top-level synthesize describe
sed -n '110,200p' test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 2898


🏁 Script executed:

# Definitive check: grep for afterEach in the entire file and show context
grep -n "afterEach" test/redteam/index.test.ts

Repository: promptfoo/promptfoo

Length of output: 173


🏁 Script executed:

# Also check the structure right after the beforeEach ends to confirm no afterEach at top level
sed -n '100,130p' test/redteam/index.test.ts | cat -n

Repository: promptfoo/promptfoo

Length of output: 1355


Add required afterEach mock cleanup to top-level describe('synthesize').

The test file lacks the mandated afterEach(() => vi.resetAllMocks()) at the top-level describe scope. Per coding guidelines, all tests must clean up mocks after each test.

Suggested fix
describe('synthesize', () => {
  const mockProvider = {
    callApi: vi.fn(),
    generate: vi.fn(),
    id: () => 'test-provider',
  };

  afterAll(() => {
    vi.restoreAllMocks();
  });

  beforeEach(() => {
    vi.clearAllMocks();
    vi.resetAllMocks();
    // ... rest of setup
  });

+  afterEach(() => {
+    vi.resetAllMocks();
+  });
🤖 Prompt for AI Agents
In `@test/redteam/index.test.ts` around lines 3068 - 3107, Add a top-level
afterEach cleanup inside the describe('synthesize') block to reset mocks after
each test: locate the describe('synthesize') containing the it(...) tests
(including the 'should sort report rows...' test) and insert afterEach(() =>
vi.resetAllMocks()) so all vi mocks/spies are cleared between tests (ensuring
Plugins.find, logger.info, and other mocks are reset).

Comment on lines 1 to 13
import { describe, expect, it } from 'vitest';
import { Severity } from '../../src/redteam/constants';
import {
extractSortKey,
generateReport,
getPluginBaseDisplayId,
getPluginSeverity,
getStatus,
sortReportIds,
} from '../../src/redteam/report';

// Strip ANSI codes for easier testing
const stripAnsi = (str: string) => str.replace(/\x1B\[[0-9;]*[a-zA-Z]/g, '');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check vitest configuration
find . -name "vitest.config.*" -o -name "vitest.setup.*" | head -20

Repository: promptfoo/promptfoo

Length of output: 168


🏁 Script executed:

# Review existing test files in test/ directory to see current patterns
fd -e test.ts$ test/ | head -10 | xargs -I {} head -15 {}

Repository: promptfoo/promptfoo

Length of output: 388


🏁 Script executed:

# Check for any vitest globals configuration in package.json or tsconfig
fd -e json . -path "*/package.json" -o -path "*/vitest*" | head -10 | xargs -I {} grep -l "globals\|vitest" {}

Repository: promptfoo/promptfoo

Length of output: 231


🏁 Script executed:

# Read main vitest config
cat vitest.config.ts

Repository: promptfoo/promptfoo

Length of output: 1763


🏁 Script executed:

# Read vitest setup
cat vitest.setup.ts

Repository: promptfoo/promptfoo

Length of output: 1796


🏁 Script executed:

# Check existing test files in test/ directory for import patterns
fd . test/ --extension test.ts | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && head -20 {}'

Repository: promptfoo/promptfoo

Length of output: 3666


🏁 Script executed:

# Read the full report.test.ts file to see if vi or mocks are used
wc -l test/redteam/report.test.ts && cat test/redteam/report.test.ts

Repository: promptfoo/promptfoo

Length of output: 9119


Add explicit vitest imports and cleanup as required by test guidelines.

Tests must import afterEach and vi from vitest and run vi.resetAllMocks() in afterEach to maintain consistency across the test suite, even when mocks aren't currently used.

Suggested fix
-import { describe, expect, it } from 'vitest';
+import { afterEach, describe, expect, it, vi } from 'vitest';
 import { Severity } from '../../src/redteam/constants';
 import {
   extractSortKey,
   generateReport,
   getPluginBaseDisplayId,
   getPluginSeverity,
   getStatus,
   sortReportIds,
 } from '../../src/redteam/report';
 
 // Strip ANSI codes for easier testing
 const stripAnsi = (str: string) => str.replace(/\x1B\[[0-9;]*[a-zA-Z]/g, '');
+
+afterEach(() => {
+  vi.resetAllMocks();
+});
🤖 Prompt for AI Agents
In `@test/redteam/report.test.ts` around lines 1 - 13, Add explicit vitest imports
and a teardown to reset mocks: update the import line that currently pulls
describe, expect, it from 'vitest' to also import afterEach and vi, then add a
top-level afterEach(() => { vi.resetAllMocks(); }) to the test file so that
mocks are reset between tests; this change affects the test helpers (stripAnsi)
and tests that call functions like extractSortKey, generateReport,
getPluginBaseDisplayId, getPluginSeverity, getStatus, and sortReportIds.

Address CodeRabbit review feedback:
- Add afterEach(() => vi.resetAllMocks()) to report.test.ts
- Add afterEach mock cleanup to synthesize describe block in index.test.ts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants