
test(chat): add e2e tests for delete all chats button visibility #1113

Open
fbricon wants to merge 2 commits into kortex-hub:main from fbricon:GH-477-e2e-tests

Conversation

Contributor

@fbricon fbricon commented Mar 17, 2026

Separate e2e tests for "delete all chats" button visibility (following up on #1106), as requested by @serbangeorge-m.

Originally included the #1106 commits to keep the PR testable locally; now rebased against main, which already contains #1106.

@fbricon fbricon requested a review from a team as a code owner March 17, 2026 12:14
@fbricon fbricon requested review from bmahabirbu and gastoner and removed request for a team March 17, 2026 12:14

coderabbitai bot commented Mar 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Playwright chat tests and page model updates: added getChatModelIndices, tightened sendMessage typing, expanded chat creation/deletion and model-selection flows, adjusted timeouts/visibility checks, and added high-count chat viewport assertions and related test scaffolding.

Changes

  • Chat smoke tests (tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts): Expanded smoke tests (CHAT-03..CHAT-11): stronger preconditions for chat deletion, ensure a minimum chat count, create many chats for viewport assertions, iterate model selection via indices, update timeouts/visibility expectations, and assert chat-view reset and delete-all button visibility/in-viewport state.
  • Chat page model (tests/playwright/src/model/pages/chat-page.ts): Added getChatModelIndices() to enumerate non-embed models from the model dropdown, tightened/annotated method signatures (typed options for sendMessage, explicit numeric timeout params), and adjusted timeout defaults for several verification helpers.
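The embed-filtering behavior that getChatModelIndices introduces can be sketched standalone; the helper below is an illustration over made-up model names, not the PR's actual page-object code, which works against Playwright locators.

```typescript
// Standalone sketch of the "skip embedding models" filter behind
// getChatModelIndices; model names below are illustrative only.
function chatModelIndices(modelNames: (string | null)[]): number[] {
  const indices: number[] = [];
  modelNames.forEach((name, i) => {
    // Embedding models do not support chat completions, so skip them.
    if (name && !name.toLowerCase().includes('embed')) {
      indices.push(i);
    }
  });
  return indices;
}

// Only the chat-capable models at indices 0 and 3 survive the filter.
console.log(chatModelIndices(['llama3', 'nomic-embed-text', null, 'mistral']));
```

The same index-based approach lets tests iterate model selection without depending on which models happen to be installed locally.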

Sequence Diagram(s)

(Skipped — changes are test/helpers-focused and do not introduce a new multi-component runtime feature requiring sequence visualization.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: Passed. The pull request title clearly and concisely describes the main change (adding e2e tests for delete all chats button visibility), which aligns with the changeset's focus on chat test modifications and new test coverage.
  • Description check: Passed. The description directly relates to the changeset by explaining the purpose of the new tests as a follow-up to #1106 and clarifying the rebase against main, which matches the test additions and chat page modifications evident in the files.
  • Docstring Coverage: Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/renderer/src/lib/chat/components/sidebar-history/history.svelte`:
- Around line 112-115: The UI currently resets state (currentChatId.value =
undefined) and navigates (router.goto('/')) before the delete-all promise
resolves; move those two lines so they execute only after the delete operation
completes successfully — e.g., await the deleteAll() call or place the lines in
the .then() handler of the delete-all promise tied to the delete action, and
keep any error handling/path for the rejected promise to avoid resetting state
on failure; reference currentChatId.value and router.goto('/') when making this
change.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts`:
- Around line 300-317: The test is over-coupled to model latency and asserts an
absolute count; change it to record the initial chat count via
chatPage.getChatHistoryCount(), create a much smaller number of new chats (e.g.
2–3) using chatPage.clickNewChat(), send messages with chatPage.sendMessage()
but avoid waiting for 20 full model responses (reduce or remove the expect.poll
against TIMEOUTS.MODEL_RESPONSE for each message), and finally assert that
chatPage.getChatHistoryCount() increased by the number of chats you created
(compare newCount >= initialCount + chatsCreated) instead of >= chatsToCreate;
keep references to chatPage.verifyModelConversationMessage only where necessary
for a single smoke response check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f69ce754-4ab0-4fe8-84a1-e99c08953070

📥 Commits

Reviewing files that changed from the base of the PR and between 5b6bbf8 and e430375.

📒 Files selected for processing (2)
  • packages/renderer/src/lib/chat/components/sidebar-history/history.svelte
  • tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts

Comment on lines +112 to +115

if (chatId) {
  currentChatId.value = undefined;
  router.goto('/');
}

⚠️ Potential issue | 🟠 Major

Move route/state reset to successful delete completion.

currentChatId.value = undefined and router.goto('/') run before the delete-all promise resolves, so UI resets even on backend failure. This should run only after successful deletion.

💡 Proposed fix
 async function handleDeleteAllChats(): Promise<void> {
   const deleteAllPromise = window.inferenceDeleteAllChats();

   toast.promise(deleteAllPromise, {
     loading: 'Deleting all chats...',
     success: () => {
       chatHistory.refetch().catch((err: unknown) => {
         console.error('Failed to refetch chat history', err);
       });
+      if (chatId) {
+        currentChatId.value = undefined;
+        router.goto('/');
+      }
       return 'All chats deleted successfully';
     },
     error: 'Failed to delete all chats',
   });
-
-  if (chatId) {
-    currentChatId.value = undefined;
-    router.goto('/');
-  }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async function handleDeleteAllChats(): Promise<void> {
  const deleteAllPromise = window.inferenceDeleteAllChats();

  toast.promise(deleteAllPromise, {
    loading: 'Deleting all chats...',
    success: () => {
      chatHistory.refetch().catch((err: unknown) => {
        console.error('Failed to refetch chat history', err);
      });
      if (chatId) {
        currentChatId.value = undefined;
        router.goto('/');
      }
      return 'All chats deleted successfully';
    },
    error: 'Failed to delete all chats',
  });
}


codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
packages/renderer/src/lib/chat/components/sidebar-history/history.svelte (1)

112-115: ⚠️ Potential issue | 🟠 Major

Move route/store reset to delete-all success path.

At Line 112-Line 115, currentChatId.value reset and router.goto('/') still run before delete-all completion. If backend deletion fails, UI/state is still reset.

Suggested fix
 async function handleDeleteAllChats(): Promise<void> {
   const deleteAllPromise = window.inferenceDeleteAllChats();

   toast.promise(deleteAllPromise, {
     loading: 'Deleting all chats...',
     success: () => {
       chatHistory.refetch().catch((err: unknown) => {
         console.error('Failed to refetch chat history', err);
       });
+      if (chatId) {
+        currentChatId.value = undefined;
+        router.goto('/');
+      }
       return 'All chats deleted successfully';
     },
     error: 'Failed to delete all chats',
   });
-
-  if (chatId) {
-    currentChatId.value = undefined;
-    router.goto('/');
-  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/renderer/src/lib/chat/components/sidebar-history/history.svelte`
around lines 112 - 115, The reset of currentChatId.value and router.goto('/') is
happening immediately instead of after the backend deletion completes; move
those two calls into the successful completion branch/then-block of the
delete-all operation (e.g., inside the promise resolution or success callback
for the function that performs the deletion such as deleteAllChats /
handleDeleteAll), and on failure keep state as-is and surface an error (e.g.,
via a toast/logger) so UI/state are not reset if the backend delete fails.
tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts (1)

300-317: ⚠️ Potential issue | 🟠 Major

CHAT-10 is still over-coupled to model latency and uses a weak absolute count assertion.

At Line 301-Line 317, creating 20 chats with per-chat model-response polling makes this smoke test slow/flaky, and >= chatsToCreate can pass from pre-existing history.

Suggested fix
 test('[CHAT-10] Delete all button remains visible without scrolling', async ({ chatPage }) => {
   await chatPage.ensureChatSidebarVisible();
+  const initialCount = await chatPage.getChatHistoryCount();

   // Create multiple chats to test scrolling scenario
-  const chatsToCreate = 20;
+  const chatsToCreate = 3;
   for (let i = 0; i < chatsToCreate; i++) {
     await chatPage.clickNewChat();
-    const expectedResponse = `RESPONSE ${i + 1}`;
-    await chatPage.sendMessage(`Reply "${expectedResponse}", nothing else.`, 100);
-    await expect
-      .poll(async () => await chatPage.verifyModelConversationMessage(expectedResponse), {
-        timeout: TIMEOUTS.MODEL_RESPONSE,
-        message: 'Model should respond with the expected response',
-      })
-      .toBeTruthy();
+    const message = `Layout visibility seed ${i + 1}`;
+    await chatPage.sendMessage(message, 0);
+    await chatPage.verifyConversationMessage(message);
   }

   // Wait for all chats to be created
   await expect
     .poll(async () => await chatPage.getChatHistoryCount(), { timeout: TIMEOUTS.MODEL_RESPONSE })
-    .toBeGreaterThanOrEqual(chatsToCreate);
+    .toBeGreaterThanOrEqual(initialCount + chatsToCreate);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts` around lines
300 - 317, The test is flaky and slow because it waits for model responses on
every chat and asserts an absolute history count; change to capture a baseline
via chatPage.getChatHistoryCount() once, then create chats without per-chat
MODEL_RESPONSE polling (use a lightweight UI-ready check like
chatPage.hasConversationItem or mock/stub the model response where possible) and
finally poll for chatPage.getChatHistoryCount() to be >= baseline +
chatsToCreate using TIMEOUTS.MODEL_RESPONSE; reference the methods clickNewChat,
sendMessage, verifyModelConversationMessage (remove/replace per-chat), and
getChatHistoryCount when implementing this.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d3531697-139f-4104-b037-3d09e036e99c

📥 Commits

Reviewing files that changed from the base of the PR and between e430375 and 0e6d9b8.

📒 Files selected for processing (2)
  • packages/renderer/src/lib/chat/components/sidebar-history/history.svelte
  • tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts

@gastoner gastoner requested a review from serbangeorge-m March 18, 2026 06:19
@fbricon fbricon force-pushed the GH-477-e2e-tests branch 2 times, most recently from b21a0c2 to 3dc7620 on March 18, 2026 at 13:44
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts (1)

303-320: ⚠️ Potential issue | 🟠 Major

CHAT-10 remains over-coupled to model latency and uses a weak absolute count assertion.

This is still creating 20 model-dependent chats and asserting >= chatsToCreate instead of growth from baseline. It can be slow/flaky and can pass even when this test adds fewer chats than intended.

💡 Focused fix
       await chatPage.ensureChatSidebarVisible();
+      const initialCount = await chatPage.getChatHistoryCount();

       // Create multiple chats to test scrolling scenario
-      const chatsToCreate = 20;
+      const chatsToCreate = 3;
       for (let i = 0; i < chatsToCreate; i++) {
         await chatPage.clickNewChat();
-        const expectedResponse = `RESPONSE ${i + 1}`;
-        await chatPage.sendMessage(`Reply "${expectedResponse}", nothing else.`, { timeout: 100 });
-        await expect
-          .poll(async () => await chatPage.verifyModelConversationMessage(expectedResponse), {
-            timeout: TIMEOUTS.MODEL_RESPONSE,
-            message: 'Model should respond with the expected response',
-          })
-          .toBeTruthy();
+        const message = `Layout visibility seed ${i + 1}`;
+        await chatPage.sendMessage(message, { waitForMessage: false });
+        await chatPage.verifyConversationMessage(message);
       }

       // Wait for all chats to be created
       await expect
         .poll(async () => await chatPage.getChatHistoryCount(), { timeout: TIMEOUTS.MODEL_RESPONSE })
-        .toBeGreaterThanOrEqual(chatsToCreate);
+        .toBeGreaterThanOrEqual(initialCount + chatsToCreate);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts` around lines
303 - 320, Capture the baseline chat count before creating new chats by calling
chatPage.getChatHistoryCount(), then create the chats (using
chatPage.clickNewChat() and chatPage.sendMessage()) but avoid coupling the
overall success to model latency: either stop waiting for the model response on
every iteration or keep the existing verifyModelConversationMessage() only for
content checks while not blocking the count assertion. Finally assert that
chatPage.getChatHistoryCount() is >= baseline + chatsToCreate (not just >=
chatsToCreate). Use the existing symbols getChatHistoryCount, clickNewChat,
sendMessage, and verifyModelConversationMessage to locate and update the logic.
🧹 Nitpick comments (2)
tests/playwright/src/model/pages/chat-page.ts (1)

216-229: Close the model dropdown in a finally block to avoid state leakage.

If any step fails between Line 217 and Line 226, Escape at Line 227 is skipped and the menu can stay open for subsequent steps.

💡 Proposed hardening
  async getChatModelIndices(): Promise<number[]> {
-    await this.modelDropdownSelector.click();
-    await expect(this.modelDropdownContent).toBeVisible();
-    const count = await this.modelMenuItems.count();
-    const indices: number[] = [];
-    for (let i = 0; i < count; i++) {
-      const text = await this.modelMenuItems.nth(i).textContent();
-      if (!text?.toLowerCase().includes('embed')) {
-        indices.push(i);
-      }
-    }
-    await this.page.keyboard.press('Escape');
-    return indices;
+    await this.modelDropdownSelector.click();
+    await expect(this.modelDropdownContent).toBeVisible();
+    try {
+      const count = await this.modelMenuItems.count();
+      const indices: number[] = [];
+      for (let i = 0; i < count; i++) {
+        const text = (await this.modelMenuItems.nth(i).textContent()) ?? '';
+        if (!text.toLowerCase().includes('embed')) {
+          indices.push(i);
+        }
+      }
+      return indices;
+    } finally {
+      await this.page.keyboard.press('Escape');
+    }
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/model/pages/chat-page.ts` around lines 216 - 229, The
model dropdown in getChatModelIndices can remain open if an error occurs before
the final Escape press; wrap the interaction so the Escape key is sent in a
finally block to guarantee the menu closes. Specifically, open the dropdown with
this.modelDropdownSelector.click(), perform the expect/iteration logic on
this.modelDropdownContent and this.modelMenuItems inside a try, collect indices
as currently done, and then always call this.page.keyboard.press('Escape') in a
finally block so the dropdown is closed even on failures.
tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts (1)

216-216: Use a UI-scoped timeout for button visibility checks.

Line 216 uses TIMEOUTS.MODEL_RESPONSE (LLM response budget) for a UI element visibility check, which can hide regressions behind very long waits.

💡 Suggested adjustment
-      await expect(chatPage.toolsSelectionButton).toBeVisible({ timeout: TIMEOUTS.MODEL_RESPONSE });
+      await expect(chatPage.toolsSelectionButton).toBeVisible({ timeout: TIMEOUTS.STANDARD });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts` at line 216,
The visibility check for chatPage.toolsSelectionButton is using
TIMEOUTS.MODEL_RESPONSE (LLM budget) which is too long for UI checks; change the
timeout to a UI-scoped constant (e.g., TIMEOUTS.UI or
TIMEOUTS.UI_INTERACTION/SHORT) in the await expect(...).toBeVisible call so the
test uses a short UI timeout for button visibility, and if that constant doesn’t
exist, add a suitable UI timeout to the TIMEOUTS object and use it in the check.
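The UI-versus-model budget split this nitpick argues for can be illustrated with a hypothetical TIMEOUTS object; the names and values below are assumptions for illustration, not the repository's actual constants.

```typescript
// Hypothetical timeout budgets (values are illustrative, not the repo's):
// UI checks should fail fast, while model responses get a long budget.
const TIMEOUTS = {
  STANDARD: 10_000,        // button visibility, clicks, navigation
  MODEL_RESPONSE: 120_000, // waiting on an LLM completion only
} as const;

// Picking the budget by what is actually being awaited keeps UI
// regressions from hiding behind a two-minute wait.
function budgetFor(kind: 'ui' | 'model'): number {
  return kind === 'ui' ? TIMEOUTS.STANDARD : TIMEOUTS.MODEL_RESPONSE;
}
```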
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts`:
- Around line 95-96: The implicit 100ms timeout passed to sendMessage is too
aggressive and causes flakes; remove the timeout argument (or set sendMessage to
not use an implicit short wait) so sendMessage uses the default timing, and rely
on verifyConversationMessage to assert presence; update the call to
chatPage.sendMessage(message) and ensure
chatPage.verifyConversationMessage(message) remains as the explicit check
(functions referenced: chatPage.sendMessage and
chatPage.verifyConversationMessage).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03f7d07d-e5fd-42f2-b88f-2f0e8cd53b3e

📥 Commits

Reviewing files that changed from the base of the PR and between b21a0c2 and 3dc7620.

📒 Files selected for processing (2)
  • tests/playwright/src/model/pages/chat-page.ts
  • tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (2)
tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts (2)

96-97: ⚠️ Potential issue | 🟡 Minor

timeout: 100 is still too aggressive for CI jitter.

At Line 96, this remains a flaky path; keep verifyConversationMessage as the explicit assertion and avoid the 100ms implicit wait.

💡 Proposed fix
-          await chatPage.sendMessage(message, { timeout: 100 });
+          await chatPage.sendMessage(message, { waitForMessage: false });
           await chatPage.verifyConversationMessage(message);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts` around lines 96
- 97, The 100ms implicit timeout on chatPage.sendMessage is too aggressive and
causes CI flakiness; remove the { timeout: 100 } argument from the sendMessage
call so the test relies on the explicit assertion
chatPage.verifyConversationMessage(message) (or replace with a reasonable larger
timeout if an explicit wait is required), ensuring verifyConversationMessage
remains the authoritative check for message delivery.

301-321: ⚠️ Potential issue | 🟠 Major

CHAT-10 is still over-coupled to model latency and uses a weak absolute count assertion.

This test still waits on 20 model responses and checks >= chatsToCreate, which can be slow/flaky and can pass with preexisting chats.

💡 Proposed fix
     test('[CHAT-10] Delete all button remains visible without scrolling', async ({ chatPage }) => {
       await chatPage.ensureChatSidebarVisible();
+      const initialCount = await chatPage.getChatHistoryCount();

       // Create multiple chats to test scrolling scenario
-      const chatsToCreate = 20;
+      const chatsToCreate = 3;
       for (let i = 0; i < chatsToCreate; i++) {
         await chatPage.clickNewChat();
-        const expectedResponse = `RESPONSE ${i + 1}`;
-        await chatPage.sendMessage(`Reply "${expectedResponse}", nothing else.`, { timeout: 100 });
-        await expect
-          .poll(async () => await chatPage.verifyModelConversationMessage(expectedResponse), {
-            timeout: TIMEOUTS.MODEL_RESPONSE,
-            message: 'Model should respond with the expected response',
-          })
-          .toBeTruthy();
+        const message = `Layout visibility seed ${i + 1}`;
+        await chatPage.sendMessage(message, { waitForMessage: false });
+        await chatPage.verifyConversationMessage(message);
       }

       // Wait for all chats to be created
       await expect
         .poll(async () => await chatPage.getChatHistoryCount(), { timeout: TIMEOUTS.MODEL_RESPONSE })
-        .toBeGreaterThanOrEqual(chatsToCreate);
+        .toBeGreaterThanOrEqual(initialCount + chatsToCreate);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts` around lines
301 - 321, The test is over-coupled to model latency by waiting for 20 model
responses and uses an absolute count that can include preexisting chats; fix it
by first reading the initial chat count via chatPage.getChatHistoryCount(),
reduce chatsToCreate to a small deterministic number (e.g., 3) or keep it
configurable, create chats by calling chatPage.clickNewChat() (and avoid sending
messages or waiting on verifyModelConversationMessage), then poll
getChatHistoryCount() until it equals initialCount + chatsToCreate (use the
existing TIMEOUTS.MODEL_RESPONSE for the poll) and assert that increase;
reference the test '[CHAT-10] Delete all button remains visible without
scrolling', chatPage.getChatHistoryCount, chatPage.clickNewChat,
chatPage.sendMessage, and chatPage.verifyModelConversationMessage when making
these edits.
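The weakness of the absolute count assertion flagged above can be shown with a tiny worked example; the numbers here are hypothetical.

```typescript
// Hypothetical scenario: 18 chats already exist, the test intends to
// create 20 new ones, but only 2 actually land, leaving 20 chats total.
const baseline = 18;
const intendedToCreate = 20;
const current = 20;

// Absolute check: passes even though only 2 of the 20 chats were created.
const absoluteCheck = current >= intendedToCreate; // true (false positive)

// Baseline-relative check: correctly fails until all new chats exist.
const baselineCheck = current >= baseline + intendedToCreate; // false
```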
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/playwright/src/model/pages/chat-page.ts`:
- Around line 216-229: getChatModelIndices currently pushes indices when text
may be null and only presses Escape at the end; modify getChatModelIndices to
explicitly guard against empty/null menu text by reading text via
modelMenuItems.nth(i).textContent() and only push i when text is truthy and
text.toLowerCase() does not include('embed'), and ensure the dropdown is always
closed by moving page.keyboard.press('Escape') into a finally block that runs
after the loop (so use try { ... } finally { await
this.page.keyboard.press('Escape'); }) while keeping interactions with
modelDropdownSelector and modelDropdownContent unchanged.
- Line 180: Remove the inferrable "number" type annotations from the timeout
parameters because TypeScript can infer their types from the default values;
update the signatures for waitForChatHistoryCount(timeout = 10_000),
verifyConversationMessage(timeout = TIMEOUTS.SHORT),
verifySendButtonVisible(timeout = 10_000), verifySendButtonHidden(timeout =
10_000), verifyStopButtonVisible(timeout = 10_000), and
verifyStopButtonHidden(timeout = 10_000) by deleting the ": number" portions
while leaving defaults and function bodies unchanged so they conform to the
noInferrableTypes rule.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4c3c76e2-7fc3-402d-b5a8-bef04b91ee2b

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc7620 and 78e7a95.

📒 Files selected for processing (2)
  • tests/playwright/src/model/pages/chat-page.ts
  • tests/playwright/src/specs/provider-specs/chat-smoke.spec.ts

fbricon and others added 2 commits March 18, 2026 16:14
… destructured parameters

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fred Bricon <fbricon@gmail.com>
Add tests to verify the delete all chats button visibility states:
- Button appears when chats exist and hides when all are deleted
- Button remains visible in viewport without scrolling with many chats
- Clean up existing chats before multi-session test to avoid flakiness

Also fix model selection test ordering and timeout adjustments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fred Bricon <fbricon@gmail.com>
@benoitf
Contributor

benoitf commented Mar 19, 2026

PTAL @serbangeorge-m

@serbangeorge-m
Contributor

@fbricon The getChatModelIndices() method and the "embed" filtering don't solve an actual problem. The actually meaningful changes in this PR are: CHAT-04 and CHAT-10.

return count;
}

async getChatModelIndices(): Promise<number[]> {
Contributor

I do not understand why we need this. What issue is it trying to solve? We already have a method handling the dropdown, used in multiple tests.

Contributor Author
@fbricon fbricon Mar 19, 2026

Because on my Ollama machine I have a bunch of embedding models that don't support chat and fail when called. Of course this doesn't happen on CI, but I need this to be able to run the tests locally. It wouldn't be an issue if model selection were deterministic: if a given set of models is a prerequisite for running the tests, then we could select those, and only those, in the tests.

Contributor

The tests are designed to detect Ollama models only when OLLAMA_ENABLED is set to true.

Contributor Author

Yes, I run them locally with OLLAMA_ENABLED=true pnpm run test:e2e --grep "CHAT-", and my embedding model is sometimes picked up.

Contributor

This is a local env issue, not a test issue. The tests are designed for CI, where only the correct model (granite3.2:2b) is provisioned.

Contributor Author

code like this shows the tests adapt to the environment they're running on:

const modelCount = await chatPage.getAvailableModelsCount();
if (modelCount < 2) {
  test.skip(true, 'Skipping test: Less than 2 models available');
  return;
}

If we can't run the e2e tests locally, then there's no incentive to write them, as the feedback loop waiting for CI to run is way too long.

Contributor

@serbangeorge-m I think it would be great if tests, especially smoke tests, could be more robust when running locally.

We're separating configuration files for testing, but here it's more about the Ollama instance. From the test POV I would say we don't care whether there are already existing models or not; we just want to be sure the tests are using our models.

I know that for containers, for example, we sometimes don't care about all the containers that might already be on a podman instance: we check that if we pull an image, that image is there, and that if we start a container, it's there.

So what would you recommend to make the tests more robust when running against an existing Ollama instance (which of course can already contain things)?

Contributor

@benoitf The tests are running OK locally. Rather than tweaking the tests to skip embedding models (which is not a problem at all in the test infrastructure), it would be better to fix this at the product level by filtering the model dropdown based on capabilities. The chat interface should only show models that actually support chat, no?

Contributor Author
@fbricon fbricon Mar 19, 2026

Ideally yes, but there's no surefire way to do it short of querying all the models one by one to see whether they contain a chat template, which would certainly slow things down. Even the Ollama client doesn't do it.


Checking for "embed" in the model name is a heuristic that's good enough for tests, but I wouldn't trust it for user-facing selection.

We can certainly look into it but I think this goes beyond this PR
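For reference, the heuristic under discussion boils down to something like this. The helper is hypothetical, not the actual chat-page code, and, as noted above, it is a test-only convenience rather than a reliable capability check:

```typescript
// Hypothetical sketch of the "embed" heuristic discussed above: keep only
// model names that look chat-capable by excluding embedding models.
// A name-based filter like this is a test-only convenience, not a real
// capability check.
function filterChatModels(modelNames: string[]): string[] {
  return modelNames.filter(name => !name.toLowerCase().includes('embed'));
}
```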

Contributor

@fbricon I am OK with the embed filtering in this case, but we should really avoid future test adaptations for different custom local environments. Also, I think it would be better to return the model names instead of indices, to avoid a potential race condition between the dropdown interactions; could you add a selectModelByName() method and use that instead?
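A selectModelByName() along those lines could look roughly like this. The DropdownLike interface is illustrative only; the real page object would use Playwright locators rather than this abstraction:

```typescript
// Illustrative sketch of name-based model selection: resolving the option
// by exact name at click time avoids the index-based race described in the
// review comment. DropdownLike is a hypothetical stand-in for the real
// page-object dropdown.
type DropdownLike = {
  getOptionNames(): Promise<string[]>;
  clickOption(index: number): Promise<void>;
};

async function selectModelByName(dropdown: DropdownLike, modelName: string): Promise<void> {
  const names = await dropdown.getOptionNames();
  const index = names.indexOf(modelName);
  if (index === -1) {
    throw new Error(`Model "${modelName}" not found among: ${names.join(', ')}`);
  }
  await dropdown.clickOption(index);
}
```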
