RDoc-3843_taskErrors - document new task errors views and functionality by reebhub · Pull Request #2450 · ravendb/docs

reebhub · 2026-05-19T01:54:23Z

Type of change

Changes in docs URLs

No changes in docs URLs
Articles are restructured, URLs will change, mapping is required (update /scripts/redirects.json file, set Documents Moved PR label)

Changes in UX/UI

No changes in UX/UI
Changes in UX/UI (include screenshots and description)

Lwiel · 2026-05-20T15:02:07Z

+* **Persistence** (AI tasks only)  
+  The task could not save its results back to the database. Typical causes include write
+  conflicts or storage errors.  


It also occurs when we fail to update the process state, so it's not limited to AI tasks.

Lwiel · 2026-05-20T15:04:14Z

+The retention is per task and per table, so a single noisy task cannot push errors out of an
+unrelated task. The cap is not configurable.  
+
+Errors are also included in the server's debug package as `etl.errors.json`, so support


AI task errors are stored separately

Lwiel · 2026-05-20T15:06:09Z

+
+A task recovers automatically as new batches complete. The health state transitions from
+`Failed` back to `Impaired`, and from `Impaired` back to `Healthy`, as the running error rate
+falls below each threshold. There is no manual "reset" action.  


Maybe it's worth noting we reset health state back to Healthy on task configuration update

Lwiel · 2026-05-20T15:07:25Z

+  `GET /databases/*/tasks/errors` returns errors across all ETL and AI tasks.  
+  `GET /databases/*/etl/errors` and `GET /databases/*/ai/errors` return errors per category.  
+  `DELETE` variants of each path remove errors in bulk, optionally filtered by task name or
+  category. For example, `DELETE /databases/*/etl/errors?name=<task-name>` clears the errors
+  of one specific ETL task.  
+  `POST /databases/*/etl/retry-batch` forces an immediate retry of an ETL task currently in
+  fallback mode.  
+  See [Debug Endpoints](../../server/troubleshooting/debug-routes.mdx#debug-endpoints) for the full reference.  


Can we make it a list?
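
The paths quoted in this thread can be sketched as a small URL builder. This is only an illustration of the documented path shapes: the server address, the database name, and the helper names `task_errors_url` and `retry_batch_url` are hypothetical, and no request is sent. The `name` query parameter is taken from the `DELETE /databases/*/etl/errors?name=<task-name>` example; any parameters the retry endpoint may accept are not stated in the source and are left out.

```python
from urllib.parse import quote, urlencode

# Categories that appear in the documented paths.
CATEGORIES = ("tasks", "etl", "ai")


def task_errors_url(server, database, category="tasks", task_name=None):
    """Build the errors URL for GET or DELETE (hypothetical helper)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    base = server.rstrip("/")
    url = f"{base}/databases/{quote(database, safe='')}/{category}/errors"
    if task_name is not None:
        # Filter by task name, as in the DELETE example from the docs text.
        url += "?" + urlencode({"name": task_name})
    return url


def retry_batch_url(server, database):
    """Build the URL for POST /databases/*/etl/retry-batch (hypothetical helper)."""
    return f"{server.rstrip('/')}/databases/{quote(database, safe='')}/etl/retry-batch"
```

The resulting strings can be passed to any HTTP client together with the matching verb (`GET`, `DELETE`, or `POST`).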

Lwiel · 2026-05-20T15:12:38Z

+
+<ContentFrame>
+
+### Task health indicators


Maybe let's mention that only the node the task is currently on and the nodes that contain any errors are displayed here

Lwiel · 2026-05-22T11:16:56Z

+* Retention is per task and per table, so a single noisy task cannot push errors out of
+  an unrelated task.  
+
+* Errors are also included in the server's debug package as `etl.errors.json`, so


There's a separate JSON file with AI task errors

arekpalinski · 2026-05-25T12:24:47Z

+* **Item error**  
+  An error that occurred while processing a single document. The document was skipped and the
+  task moved on to the remaining documents in the batch. The error record includes the
+  document ID.  


It's worth adding that an item error on a given document means the doc is skipped and the process continues to move forward.

arekpalinski · 2026-05-25T12:25:50Z

+* **Process error**  
+  An error that occurred while processing a batch as a whole and may affect multiple documents,
+  such as a failure to send the batch to its destination. The error record includes the number
+  of documents the failing batch attempted to handle.  


A process error, on the other hand, means the process will continue to retry the batch until it succeeds (with a fallback strategy)
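
The difference between the two error kinds described in these two threads can be sketched as follows. This is a simplified illustration, not RavenDB's implementation: `run_batch`, `transform`, and `send` are hypothetical names, the retry loop is capped at `max_attempts` only so the example terminates, and the fallback strategy mentioned above is not modeled.

```python
def run_batch(docs, transform, send, max_attempts=3):
    """Process one batch of (doc_id, doc) pairs and collect both error kinds."""
    item_errors = []
    process_errors = []
    transformed = []

    for doc_id, doc in docs:
        try:
            transformed.append(transform(doc))
        except Exception as exc:
            # Item error: record the document ID, skip the document, move on.
            item_errors.append({"document_id": doc_id, "error": str(exc)})

    for _ in range(max_attempts):
        try:
            send(transformed)
            break
        except Exception as exc:
            # Process error: the batch as a whole failed, so it is retried.
            # The record carries the number of documents the batch attempted.
            process_errors.append(
                {"documents_in_batch": len(docs), "error": str(exc)}
            )

    return item_errors, process_errors
```

A failing document therefore never blocks the rest of the batch, while a failing send keeps the same batch in play until it goes through.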

arekpalinski · 2026-05-25T12:26:35Z

+
+* **Persistence**  
+  The task could not save its results back to the database, or could not update its own
+  process state. Typical causes include write conflicts or storage errors.  


Typical causes include write conflicts or storage errors.

Write conflicts? @Lwiel please take a look

arekpalinski · 2026-05-25T12:27:54Z

+
+* Each task keeps two dedicated tables on disk: one for item errors and one for process
+  errors.  
+  ETL and AI task errors are kept in separate storage and don't share these tables.  


More precisely, each ETL or AI task keeps its errors in separate tables
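
The per-task, per-table storage and retention discussed in this review can be pictured with a small in-memory model. It is a sketch under assumptions: `ErrorStore` is a hypothetical class, the real tables live on disk, and the cap of 3 is chosen only to keep the example short, since the source does not state the actual value.

```python
from collections import defaultdict, deque

CAP = 3  # hypothetical cap; the real value is not given in the source


class ErrorStore:
    """Keeps a bounded error list per (task, table) pair."""

    def __init__(self, cap=CAP):
        # Each (task, table) key gets its own bounded queue, so one task
        # filling its queue cannot evict entries that belong to another.
        self._tables = defaultdict(lambda: deque(maxlen=cap))

    def add(self, task, table, error):
        self._tables[(task, table)].append(error)

    def errors(self, task, table):
        return list(self._tables[(task, table)])
```

With this layout a noisy task only ever overwrites its own oldest entries, which is the isolation property the docs text describes.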

arekpalinski · 2026-05-25T12:28:43Z

+* Retention is per task and per table, so a single noisy task cannot push errors out of
+  an unrelated task.  
+
+* Errors are also included in the server's debug package as `etl.errors.json` (for


Is it worth mentioning? It's very detailed info about Debug Package

arekpalinski · 2026-05-25T12:30:31Z

+
+RavenDB watches the ratio between a task's failed items and the total number of items the
+task has attempted to process. The ratio is computed as an EWMA (Exponentially Weighted
+Moving Average) and is updated continuously as new batches complete.  


Is it worth adding that it's a time-agnostic EWMA? @Lwiel
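
A per-batch (time-agnostic) EWMA of the error ratio, together with the health transitions quoted earlier in this review, might look like the sketch below. The smoothing factor and both thresholds are hypothetical, as the source does not state them, and `TaskHealth` is an illustrative name rather than a RavenDB type. Resetting the average to zero on a configuration update is likewise an assumption about how the reset to `Healthy` mentioned by the reviewer could be modeled.

```python
from dataclasses import dataclass

# Hypothetical values: the source gives neither the smoothing factor
# nor the thresholds.
ALPHA = 0.3
IMPAIRED_THRESHOLD = 0.10
FAILED_THRESHOLD = 0.50


@dataclass
class TaskHealth:
    error_rate: float = 0.0  # running EWMA of failed / attempted items

    def record_batch(self, failed: int, attempted: int) -> str:
        """Fold one completed batch into the average and return the state."""
        if attempted > 0:
            ratio = failed / attempted
            # Time-agnostic: each batch carries the same weight no matter
            # how much wall-clock time passed since the previous one.
            self.error_rate = ALPHA * ratio + (1 - ALPHA) * self.error_rate
        return self.state()

    def state(self) -> str:
        if self.error_rate >= FAILED_THRESHOLD:
            return "Failed"
        if self.error_rate >= IMPAIRED_THRESHOLD:
            return "Impaired"
        return "Healthy"

    def on_configuration_update(self) -> None:
        # Assumed model of the reset to Healthy on a task configuration update.
        self.error_rate = 0.0
```

Because the average decays only as clean batches complete, recovery passes through `Impaired` on the way back to `Healthy` without any manual reset.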

arekpalinski · 2026-05-25T12:33:44Z

+  in the [HTTP endpoints](../../server/troubleshooting/debug-routes.mdx#debug-endpoints),
+  in the [SNMP OIDs](../../server/administration/snmp/snmp-overview.mdx#list-of-oids),
+  in the [Prometheus metrics](../../server/administration/monitoring/prometheus.mdx#metrics-provided-by-the-prometheus-endpoint),
+  and in the [JSON monitoring endpoints](../../server/administration/monitoring/telegraf.mdx#monitoring-endpoints).  


JSON monitoring endpoints

Is this what we officially call this feature? I thought it's Monitoring endpoints (https://docs.ravendb.net/7.2/server/administration/monitoring/telegraf#monitoring-endpoints)

RDoc-3843_taskErrors - document new task errors views and functionality

140fdd1

reebhub requested a review from Lwiel May 19, 2026 01:54

RDoc-3854 - reference task errors pages from AI task pages (GenAI, ge…

1c33dd2

…nerating embeddings)

Lwiel reviewed May 20, 2026

reebhub added 3 commits May 20, 2026 23:19

RDoc-3861 - update Ongoing Tasks overview (studio/database/tasks/ongo…

c48dce4

…ing-tasks/general-info) to use it as a reference from task errors pages

RDoc-3843 - fixed by review comments

ad26b23

RDoc-3851 - Document new task monitoring indicators on the Ongoing Ta…

34e86b0

…sks view

Lwiel reviewed May 22, 2026

reebhub added 2 commits May 22, 2026 14:28

RDoc-3843 - fixed by review comment: two separate error files (ETL, A…

0097e7f

…I) for debug package

RDoc-3843 - dropped mentions of AI tasks being based on the ETL model

c20f1a1

Lwiel approved these changes May 22, 2026

arekpalinski reviewed May 25, 2026

