[PLUGIN-1950] Log warning when a table has no records to read #73

Open
psainics wants to merge 1 commit into data-integrations:develop from cloudsufi:feat/plugin-1950
psainics (Contributor) commented Apr 4, 2026

Log warning when a table has no records to read

Jira: PLUGIN-1950

Description

When the multi-table source reads from tables that contain zero rows, the lack of output can be confusing to debug. This adds a WARN-level log line with the table name in both DBTableRecordReader and SQLStatementRecordReader so operators can quickly identify empty tables/splits.

psainics self-assigned this Apr 6, 2026
psainics added the build label Apr 6, 2026
psainics (Contributor Author) commented Apr 6, 2026

[image]

}
if (!results.next()) {
  if (pos == 0) {
    LOG.warn("Table '{}' had no records to read.", tableName.getTable());
This is 100% safe only when there is 1 split.

Imagine a table with 2,000 records, where the primary key ID has a massive gap. The records exist from ID = 1 to 1000, and from ID = 3001 to 4000.

If your job calculates splits in chunks of 1000 IDs, you might get the following splits:

Split 1 (ID 1 - 1000): Has 1000 records.

Split 2 (ID 1001 - 2000): Has 0 records (Empty split).

Split 3 (ID 2001 - 3000): Has 0 records (Empty split).

Split 4 (ID 3001 - 4000): Has 1000 records.

When the RecordReader for Split 2 runs, its query returns an empty ResultSet. Since it is the first read attempt for that split (pos == 0), it will log "Table 'MyTable' had no records to read." even though the table actually contains 2,000 records.
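To make the corner case concrete, here is a small standalone simulation. It is only a sketch: `EmptySplitDemo`, `emptySplits`, and `gappedIds` are hypothetical names, not plugin code, and it assumes fixed-size ID ranges and a second run of IDs starting at 3001 so that it matches the four splits listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical simulation for illustration; not code from the plugin.
class EmptySplitDemo {

  /** Returns the 1-based numbers of the splits whose ID range holds no records. */
  static List<Integer> emptySplits(TreeSet<Integer> ids, int minId, int maxId, int chunk) {
    List<Integer> empty = new ArrayList<>();
    int splitNo = 1;
    for (int lo = minId; lo <= maxId; lo += chunk, splitNo++) {
      int hi = Math.min(lo + chunk - 1, maxId);
      // subSet(lo, true, hi, true) is a view of the inclusive range [lo, hi].
      if (ids.subSet(lo, true, hi, true).isEmpty()) {
        empty.add(splitNo);
      }
    }
    return empty;
  }

  /** Builds the example table: IDs 1..1000 and 3001..4000 (assumed bounds). */
  static TreeSet<Integer> gappedIds() {
    TreeSet<Integer> ids = new TreeSet<>();
    for (int id = 1; id <= 1000; id++) {
      ids.add(id);
    }
    for (int id = 3001; id <= 4000; id++) {
      ids.add(id);
    }
    return ids;
  }

  public static void main(String[] args) {
    TreeSet<Integer> ids = gappedIds();
    // Each empty split would emit the per-split warning.
    for (int split : emptySplits(ids, 1, 4000, 1000)) {
      System.out.printf("Split %d: Table 'MyTable' had no records to read.%n", split);
    }
    System.out.println("Rows actually in table: " + ids.size());
  }
}
```

Under these assumptions the warning would fire for Splits 2 and 3 while the table still holds 2,000 rows.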

So this corner case exists.

I am not sure we can solve this at the split level unless we make sure the number of splits is 1, or run a SELECT 1 FROM query somewhere at a higher level.
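One possible shape for that higher-level check, as a hedged sketch only: `TableEmptinessCheck`, `probeQuery`, and `hasRows` are hypothetical names, and the code assumes the caller already has an open JDBC `Connection` and a trusted, already-quoted table identifier. It caps the result with the standard `Statement.setMaxRows` call instead of a dialect-specific LIMIT or TOP clause; whether the database stops scanning early is driver-dependent.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration; not part of the plugin.
class TableEmptinessCheck {

  /** Builds the probe query. The identifier must already be quoted and trusted. */
  static String probeQuery(String quotedTable) {
    return "SELECT 1 FROM " + quotedTable;
  }

  /**
   * Returns true if the table has at least one row. Meant to run once per
   * table, before any split is read, so an empty split cannot trigger it.
   */
  static boolean hasRows(Connection conn, String quotedTable) throws SQLException {
    try (Statement stmt = conn.createStatement()) {
      stmt.setMaxRows(1); // ask the driver for at most one row
      try (ResultSet rs = stmt.executeQuery(probeQuery(quotedTable))) {
        return rs.next();
      }
    }
  }
}
```

A caller could log the warning once per table when `hasRows` returns false, at the cost of one extra query per table.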

Contributor Author

The SQL query is a single split; for the table, I am logging for each split.

psainics requested a review from sahusanket April 7, 2026 08:14
When the multi-table source reads from tables that contain zero rows,
the lack of output can be confusing to debug. This adds a WARN-level
log line with the table name in both DBTableRecordReader and
SQLStatementRecordReader so operators can quickly identify empty tables.