Skip to content

fix(DRC-2652): Handle missing tables gracefully in row count checks#1104

Open
seer-by-sentry[bot] wants to merge 10 commits intomainfrom
seer/fix/rowcount-missing-table-handling
Open

fix(DRC-2652): Handle missing tables gracefully in row count checks#1104
seer-by-sentry[bot] wants to merge 10 commits intomainfrom
seer/fix/rowcount-missing-table-handling

Conversation

@seer-by-sentry
Copy link

@seer-by-sentry seer-by-sentry bot commented Feb 5, 2026

PR checklist

  • Ensure you have added or ran the appropriate tests for your PR.
  • DCO signed

What type of PR is this?
Bug Fix

What this PR does / why we need it:
The issue was that: Base dbt manifest's 'mart_visits_incremental' model points to a non-existent Snowflake table in the base environment's schema.

  • Added logging to the rowcount.py module.
  • Implemented error handling in the _query_row_count method to catch database errors indicating a missing table or permission issues.
  • If a table is not found in the database, a warning is logged, and None is returned for the row count instead of raising an exception.
  • This improves the robustness of row count checks, preventing failures when a dbt model's corresponding table is absent in the database.

Which issue(s) this PR fixes:
Fixes RECCE-73M

Special notes for your reviewer:
This fix was generated by Seer in Sentry, triggered by Kent Chen. 👁️ Run ID: 9747187

Not quite right? Click here to continue debugging with Seer.

Does this PR introduce a user-facing change?:
NONE

@codecov
Copy link

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 89.32584% with 19 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
recce/tasks/rowcount.py 68.85% 19 Missing ⚠️
Files with missing lines Coverage Δ
recce/mcp_server.py 52.40% <ø> (ø)
tests/tasks/test_row_count.py 100.00% <100.00%> (ø)
recce/tasks/rowcount.py 74.01% <68.85%> (+27.24%) ⬆️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@iamcxa iamcxa self-assigned this Feb 5, 2026
iamcxa and others added 3 commits February 5, 2026 16:05
Improve row count diff results to include detailed status information
for each model, enabling agents to understand and report configuration
issues to users.

Changes:
- Add RowCountStatus class with status codes (ok, not_in_manifest,
  unsupported_resource_type, unsupported_materialization, table_not_found)
- Return structured results: {count, status, message?} instead of int|None
- Update MCP tool description to document response format and status codes
- Emphasize table_not_found status requires user notification about
  stale dbt artifacts or environment misconfiguration
- Add backward compatibility in RowCountDiffResultDiffer

Fixes RECCE-73M

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Add tests to improve coverage for row count status feature:
- Test _make_row_count_result helper function
- Test RowCountStatus values
- Test RowCountDiffResultDiffer with new and legacy formats
- Test RowCountDiffResultDiffer with None counts
- Test RowCountTask (current environment only)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Additional test coverage for:
- Unsupported resource type status
- Unsupported materialization status
- Table not found status format
- RowCountDiffResultDiffer._get_related_node_ids with various params
- RowCountDiffResultDiffer._get_changed_nodes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
@iamcxa iamcxa marked this pull request as ready for review February 5, 2026 08:37
@kiloconnect
Copy link

kiloconnect bot commented Feb 5, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
SUGGESTION 1
Issue Details (click to expand)

SUGGESTION

File Line Issue
recce/tasks/rowcount.py 69 The _query_row_count method is duplicated between RowCountTask and RowCountDiffTask
Files Reviewed (3 files)
  • recce/mcp_server.py - No issues
  • recce/tasks/rowcount.py - 1 suggestion
  • tests/tasks/test_row_count.py - No issues

iamcxa and others added 3 commits February 5, 2026 23:16
List valid state selectors in tool description to prevent Agent
from using invalid selectors like 'state:added' (should be 'state:new').

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
@iamcxa iamcxa changed the title fix(checks): Handle missing tables gracefully in row count checks fix(DRC-2652): Handle missing tables gracefully in row count checks Feb 5, 2026
@iamcxa iamcxa requested a review from wcchang1115 February 5, 2026 20:03
Copy link
Collaborator

@wcchang1115 wcchang1115 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Kent,
The row count diff result will be consumed by the recce server frontend and the recce summary.
So the result structure needs to be considered.

self.check_cancel()
result[node] = {
"curr": row_count,
"curr": curr_result,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The row count result structure is changed.
And it will affect the result sent to the frontend.
So the row count diff check cannot work in recce server.

result[node] = {
"base": base_row_count,
"curr": curr_row_count,
"base": base_result,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to consider the structure change for the result consumer.

iamcxa and others added 2 commits February 6, 2026 16:44
Address PR review feedback:
- Move duplicated _query_row_count method from RowCountTask and
  RowCountDiffTask to a shared module-level function
- Update both classes to use the shared function
- Update test to match new function location

The result structure change already has backward compatibility
handling in RowCountDiffResultDiffer (lines 388-396) which handles
both the new structured format and legacy format.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Address code review feedback:

1. Convert RowCountStatus to str Enum for type safety while
   maintaining JSON serialization compatibility

2. Add type hints to _query_row_count function parameters:
   - dbt_adapter: Any
   - model_name: str
   - base: bool

3. Replace overly broad "NOT FOUND" error pattern with more
   specific patterns to avoid matching unrelated errors:
   - RELATION DOES NOT EXIST (PostgreSQL)
   - OBJECT DOES NOT EXIST (Snowflake)
   - INVALID OBJECT NAME (SQL Server)

4. Fix bug: self.node_ids -> self.params.node_ids in execute_sqlmesh

5. Minor: Use `in` operator for resource_type check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the row count check functionality to gracefully handle missing database tables by returning structured error information instead of raising exceptions. The change was triggered by a Sentry error where a dbt manifest referenced a table that didn't exist in the Snowflake database.

Changes:

  • Introduced structured status codes (RowCountStatus enum) to communicate why row counts may be unavailable
  • Implemented comprehensive error handling to catch and classify database errors (table not found, permission issues, etc.)
  • Maintained backward compatibility with legacy result format while supporting new structured format
  • Enhanced MCP server documentation to help AI agents interpret status codes
  • Fixed a bug in SQLMesh adapter where self.node_ids should have been self.params.node_ids

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
recce/tasks/rowcount.py Core implementation: Added RowCountStatus enum, _make_row_count_result helper, refactored _query_row_count to return structured results with error handling, updated tasks to use new format, added backward compatibility in result differ, fixed SQLMesh node_ids access bug
tests/tasks/test_row_count.py Comprehensive test coverage: Updated existing tests for new format, added tests for status codes, backward compatibility, None handling, and helper functions
recce/mcp_server.py Documentation improvements: Enhanced tool descriptions with detailed status code explanations and dbt selector syntax guidance for MCP agents

Comment on lines +301 to +314
def test_query_row_count_table_not_found():
"""Test _query_row_count when table doesn't exist in database."""
# Simulate the TABLE_NOT_FOUND status result
result = _make_row_count_result(
status=RowCountStatus.TABLE_NOT_FOUND,
message="Table 'SCHEMA.model' not found in base database. "
"The model is defined in the dbt manifest but the table doesn't exist. "
"This may indicate stale dbt artifacts or an environment configuration issue.",
)

assert result["status"] == RowCountStatus.TABLE_NOT_FOUND
assert result["count"] is None
assert "stale dbt artifacts" in result["message"]
assert "environment configuration" in result["message"]
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for TABLE_NOT_FOUND status only verify the result format created by _make_row_count_result, but don't actually test the error detection logic in the exception handler (lines 107-137 in rowcount.py). Consider adding integration tests that actually trigger database errors (e.g., by creating a model reference to a non-existent table) to verify the error patterns correctly identify table-not-found scenarios and don't produce false positives for other error types.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point. Adding integration tests that trigger actual database errors requires more test infrastructure setup. I'll track this as a follow-up task for a separate PR.

from recce.tasks.core import CheckValidator, TaskResultDiffer
from recce.tasks.query import QueryMixin

logger = logging.getLogger(__name__)
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger is instantiated with logging.getLogger(__name__) but most of the codebase uses logging.getLogger("uvicorn") for consistency with the FastAPI server logging. Consider changing to logger = logging.getLogger("uvicorn") to align with the established logging convention seen in recce/adapter/base.py, recce/state/local.py, and recce/core.py.

Suggested change
logger = logging.getLogger(__name__)
logger = logging.getLogger("uvicorn")

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logging.getLogger(__name__) is standard Python best practice - it creates hierarchical logger names (e.g., recce.tasks.rowcount) for easier debugging and per-module log level control. The uvicorn logger is for the web server layer, not task modules.

Changes based on Copilot review:

1. Add PERMISSION_DENIED status code to distinguish authorization
   errors from missing table errors (#5)

2. Add defensive check for node.config.materialized is not None
   to handle edge cases (#2)

3. Remove unnecessary raw string prefix from SQL template (#3)

4. Fix SQLMesh initialization - only set TABLE_NOT_FOUND after
   query actually fails, not as misleading default (#7)

5. Add test for PERMISSION_DENIED status

Skipped suggestions:
- #1: Integration tests (better as separate PR)
- #4: Logger convention (__name__ is standard Python)
- #6: Error pattern specific enough for count(*) context

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Kent <iamcxa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants