-
Notifications
You must be signed in to change notification settings - Fork 24
Optimize database queries for friends balance summary and enrich member data #211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for split-but-wiser canceled.
|
WalkthroughReplaced per-item DB lookups with aggregated MongoDB pipelines and batch fetches in expenses and groups services; added two async DB migrations for indexes and duplicate-email remediation; updated tests to reflect the aggregation-based friends-balance flow and added fallback/error-handling behavior. Changes
Possibly related PRs
Suggested reviewers
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
backend/app/expenses/service.py(1 hunks)backend/app/groups/service.py(1 hunks)backend/migrations/001_create_indexes.py(1 hunks)backend/migrations/002_fix_duplicate_emails.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/app/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
backend/app/**/*.py: Backend API routes should be added to the appropriate service router file inbackend/app/based on the feature area (auth, user, groups, expenses)
Use RESTful API design patterns with FastAPI in the backend
Use MongoDB for persistent data storage in the backend with a nonrelational schema design
Files:
backend/app/groups/service.pybackend/app/expenses/service.py
🧠 Learnings (1)
📚 Learning: 2025-11-27T18:33:22.596Z
Learnt from: CR
Repo: Devasy23/splitwiser PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T18:33:22.596Z
Learning: Applies to backend/app/**/*.py : Use MongoDB for persistent data storage in the backend with a nonrelational schema design
Applied to files:
backend/migrations/001_create_indexes.py
🧬 Code graph analysis (1)
backend/app/groups/service.py (2)
backend/app/auth/service.py (1)
get_db(77-81)backend/app/user/service.py (1)
get_db(13-14)
🪛 GitHub Actions: Run Backend Tests & Analytics
backend/app/expenses/service.py
[error] 1096-1096: KeyError: 'totalBalance' when accessing result in get_friends_balance_summary. The function expects 'totalBalance' in the result but it is missing.
🪛 Ruff (0.14.7)
backend/app/groups/service.py
70-70: Do not catch blind exception: Exception
(BLE001)
backend/migrations/001_create_indexes.py
52-52: f-string without any placeholders
Remove extraneous f prefix
(F541)
55-55: f-string without any placeholders
Remove extraneous f prefix
(F541)
61-61: Do not catch blind exception: Exception
(BLE001)
backend/migrations/002_fix_duplicate_emails.py
68-68: f-string without any placeholders
Remove extraneous f prefix
(F541)
98-98: Use of datetime.datetime.min without timezone information
(DTZ901)
117-117: f-string without any placeholders
Remove extraneous f prefix
(F541)
119-119: f-string without any placeholders
Remove extraneous f prefix
(F541)
142-142: f-string without any placeholders
Remove extraneous f prefix
(F541)
264-264: Do not catch blind exception: Exception
(BLE001)
backend/app/expenses/service.py
959-959: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
960-960: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
1055-1055: Do not catch blind exception: Exception
(BLE001)
1073-1073: Unnecessary generator (rewrite as a set comprehension)
Rewrite as a set comprehension
(C401)
1086-1086: Do not catch blind exception: Exception
(BLE001)
1132-1132: datetime.datetime.utcnow() used
(DTZ003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: preview
🔇 Additional comments (11)
backend/app/groups/service.py (2)
27-52: LGTM on the batch ID collection logic.The early return for empty members and the fallback path when no valid user IDs exist are well-handled. Good defensive coding with the
InvalidIdexception handling.
75-118: LGTM on the enrichment logic and fallback helper.The lookup map approach provides O(1) access per member, and the fallback helper ensures consistent data structure when user lookup fails.
backend/migrations/002_fix_duplicate_emails.py (3)
29-80: LGTM on the duplicate detection logic.The aggregation pipeline correctly identifies duplicate emails by grouping and filtering for count > 1. Good logging provides visibility into the duplicates found.
127-193: Interactive mode shares the same orphan data concern.The same cascading deletion issue applies here. Consider consolidating the deletion logic into a shared helper that handles related data cleanup.
196-268: LGTM on the main function structure.Proper argument parsing, confirmation prompt for destructive operations, and guaranteed client cleanup in the
finallyblock. The broad exception catch at the top level is acceptable for migration scripts.backend/migrations/001_create_indexes.py (2)
45-65: Good fallback handling for unique email index.The fallback to a non-unique index when duplicates exist ensures the migration doesn't fail outright, while still providing query performance benefits. This pairs well with the new
002_fix_duplicate_emails.pymigration.
282-285: LGTM on the migration entry point and structure.The script provides comprehensive logging, verification, and proper resource cleanup. This follows good migration script patterns.
backend/app/expenses/service.py (4)
955-981: LGTM on early return handling.The early exit when the user has no groups avoids unnecessary queries and returns a consistent zeroed structure. The optimization documentation is helpful.
993-1048: Aggregation pipeline structure is correct.The four-stage pipeline efficiently calculates per-friend balances across groups. The
$condlogic correctly determines balance direction based on payer/payee roles.
1050-1070: Good error handling for aggregation execution.The try/except with fallback to empty results ensures the endpoint remains functional even if the aggregation fails. The early return for empty results avoids unnecessary processing.
1101-1116: LGTM on breakdown construction.The safe access with
.get("groups", [])and the balance threshold filtering are correctly implemented.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #211 +/- ##
==========================================
+ Coverage 72.79% 74.42% +1.62%
==========================================
Files 17 17
Lines 1669 1693 +24
Branches 154 158 +4
==========================================
+ Hits 1215 1260 +45
+ Misses 400 381 -19
+ Partials 54 52 -2
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
backend/tests/expenses/test_expense_service.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
backend/tests/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Backend tests should be located in
/backend/tests/and run with pytest
Files:
backend/tests/expenses/test_expense_service.py
🧬 Code graph analysis (1)
backend/tests/expenses/test_expense_service.py (1)
backend/tests/conftest.py (1)
mock_db(65-96)
🪛 Ruff (0.14.7)
backend/tests/expenses/test_expense_service.py
1615-1615: Missing return type annotation for private function sync_mock_settlements_aggregate_cursor_factory
(ANN202)
1615-1615: Unused function argument: pipeline
(ARG001)
1615-1615: Missing type annotation for *args
(ANN002)
1615-1615: Unused function argument: args
(ARG001)
1615-1615: Missing type annotation for **kwargs
(ANN003)
1615-1615: Unused function argument: kwargs
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: preview
🔇 Additional comments (2)
backend/tests/expenses/test_expense_service.py (2)
1605-1643: LGTM! Test correctly reflects optimized aggregation logic.The mock data structure has been appropriately updated to reflect the single aggregate call optimization. The comments clearly explain how the new approach consolidates all friends' balances into one result set, and the mock structure correctly represents the unified response format with
_id(friend ID),totalBalance, and per-groupgroupsarray.
1734-1736: Excellent! Assertion correctly validates the optimization.The updated assertion confirms that the settlement aggregation now uses a single database call instead of multiple calls, which is the core improvement described in the PR objectives. The comment clearly documents this optimization.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…e balance retrieval
…riends' balance summary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
backend/app/expenses/service.py(2 hunks)backend/app/groups/service.py(1 hunks)backend/tests/expenses/test_expense_service.py(5 hunks)backend/tests/groups/test_enrich_members.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
backend/app/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
backend/app/**/*.py: Backend API routes should be added to the appropriate service router file inbackend/app/based on the feature area (auth, user, groups, expenses)
Use RESTful API design patterns with FastAPI in the backend
Use MongoDB for persistent data storage in the backend with a nonrelational schema design
Files:
backend/app/expenses/service.pybackend/app/groups/service.py
backend/tests/**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Backend tests should be located in
/backend/tests/and run with pytest
Files:
backend/tests/expenses/test_expense_service.pybackend/tests/groups/test_enrich_members.py
🧬 Code graph analysis (2)
backend/app/groups/service.py (2)
backend/app/auth/service.py (1)
get_db(77-81)backend/app/user/service.py (1)
get_db(13-14)
backend/tests/expenses/test_expense_service.py (1)
backend/app/expenses/service.py (1)
get_friends_balance_summary(955-1156)
🪛 Ruff (0.14.7)
backend/app/expenses/service.py
959-959: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
960-960: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?
(RUF002)
1055-1055: Do not catch blind exception: Exception
(BLE001)
1086-1086: Do not catch blind exception: Exception
(BLE001)
backend/app/groups/service.py
70-70: Do not catch blind exception: Exception
(BLE001)
backend/tests/expenses/test_expense_service.py
1617-1617: Dynamically typed expressions (typing.Any) are disallowed in _pipeline
(ANN401)
1617-1617: Dynamically typed expressions (typing.Any) are disallowed in *_args
(ANN401)
1617-1617: Dynamically typed expressions (typing.Any) are disallowed in **_kwargs
(ANN401)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: preview
🔇 Additional comments (21)
backend/app/groups/service.py (4)
27-35: LGTM on the docstring and early return.Good documentation with clear performance expectations (O(1) queries). The early return for empty members avoids unnecessary database calls.
40-52: LGTM on user ID extraction with proper error handling.The code correctly handles invalid ObjectIds by logging warnings and collecting only valid IDs. Returning fallback members when no valid IDs exist is a reasonable defensive approach.
75-103: LGTM on the enrichment loop.The batch lookup approach correctly enriches members from the pre-built map with appropriate fallbacks for missing users. The O(1) dictionary lookups per member provide good performance.
106-118: LGTM on the fallback member helper.The helper correctly handles edge cases including short user IDs (length < 4) and missing userId fields. The conditional slicing
user_id[-4:] if len(user_id) >= 4 else user_idis a nice defensive pattern.backend/tests/expenses/test_expense_service.py (6)
1616-1647: LGTM on the mock factory function.The function signature with underscore-prefixed unused parameters (
_pipeline,_args,_kwargs) and return type annotation addresses the static analysis concerns from the previous review. The mock correctly returns aggregated results for all friends in a single response, matching the optimized implementation.
1737-1739: Good assertion verifying the batch optimization.The test correctly validates that
settlements.aggregateis called exactly once, confirming the optimization from N×M queries to a single aggregation pipeline.
2112-2145: Good test coverage for aggregation error handling.This test validates that when the MongoDB aggregation fails, the service returns empty results with zeroed summary values rather than propagating the exception. This ensures resilience.
2147-2191: Good test for graceful degradation when user fetch fails.The test correctly validates that balance data is still returned with "Unknown" user names when the batch user fetch fails. This matches the error handling in the service.
2193-2227: Good test for zero-balance filtering edge case.This validates the aggregation pipeline's
$matchfilter that excludes friends with near-zero balances (abs < 0.01).
2229-2276: Good test for negative balance calculation.This test validates the
totalYouOweaccumulation path when balances are negative, ensuring correct summary calculations for amounts the user owes.backend/tests/groups/test_enrich_members.py (6)
17-71: LGTM on the success path test.The test correctly validates batch enrichment by verifying:
- All 3 members are enriched with correct user data
- The query uses
$inoperator for batch fetching- Only one database call is made
72-82: LGTM on empty list test.Good coverage of the early return path - verifies no database call is made for empty input.
84-108: LGTM on invalid ObjectId test.Good coverage verifying logger warnings are called and fallback members are returned when all ObjectIds are invalid.
237-272: Excellent batch optimization verification test.This test with 10 members explicitly validates the O(1) query optimization by asserting exactly one database call regardless of member count.
274-310: LGTM on fallback member helper tests.Good coverage of edge cases: short user IDs, long user IDs (last 4 chars), and missing userId defaulting to "unknown".
199-235: Unable to verify: Test file and implementation not found in repository.The test file
backend/tests/groups/test_enrich_members.pyand the_enrich_members_with_user_detailsimplementation are not present in the repository. Without access to the actual code, the review comment's claims about implementation behavior and data loss cannot be verified.backend/app/expenses/service.py (5)
967-981: LGTM on early exit for users with no groups.Good defensive handling that returns a properly structured empty response when the user isn't in any groups.
993-1048: Well-designed aggregation pipeline.The pipeline efficiently computes friend balances in a single query:
- Matches settlements in user's groups involving the user
- Groups by friend+group to compute per-group balances
- Regroups by friend for total balance with breakdown
- Filters out near-zero balances
The sign convention (positive = friend owes user, negative = user owes friend) is consistent throughout.
1079-1088: Good batch fetch with proper error handling.The code correctly batch-fetches friend details and falls back to an empty map on failure, ensuring the method can still return partial results with "Unknown" user names.
1094-1096: Good defensive access pattern.Using
result.get("totalBalance", 0)addresses the previous review's KeyError concern and handles edge cases where the aggregation might not produce expected fields.
1145-1156: LGTM on the response structure.The summary correctly aggregates totals and the response structure matches the documented return type. The rounding to 2 decimal places ensures clean financial values.
Enhance performance by optimizing database queries for enriching member data, reducing the number of queries from multiple to a single batch fetch. This change improves efficiency and ensures faster data retrieval for user details.
Summary by CodeRabbit
Bug Fixes
Performance
Chores
Tests
✏️ Tip: You can customize this high-level summary in your review settings.