Optimize database queries for friends balance summary and enrich member data #211

Devasy · 2025-12-07T09:01:54Z

Enhance performance by optimizing database queries for enriching member data, reducing the number of queries from multiple to a single batch fetch. This change improves efficiency and ensures faster data retrieval for user details.

Summary by CodeRabbit

Bug Fixes
- Resolved duplicate email conflicts to ensure account consistency and safe cleanup options.
Performance
- Consolidated friend balance computations into a single aggregated query with batched enrichment for faster, fewer round-trips.
- Batch-enriched group member lookups to reduce per-member queries.
Chores
- Added database index migration with verification and safe fallback handling.
Tests
- Updated and expanded tests to cover aggregated balances, enrichment fallbacks, and error paths.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…h member data

netlify · 2025-12-07T09:01:58Z

✅ Deploy Preview for split-but-wiser canceled.

Name	Link
🔨 Latest commit	`790ae8a`
🔍 Latest deploy log	https://app.netlify.com/projects/split-but-wiser/deploys/69354a0cdb7e31000853de82

coderabbitai · 2025-12-07T09:02:02Z

Walkthrough

Replaced per-item DB lookups with aggregated MongoDB pipelines and batch fetches in expenses and groups services; added two async DB migrations for indexes and duplicate-email remediation; updated tests to reflect the aggregation-based friends-balance flow and added fallback/error-handling behavior.

Changes

Cohort / File(s)	Summary
Expenses aggregation `backend/app/expenses/service.py`	Rewrote `get_friends_balance_summary` to use a single MongoDB aggregation to compute per-friend totals and per-group breakdowns, filter near-zero balances, batch-fetch friend details, build friend result objects (including rounded balances and breakdowns), compute aggregated summary totals, and add error handling and empty-result fallbacks.
Group member enrichment `backend/app/groups/service.py`	Reworked `_enrich_members_with_user_details` to batch-fetch users with one `$in` query, build a lookup map, enrich members from that map, preserve members without `userId`, and added a `_create_fallback_member(member: dict) -> dict` helper for consistent fallback enrichment.
Migrations `backend/migrations/001_create_indexes.py`, `backend/migrations/002_fix_duplicate_emails.py`	Added `001_create_indexes.py` (async `create_indexes()` and `drop_indexes_if_needed`) to create/verify indexes across multiple collections with error handling and logging; added `002_fix_duplicate_emails.py` with `find_duplicate_emails`, `fix_duplicates_auto`, `fix_duplicates_interactive`, and `main()` to detect and optionally resolve duplicate `users.email` entries (dry-run/interactive/auto modes).
Tests `backend/tests/expenses/test_expense_service.py`, `backend/tests/groups/test_enrich_members.py`	Updated expense service tests to expect a single aggregated `aggregate()` call and added cases for aggregation/user-fetch errors, zero-balance filtering, and negative balances; added comprehensive tests for `_enrich_members_with_user_details` covering batch-fetching, fallback creation, invalid IDs, missing users, and DB errors.

Possibly related PRs

feat: Add expense service with endpoints for creating, listing, and managing expenses #23 — Significant refactor touching get_friends_balance_summary and ExpenseService aggregation/enrichment logic (direct code overlap).
Add group settings functionality and enhance group details screen #128 — Overlaps with modifications to get_friends_balance_summary, including adding user imageUrl in aggregated friend results.
Implement friends balance retrieval and enhance HomeScreen actions #114 — Frontend changes that consume a server-side aggregated friends-balance endpoint; aligns with the new backend aggregation output.

Suggested reviewers

vrajpatelll

Pre-merge checks

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: optimizing database queries for friends balance summary and enriching member data with batch fetches instead of per-item lookups.
Docstring Coverage	✅ Passed	Docstring coverage is 96.67% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 7

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8a6d92 and 83c4d1b.

📒 Files selected for processing (4)

backend/app/expenses/service.py (1 hunks)
backend/app/groups/service.py (1 hunks)
backend/migrations/001_create_indexes.py (1 hunks)
backend/migrations/002_fix_duplicate_emails.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

backend/app/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

backend/app/**/*.py: Backend API routes should be added to the appropriate service router file in backend/app/ based on the feature area (auth, user, groups, expenses)
Use RESTful API design patterns with FastAPI in the backend
Use MongoDB for persistent data storage in the backend with a nonrelational schema design

Files:

backend/app/groups/service.py
backend/app/expenses/service.py

🧠 Learnings (1)

📚 Learning: 2025-11-27T18:33:22.596Z

Learnt from: CR
Repo: Devasy23/splitwiser PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T18:33:22.596Z
Learning: Applies to backend/app/**/*.py : Use MongoDB for persistent data storage in the backend with a nonrelational schema design

Applied to files:

backend/migrations/001_create_indexes.py

🧬 Code graph analysis (1)

backend/app/groups/service.py (2)

backend/app/auth/service.py (1)

get_db (77-81)

backend/app/user/service.py (1)

get_db (13-14)

🪛 GitHub Actions: Run Backend Tests & Analytics

backend/app/expenses/service.py

[error] 1096-1096: KeyError: 'totalBalance' when accessing result in get_friends_balance_summary. The function expects 'totalBalance' in the result but it is missing.

🪛 Ruff (0.14.7)

backend/app/groups/service.py

70-70: Do not catch blind exception: Exception

(BLE001)

backend/migrations/001_create_indexes.py

52-52: f-string without any placeholders

Remove extraneous f prefix

(F541)

55-55: f-string without any placeholders

Remove extraneous f prefix

(F541)

61-61: Do not catch blind exception: Exception

(BLE001)

backend/migrations/002_fix_duplicate_emails.py

68-68: f-string without any placeholders

Remove extraneous f prefix

(F541)

98-98: Use of datetime.datetime.min without timezone information

(DTZ901)

117-117: f-string without any placeholders

Remove extraneous f prefix

(F541)

119-119: f-string without any placeholders

Remove extraneous f prefix

(F541)

142-142: f-string without any placeholders

Remove extraneous f prefix

(F541)

264-264: Do not catch blind exception: Exception

(BLE001)

backend/app/expenses/service.py

959-959: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

960-960: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

1055-1055: Do not catch blind exception: Exception

(BLE001)

1073-1073: Unnecessary generator (rewrite as a set comprehension)

Rewrite as a set comprehension

(C401)

1086-1086: Do not catch blind exception: Exception

(BLE001)

1132-1132: datetime.datetime.utcnow() used

(DTZ003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: preview

🔇 Additional comments (11)

backend/app/groups/service.py (2)

27-52: LGTM on the batch ID collection logic.

The early return for empty members and the fallback path when no valid user IDs exist are well-handled. Good defensive coding with the InvalidId exception handling.

75-118: LGTM on the enrichment logic and fallback helper.

The lookup map approach provides O(1) access per member, and the fallback helper ensures consistent data structure when user lookup fails.

backend/migrations/002_fix_duplicate_emails.py (3)

29-80: LGTM on the duplicate detection logic.

The aggregation pipeline correctly identifies duplicate emails by grouping and filtering for count > 1. Good logging provides visibility into the duplicates found.

127-193: Interactive mode shares the same orphan data concern.

The same cascading deletion issue applies here. Consider consolidating the deletion logic into a shared helper that handles related data cleanup.

196-268: LGTM on the main function structure.

Proper argument parsing, confirmation prompt for destructive operations, and guaranteed client cleanup in the finally block. The broad exception catch at the top level is acceptable for migration scripts.

backend/migrations/001_create_indexes.py (2)

45-65: Good fallback handling for unique email index.

The fallback to a non-unique index when duplicates exist ensures the migration doesn't fail outright, while still providing query performance benefits. This pairs well with the new 002_fix_duplicate_emails.py migration.

282-285: LGTM on the migration entry point and structure.

The script provides comprehensive logging, verification, and proper resource cleanup. This follows good migration script patterns.

backend/app/expenses/service.py (4)

955-981: LGTM on early return handling.

The early exit when the user has no groups avoids unnecessary queries and returns a consistent zeroed structure. The optimization documentation is helpful.

993-1048: Aggregation pipeline structure is correct.

The four-stage pipeline efficiently calculates per-friend balances across groups. The $cond logic correctly determines balance direction based on payer/payee roles.

1050-1070: Good error handling for aggregation execution.

The try/except with fallback to empty results ensures the endpoint remains functional even if the aggregation fails. The early return for empty results avoids unnecessary processing.

1101-1116: LGTM on breakdown construction.

The safe access with .get("groups", []) and the balance threshold filtering are correctly implemented.

backend/app/expenses/service.py

backend/app/groups/service.py

backend/migrations/001_create_indexes.py

backend/migrations/002_fix_duplicate_emails.py

codecov · 2025-12-07T09:07:15Z

Codecov Report

❌ Patch coverage is 98.41270% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 74.42%. Comparing base (c8a6d92) to head (790ae8a).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #211      +/-   ##
==========================================
+ Coverage   72.79%   74.42%   +1.62%     
==========================================
  Files          17       17              
  Lines        1669     1693      +24     
  Branches      154      158       +4     
==========================================
+ Hits         1215     1260      +45     
+ Misses        400      381      -19     
+ Partials       54       52       -2

Components	Coverage Δ
Authentication System	`75.45% <ø> (ø)`
Expense Management	`70.64% <97.50%> (+0.76%)`	⬆️
Group Management	`76.20% <100.00%> (+6.70%)`	⬆️
User Management	`97.16% <ø> (ø)`
Backend Core	`69.41% <ø> (ø)`

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83c4d1b and 171147e.

📒 Files selected for processing (1)

backend/tests/expenses/test_expense_service.py (3 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

backend/tests/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Backend tests should be located in /backend/tests/ and run with pytest

Files:

backend/tests/expenses/test_expense_service.py

🧬 Code graph analysis (1)

backend/tests/expenses/test_expense_service.py (1)

backend/tests/conftest.py (1)

mock_db (65-96)

🪛 Ruff (0.14.7)

backend/tests/expenses/test_expense_service.py

1615-1615: Missing return type annotation for private function sync_mock_settlements_aggregate_cursor_factory

(ANN202)

1615-1615: Unused function argument: pipeline

(ARG001)

1615-1615: Missing type annotation for *args

(ANN002)

1615-1615: Unused function argument: args

(ARG001)

1615-1615: Missing type annotation for **kwargs

(ANN003)

1615-1615: Unused function argument: kwargs

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: preview

🔇 Additional comments (2)

backend/tests/expenses/test_expense_service.py (2)

1605-1643: LGTM! Test correctly reflects optimized aggregation logic.

The mock data structure has been appropriately updated to reflect the single aggregate call optimization. The comments clearly explain how the new approach consolidates all friends' balances into one result set, and the mock structure correctly represents the unified response format with _id (friend ID), totalBalance, and per-group groups array.

1734-1736: Excellent! Assertion correctly validates the optimization.

The updated assertion confirms that the settlement aggregation now uses a single database call instead of multiple calls, which is the core improvement described in the PR objectives. The comment clearly documents this optimization.

backend/tests/expenses/test_expense_service.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

…e balance retrieval

…vice

…riends' balance summary

…rage and logging

coderabbitai

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 171147e and 790ae8a.

📒 Files selected for processing (4)

backend/app/expenses/service.py (2 hunks)
backend/app/groups/service.py (1 hunks)
backend/tests/expenses/test_expense_service.py (5 hunks)
backend/tests/groups/test_enrich_members.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

backend/app/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

backend/app/**/*.py: Backend API routes should be added to the appropriate service router file in backend/app/ based on the feature area (auth, user, groups, expenses)
Use RESTful API design patterns with FastAPI in the backend
Use MongoDB for persistent data storage in the backend with a nonrelational schema design

Files:

backend/app/expenses/service.py
backend/app/groups/service.py

backend/tests/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Backend tests should be located in /backend/tests/ and run with pytest

Files:

backend/tests/expenses/test_expense_service.py
backend/tests/groups/test_enrich_members.py

🧬 Code graph analysis (2)

backend/app/groups/service.py (2)

backend/app/auth/service.py (1)

get_db (77-81)

backend/app/user/service.py (1)

get_db (13-14)

backend/tests/expenses/test_expense_service.py (1)

backend/app/expenses/service.py (1)

get_friends_balance_summary (955-1156)

🪛 Ruff (0.14.7)

backend/app/expenses/service.py

959-959: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

960-960: Docstring contains ambiguous × (MULTIPLICATION SIGN). Did you mean x (LATIN SMALL LETTER X)?

(RUF002)

1055-1055: Do not catch blind exception: Exception

(BLE001)

1086-1086: Do not catch blind exception: Exception

(BLE001)

backend/app/groups/service.py

70-70: Do not catch blind exception: Exception

(BLE001)

backend/tests/expenses/test_expense_service.py

1617-1617: Dynamically typed expressions (typing.Any) are disallowed in _pipeline

(ANN401)

1617-1617: Dynamically typed expressions (typing.Any) are disallowed in *_args

(ANN401)

1617-1617: Dynamically typed expressions (typing.Any) are disallowed in **_kwargs

(ANN401)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: preview

🔇 Additional comments (21)

backend/app/groups/service.py (4)

27-35: LGTM on the docstring and early return.

Good documentation with clear performance expectations (O(1) queries). The early return for empty members avoids unnecessary database calls.

40-52: LGTM on user ID extraction with proper error handling.

The code correctly handles invalid ObjectIds by logging warnings and collecting only valid IDs. Returning fallback members when no valid IDs exist is a reasonable defensive approach.

75-103: LGTM on the enrichment loop.

The batch lookup approach correctly enriches members from the pre-built map with appropriate fallbacks for missing users. The O(1) dictionary lookups per member provide good performance.

106-118: LGTM on the fallback member helper.

The helper correctly handles edge cases including short user IDs (length < 4) and missing userId fields. The conditional slicing user_id[-4:] if len(user_id) >= 4 else user_id is a nice defensive pattern.

backend/tests/expenses/test_expense_service.py (6)

1616-1647: LGTM on the mock factory function.

The function signature with underscore-prefixed unused parameters (_pipeline, _args, _kwargs) and return type annotation addresses the static analysis concerns from the previous review. The mock correctly returns aggregated results for all friends in a single response, matching the optimized implementation.

1737-1739: Good assertion verifying the batch optimization.

The test correctly validates that settlements.aggregate is called exactly once, confirming the optimization from N×M queries to a single aggregation pipeline.

2112-2145: Good test coverage for aggregation error handling.

This test validates that when the MongoDB aggregation fails, the service returns empty results with zeroed summary values rather than propagating the exception. This ensures resilience.

2147-2191: Good test for graceful degradation when user fetch fails.

The test correctly validates that balance data is still returned with "Unknown" user names when the batch user fetch fails. This matches the error handling in the service.

2193-2227: Good test for zero-balance filtering edge case.

This validates the aggregation pipeline's $match filter that excludes friends with near-zero balances (abs < 0.01).

2229-2276: Good test for negative balance calculation.

This test validates the totalYouOwe accumulation path when balances are negative, ensuring correct summary calculations for amounts the user owes.

backend/tests/groups/test_enrich_members.py (6)

17-71: LGTM on the success path test.

The test correctly validates batch enrichment by verifying:

All 3 members are enriched with correct user data

The query uses $in operator for batch fetching

Only one database call is made

72-82: LGTM on empty list test.

Good coverage of the early return path - verifies no database call is made for empty input.

84-108: LGTM on invalid ObjectId test.

Good coverage verifying logger warnings are called and fallback members are returned when all ObjectIds are invalid.

237-272: Excellent batch optimization verification test.

This test with 10 members explicitly validates the O(1) query optimization by asserting exactly one database call regardless of member count.

274-310: LGTM on fallback member helper tests.

Good coverage of edge cases: short user IDs, long user IDs (last 4 chars), and missing userId defaulting to "unknown".

199-235: Unable to verify: Test file and implementation not found in repository.

The test file backend/tests/groups/test_enrich_members.py and the _enrich_members_with_user_details implementation are not present in the repository. Without access to the actual code, the review comment's claims about implementation behavior and data loss cannot be verified.

backend/app/expenses/service.py (5)

967-981: LGTM on early exit for users with no groups.

Good defensive handling that returns a properly structured empty response when the user isn't in any groups.

993-1048: Well-designed aggregation pipeline.

The pipeline efficiently computes friend balances in a single query:

Matches settlements in user's groups involving the user

Groups by friend+group to compute per-group balances

Regroups by friend for total balance with breakdown

Filters out near-zero balances

The sign convention (positive = friend owes user, negative = user owes friend) is consistent throughout.

1079-1088: Good batch fetch with proper error handling.

The code correctly batch-fetches friend details and falls back to an empty map on failure, ensuring the method can still return partial results with "Unknown" user names.

1094-1096: Good defensive access pattern.

Using result.get("totalBalance", 0) addresses the previous review's KeyError concern and handles edge cases where the aggregation might not produce expected fields.

1145-1156: LGTM on the response structure.

The summary correctly aggregates totals and the response structure matches the documented return type. The rounding to 2 decimal places ensures clean financial values.

backend/app/expenses/service.py

backend/app/groups/service.py

backend/tests/groups/test_enrich_members.py

feat: optimize database queries for friends balance summary and enric…

83c4d1b

…h member data

Devasy self-assigned this Dec 7, 2025

Devasy requested review from aasthagaudani, dwijbavisi and vrajpatelll December 7, 2025 09:02

feat: optimize settlement aggregation for friends' balance summary

171147e

coderabbitai bot reviewed Dec 7, 2025

View reviewed changes

backend/tests/expenses/test_expense_service.py Outdated Show resolved Hide resolved

Devasy and others added 6 commits December 7, 2025 14:39

Update backend/app/expenses/service.py

57d2253

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update backend/app/groups/service.py

9cdc317

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

feat: enhance expense service with timezone-aware datetime and improv…

da9383e

…e balance retrieval

feat: add tests for optimized member enrichment function in group ser…

338722a

…vice

feat: add tests for handling zero and negative balance scenarios in f…

c16925e

…riends' balance summary

feat: enhance tests for member enrichment function with improved cove…

790ae8a

…rage and logging

coderabbitai bot reviewed Dec 7, 2025

View reviewed changes

Devasy merged commit a035ec0 into main Dec 7, 2025
38 checks passed

Devasy deleted the feat/optimise-db-queries branch December 7, 2025 09:39

Optimize database queries for friends balance summary and enrich member data #211

Optimize database queries for friends balance summary and enrich member data #211

Uh oh!

Conversation

Devasy commented Dec 7, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

netlify bot commented Dec 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for split-but-wiser canceled.

Uh oh!

coderabbitai bot commented Dec 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Dec 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Devasy commented Dec 7, 2025 •

edited by coderabbitai bot

Loading

netlify bot commented Dec 7, 2025 •

edited

Loading

coderabbitai bot commented Dec 7, 2025 •

edited

Loading

codecov bot commented Dec 7, 2025 •

edited

Loading