chore: store large numbers as text in sqlite #45
Conversation
Walkthrough
This PR introduces comprehensive SQLite support across the application by adding conditional database backend detection and branching logic. New utility functions and query models handle string-based balance representations for SQLite while maintaining numeric types for PostgreSQL. Schema migrations and data persistence operations now differentiate between backends. Token parsing recognizes additional discriminators.

Changes
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45–75 minutes
Areas requiring extra attention:
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 1
🧹 Nitpick comments (4)
src/migration/migrations/standard/m20220101_000001_init.rs (1)
15-24: Consider extracting the duplicated execute_sql helper.
This helper function is duplicated across multiple migration files (m20220101_000001_init.rs, m20240914_000005_init.rs, m20241008_000006_init.rs, m20250206_000007_init.rs, custom20252201_000001_init.rs). Consider extracting it to a shared module to reduce duplication.
// In a shared module like src/migration/utils.rs:
pub async fn execute_sql(manager: &SchemaManager<'_>, sql: &str) -> Result<(), DbErr> {
    manager
        .get_connection()
        .execute(Statement::from_string(
            manager.get_database_backend(),
            sql.to_string(),
        ))
        .await?;
    Ok(())
}

src/api/method/utils.rs (1)
38-44: Edge case: parse_balance_string may produce a confusing error message.
If the input string is "123.456", after splitting on '.', value becomes "123". If parsing fails, the error message shows "123" rather than the original "123.456", which could confuse debugging. Consider preserving the original value in the error message.

 pub fn parse_balance_string(value: &str) -> Result<u64, PhotonApiError> {
+    let original = value;
     // Handle potential decimal point from Decimal serialization
     let value = value.split('.').next().unwrap_or(value);
     value
         .parse::<u64>()
-        .map_err(|_| PhotonApiError::UnexpectedError(format!("Invalid balance: {}", value)))
+        .map_err(|_| PhotonApiError::UnexpectedError(format!("Invalid balance: {}", original)))
 }

src/common/typedefs/account/v2.rs (1)
44-57: Identical discriminator logic duplicated from v1.rs.
The parse_token_data method in both Account (v1.rs) and AccountV2 (v2.rs) contains identical discriminator detection logic. Consider extracting this to a shared helper function.

// In a shared location, e.g., src/common/typedefs/token_data.rs or a new utils module:
pub fn is_token_account(owner: &Pubkey, discriminator: u64) -> bool {
    let bytes = discriminator.to_le_bytes();
    let is_v1 = bytes == [2, 0, 0, 0, 0, 0, 0, 0];
    let is_v2 = bytes == [0, 0, 0, 0, 0, 0, 0, 3];
    let is_sha_flat = bytes == [0, 0, 0, 0, 0, 0, 0, 4];
    *owner == COMPRESSED_TOKEN_PROGRAM && (is_v1 || is_v2 || is_sha_flat)
}

// Then in both v1.rs and v2.rs:
pub fn parse_token_data(&self) -> Result<Option<TokenData>, IngesterError> {
    match self.data.as_ref() {
        Some(data) if is_token_account(&self.owner.0, data.discriminator.0) => {
            let token_data = TokenData::try_from_slice(data.data.0.as_slice())
                .map_err(|e| IngesterError::ParserError(format!("Failed to parse token data: {:?}", e)))?;
            Ok(Some(token_data))
        }
        _ => Ok(None),
    }
}

src/api/method/get_compressed_balance_by_owner.rs (1)
27-49: SQLite branching logic looks correct.
Both paths correctly handle the balance retrieval with appropriate model types and parsing functions. The logic is consistent with the pattern used across this PR.
Consider extracting the common query builder logic into a helper to reduce duplication, as the only differences are the model type and parse function. This would improve maintainability when query logic needs to change.
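One possible shape for that extraction, sketched with SeaORM's FromQueryResult; the function and type names here are assumptions for illustration, not the PR's actual API:

```rust
use sea_orm::{DatabaseConnection, DbErr, FromQueryResult, Statement};

// Run the shared statement once; the caller supplies the backend-specific
// model type M (Decimal-based or string-based) and its parse function.
async fn fetch_balances<M, F>(
    conn: &DatabaseConnection,
    stmt: Statement,
    parse: F,
) -> Result<Vec<u64>, DbErr>
where
    M: FromQueryResult,
    F: Fn(M) -> u64,
{
    let rows = M::find_by_statement(stmt).all(conn).await?;
    Ok(rows.into_iter().map(parse).collect())
}
```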
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
.gitignore (1 hunks)
Cargo.toml (1 hunks)
src/api/method/get_compressed_account_balance.rs (2 hunks)
src/api/method/get_compressed_balance_by_owner.rs (2 hunks)
src/api/method/get_compressed_mint_token_holders.rs (2 hunks)
src/api/method/get_compressed_token_account_balance.rs (2 hunks)
src/api/method/get_compressed_token_balances_by_owner.rs (2 hunks)
src/api/method/utils.rs (4 hunks)
src/common/typedefs/account/v1.rs (1 hunks)
src/common/typedefs/account/v2.rs (1 hunks)
src/ingester/persist/mod.rs (6 hunks)
src/migration/migrations/standard/m20220101_000001_init.rs (1 hunks)
src/migration/migrations/standard/m20250206_000007_init.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
src/api/method/get_compressed_balance_by_owner.rs (1)
src/api/method/utils.rs (3)
is_sqlite (46-48)
parse_balance_string (38-44)
parse_decimal (31-36)
src/api/method/get_compressed_mint_token_holders.rs (1)
src/api/method/utils.rs (4)
is_sqlite (46-48)
parse_balance_string (38-44)
parse_decimal (31-36)
filter (279-296)
src/api/method/get_compressed_token_account_balance.rs (1)
src/api/method/utils.rs (3)
is_sqlite (46-48)
parse_balance_string (38-44)
parse_decimal (31-36)
src/api/method/get_compressed_account_balance.rs (1)
src/api/method/utils.rs (3)
is_sqlite (46-48)
parse_balance_string (38-44)
parse_decimal (31-36)
src/migration/migrations/standard/m20220101_000001_init.rs (4)
src/migration/migrations/standard/m20250206_000007_init.rs (1)
execute_sql (10-19)
src/migration/migrations/custom/custom20252201_000001_init.rs (1)
execute_sql (12-21)
src/migration/migrations/standard/m20240914_000005_init.rs (1)
execute_sql (9-18)
src/migration/migrations/standard/m20241008_000006_init.rs (1)
execute_sql (9-18)
src/api/method/utils.rs (2)
src/common/typedefs/unsigned_integer.rs (1)
value (55-56)
tests/integration_tests/utils.rs (2)
format (260-261)
conn (282-291)
src/api/method/get_compressed_token_balances_by_owner.rs (3)
src/api/method/utils.rs (4)
is_sqlite (46-48)
parse_balance_string (38-44)
parse_decimal (31-36)
filter (279-296)
tests/integration_tests/utils.rs (1)
conn (282-291)
src/api/method/get_compressed_mint_token_holders.rs (4)
token_owner_balances (57-57)
token_owner_balances (78-78)
token_owner_balances (85-103)
token_owner_balances (105-123)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run tests
🔇 Additional comments (17)
Cargo.toml (1)
10-10: Version bump looks appropriate for the SQLite precision fix.
The patch version increment aligns with the nature of this change: a fix/improvement for the SQLite backend without breaking API changes.
.gitignore (1)
16-18: Sensible gitignore additions.
Ignoring .cursor (editor config) and **/photon.log (log files) keeps development artifacts out of version control.

src/migration/migrations/standard/m20250206_000007_init.rs (2)
42-43: Good: Using TEXT for large integers preserves precision in SQLite.
SQLite's REAL type loses precision beyond 2^53 (JavaScript's safe integer limit). Storing as TEXT and parsing in application code is the correct approach for u64 values like lamports.
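A standalone snippet (not from the PR) showing why a REAL round-trip loses precision past 2^53 while TEXT is lossless:

```rust
fn main() {
    // 2^53 + 1 is the first integer an f64 (SQLite REAL) cannot represent.
    let lamports: u64 = 9_007_199_254_740_993;

    // Round-tripping through f64 silently drops the low bit.
    let via_real = lamports as f64 as u64;
    assert_eq!(via_real, 9_007_199_254_740_992);

    // Round-tripping through TEXT preserves the exact value.
    let via_text: u64 = lamports.to_string().parse().unwrap();
    assert_eq!(via_text, lamports);
}
```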
244-245: Down migration correctly mirrors the TEXT column types.
The rollback schema maintains consistency with the up migration's TEXT storage for lamports and discriminator.
src/migration/migrations/standard/m20220101_000001_init.rs (1)
278-303: Comprehensive SQLite precision handling for all large integer columns.
All columns that could exceed SQLite's safe integer limit (2^53) are correctly changed to TEXT: lamports, discriminator, and amount across the accounts, token_accounts, owner_balances, and token_owner_balances tables.
src/api/method/utils.rs (2)
46-48: Simple and clear backend detection.
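A plausible shape for this helper with SeaORM (a sketch; the actual signature in the PR may differ):

```rust
use sea_orm::{ConnectionTrait, DatabaseBackend, DatabaseConnection};

// Branching code asks the connection which backend it is talking to.
pub fn is_sqlite(conn: &DatabaseConnection) -> bool {
    conn.get_database_backend() == DatabaseBackend::Sqlite
}
```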
335-368: New string-based models for SQLite are well-structured.
The *ModelString variants mirror the Decimal-based models and provide clear separation for backend-specific query handling.

src/ingester/persist/mod.rs (5)
316-339: Helper functions for SQLite SQL formatting are well-implemented.
Clean separation of concerns with dedicated functions for nullable values and boolean conversion. The hex formatting for bytes matches SQLite's blob literal syntax.
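Illustrative sketches of such formatters; the names and signatures are assumptions, not the PR's code:

```rust
// NULL for absent values, otherwise a quoted TEXT literal.
fn format_nullable(value: Option<&str>) -> String {
    match value {
        Some(v) => format!("'{}'", v),
        None => "NULL".to_string(),
    }
}

// SQLite has no BOOLEAN type; the 0/1 integer convention is used instead.
fn format_bool(value: bool) -> String {
    (if value { "1" } else { "0" }).to_string()
}

// SQLite blob literal syntax: X'ABCD...'.
fn format_bytes(bytes: &[u8]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02X}", b)).collect();
    format!("X'{}'", hex)
}
```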
373-381: Good: Backend-aware balance parsing from query results.
SQLite returns TEXT that needs string parsing, while PostgreSQL returns Decimal directly. The fallback to Decimal::ZERO on parse failure is reasonable for robustness.
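A minimal sketch of that branching, assuming rust_decimal's Decimal; the row-access arguments are illustrative:

```rust
use rust_decimal::Decimal;
use std::str::FromStr;

fn parse_balance(is_sqlite: bool, text: Option<&str>, decimal: Option<Decimal>) -> Decimal {
    if is_sqlite {
        // SQLite stores the balance as TEXT; unparseable values fall back to zero.
        text.and_then(|s| Decimal::from_str(s).ok())
            .unwrap_or(Decimal::ZERO)
    } else {
        // PostgreSQL returns a Decimal directly.
        decimal.unwrap_or(Decimal::ZERO)
    }
}
```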
417-430: SQLite arithmetic on TEXT columns uses correct CAST pattern.
The double CAST (CAST(CAST(... AS INTEGER) + ... AS TEXT)) correctly handles arithmetic on TEXT-stored numbers and preserves the TEXT storage format.
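A sketch of what such an update can look like; the table and column names are assumptions for illustration:

```rust
// Inner CAST converts the stored TEXT to INTEGER for arithmetic; the outer
// CAST converts the sum back to TEXT so the storage format is preserved.
fn increment_balance_sql(owner_literal: &str, delta: i64) -> String {
    format!(
        "UPDATE owner_balances \
         SET lamports = CAST(CAST(lamports AS INTEGER) + {} AS TEXT) \
         WHERE owner = {}",
        delta, owner_literal
    )
}
```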
481-555: SQLite raw SQL path correctly handles all account fields.
The raw SQL construction for SQLite properly quotes TEXT values (lamports, discriminator) and uses appropriate formatters for all field types. The ON CONFLICT DO NOTHING matches the SeaORM equivalent.
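Illustrative only, echoing the quoting pattern shown in the diff below; the column set here is abbreviated and hypothetical:

```rust
fn insert_account_sql(hash_literal: &str, lamports: u64, discriminator: Option<u64>) -> String {
    // Large numeric columns are quoted so SQLite stores them as TEXT.
    let lamports = format!("'{}'", lamports);
    let discriminator = match discriminator {
        Some(d) => format!("'{}'", d),
        None => "NULL".to_string(),
    };
    format!(
        "INSERT INTO accounts (hash, lamports, discriminator) \
         VALUES ({}, {}, {}) ON CONFLICT DO NOTHING",
        hash_literal, lamports, discriminator
    )
}
```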
579-628: Token accounts SQLite path mirrors the accounts pattern consistently.
The same TEXT quoting pattern for amount and proper formatting for all other fields ensures consistency between the two persistence paths.

src/common/typedefs/account/v1.rs (1)
36-49: Discriminator patterns are internally consistent; external specification verification cannot be completed.
The code in src/common/typedefs/account/v1.rs correctly encodes token discriminators:

- v1_token: [2, 0, 0, 0, 0, 0, 0, 0] = little-endian u64 value 2
- v2_token: [0, 0, 0, 0, 0, 0, 0, 3] = little-endian u64 value 3 << 56
- sha_flat_token: [0, 0, 0, 0, 0, 0, 0, 4] = little-endian u64 value 4 << 56

These checks are:

- Identical in both v1.rs and v2.rs (consistent duplication)
- Properly using to_le_bytes() to convert UnsignedInteger (u64 wrapper) to 8-byte little-endian arrays
- No contradictions found in the codebase
However, the original review's request to verify these against Light Protocol's official token format specifications cannot be completed—no accessible external documentation confirms these are the canonical discriminator values. The codebase provides no comments explaining why these specific values represent each token variant.
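The byte-order claims above can be checked directly; this standalone snippet verifies the little-endian encodings:

```rust
fn main() {
    // v1_token: the u64 value 2 in little-endian byte order.
    assert_eq!(2u64.to_le_bytes(), [2, 0, 0, 0, 0, 0, 0, 0]);
    // v2_token: 3 shifted into the most significant byte.
    assert_eq!((3u64 << 56).to_le_bytes(), [0, 0, 0, 0, 0, 0, 0, 3]);
    // sha_flat_token: 4 shifted into the most significant byte.
    assert_eq!((4u64 << 56).to_le_bytes(), [0, 0, 0, 0, 0, 0, 0, 4]);
}
```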
src/api/method/get_compressed_token_balances_by_owner.rs (1)
85-123: LGTM!
The SQLite-aware branching correctly uses the appropriate model types and parsing functions. Both paths maintain identical query semantics (filtering, ordering, limiting) while adapting to the database-specific balance representation.
src/api/method/get_compressed_account_balance.rs (1)
19-41: LGTM!
The SQLite-aware branching correctly handles single account balance retrieval. The .map().transpose()?.unwrap_or(0) pattern is idiomatic for handling Option<Result<T, E>> and correctly defaults to 0 when no account is found.
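A standalone illustration of that idiom, with a stand-in parser:

```rust
fn parse_balance_string(s: &str) -> Result<u64, String> {
    s.parse::<u64>().map_err(|_| format!("Invalid balance: {}", s))
}

fn balance_or_zero(row: Option<&str>) -> Result<u64, String> {
    // Option<&str> -> Option<Result<u64, _>> -> Result<Option<u64>, _> -> u64
    Ok(row.map(parse_balance_string).transpose()?.unwrap_or(0))
}

fn main() {
    assert_eq!(balance_or_zero(Some("42")), Ok(42)); // account found
    assert_eq!(balance_or_zero(None), Ok(0));        // no account defaults to 0
}
```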
src/api/method/get_compressed_mint_token_holders.rs (1)
84-124: LGTM!
The SQLite-aware branching correctly implements the token holders query with proper ordering (descending by amount, then owner) preserved in both paths. The model types and parsing functions are appropriately selected based on the database backend.
src/api/method/get_compressed_token_account_balance.rs (1)
37-59: LGTM!
The SQLite-aware branching correctly handles single token account balance retrieval, consistent with the pattern used in get_compressed_account_balance.rs. The .map().transpose()?.unwrap_or(0) handling is appropriate.
let lamports = format!("'{}'", account.account.lamports.0);
let discriminator = match account.account.data.as_ref() {
    Some(d) => format!("'{}'", d.discriminator.0),
    None => "NULL".to_string(),
};
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for discriminator handling in write path (persist/mod.rs)
echo "=== Discriminator in persist/mod.rs ==="
rg -n "discriminator" src/ingester/persist/mod.rs -A3 -B3
echo ""
echo "=== Discriminator in api/method/ directory ==="
rg -n "discriminator" src/api/method/ -A3 -B3Repository: Lightprotocol/photon
Length of output: 4330
🏁 Script executed:
#!/bin/bash
# Check the full context around the conditional logic (lines 485-530)
echo "=== Full context of insert logic ==="
sed -n '485,535p' src/ingester/persist/mod.rs
echo ""
echo "=== Search for database schema definition ==="
find . -name "*.sql" -o -name "*migration*" -o -name "*schema*" | head -20
echo ""
echo "=== Search for discriminator column definition ==="
rg -n "discriminator" --type sqlRepository: Lightprotocol/photon
Length of output: 4346
🏁 Script executed:
#!/bin/bash
# List migration files
echo "=== Migration files ==="
ls -la src/migration/migrations/
echo ""
echo "=== Check for discriminator schema definition ==="
cat src/migration/migrations/*.sql 2>/dev/null | grep -A5 -B5 "discriminator" || echo "No SQL files found"
echo ""
echo "=== Search for balance-related logic ==="
rg -n "balance" src/ingester/persist/mod.rs -A3 -B3
echo ""
echo "=== Search for discriminator usage in balance or calculations ==="
rg -n "discriminator" src/ -A2 -B2 | grep -A5 -B5 "balance\|where\|WHERE\|select\|SELECT" || echo "No balance-related discriminator usage found"Repository: Lightprotocol/photon
Length of output: 5377
🏁 Script executed:
#!/bin/bash
# Search for where discriminator values are parsed/read after database retrieval
echo "=== How discriminator is extracted from query results ==="
rg -n "try_get.*discriminator|discriminator.*try_get" src/ -A2 -B2
echo ""
echo "=== Check api/method/utils.rs for discriminator handling ==="
sed -n '120,140p' src/api/method/utils.rs
echo ""
echo "=== Search for discriminator comparisons or operations ==="
rg -n "discriminator" src/api/ -A3 -B3 | head -80
echo ""
echo "=== Check the complete migration file for schema details ==="
ls -la src/migration/migrations/standard/

Repository: Lightprotocol/photon
Length of output: 4175
🏁 Script executed:
#!/bin/bash
# Search for discriminator usage in proof generation and API responses
echo "=== Discriminator usage in responses ==="
rg -n "discriminator" src/api/ -B3 -A3 | grep -v "^--$" | head -60
echo ""
echo "=== Check how query results are mapped to response struct ==="
rg -n "CompressedAccountData\|TokenAccountData" src/api/method/ -A10 | head -80
echo ""
echo "=== Look for any string trimming or quote stripping logic ==="
rg -n "trim|strip|replace.*['\"]" src/ | grep -i discrimin
echo ""
echo "=== Verify actual discriminator storage in SQLite path ==="
grep -n "discriminator.0" src/ingester/persist/mod.rsRepository: Lightprotocol/photon
Length of output: 2389
🏁 Script executed:
#!/bin/bash
# Search for query result mapping and deserialization
echo "=== Search for map_row or result deserialization ==="
rg -n "map_row|from_row|TryFrom.*Row" src/api/method/ -A5
echo ""
echo "=== Look for CompressedAccountData struct definition and construction ==="
rg -n "impl.*CompressedAccountData\|CompressedAccountData {" src/ -A15 | head -100
echo ""
echo "=== Check if there's a custom from_row or deserialization ==="
rg -n "fn.*row.*discriminator|discriminator.*from" src/ -B5 -A5
echo ""
echo "=== Search in utils.rs for how data is constructed from query ==="
sed -n '140,200p' src/api/method/utils.rs

Repository: Lightprotocol/photon
Length of output: 4183
🏁 Script executed:
#!/bin/bash
# Find accounts entity/model
echo "=== Find accounts model definition ==="
find . -name "*accounts*" -path "*/entities/*" -o -name "*accounts*" -path "*/model*" | head -10
echo ""
echo "=== Search for TryInto for accounts ==="
rg -n "impl.*TryInto.*accounts\|impl.*TryFrom.*accounts" src/ -A20 | head -100
echo ""
echo "=== Look for CompressedAccountData in entire codebase ==="
rg -n "CompressedAccountData" src/ -B2 -A5
echo ""
echo "=== Check the accounts table schema in migration ==="
grep -A30 "CREATE TABLE.*accounts" src/migration/migrations/standard/m20250206_000007_init.rsRepository: Lightprotocol/photon
Length of output: 5295
Critical data type mismatch in discriminator storage: [u8; 8] array truncated to single byte or numeric value
The discriminator field is defined as [u8; 8] (8-byte fixed array) in CompressedAccountData, but both database paths store it incorrectly:
- SQLite path (line 500): format!("'{}'", d.discriminator.0) stores only the first byte as a quoted TEXT string (e.g., '123'), losing the remaining 7 bytes
- PostgreSQL path (line 528): Decimal::from(x.discriminator.0) converts the first byte to a numeric type, incompatible with the byte array
When retrieved, SQLite returns the quoted ASCII representation (e.g., '123' as bytes) instead of the original 8-byte discriminator, causing proof generation and validation failures. Both implementations need to store the complete 8-byte discriminator array consistently.
🤖 Prompt for AI Agents
In src/ingester/persist/mod.rs around lines 498–502, the discriminator
(CompressedAccountData.discriminator: [u8;8]) is being stored incorrectly (only
the first byte as a quoted string for SQLite and converted to a numeric for
Postgres); change both DB paths to persist the entire 8-byte array consistently
(not via format!("'{}'", ...) or Decimal::from(...)). For SQLite write the raw 8
bytes as a BLOB (or a hex/base64 string consistently encoded) and bind the byte
slice directly in the insert/update statement; for Postgres use BYTEA (or the
same hex/base64 string) and bind the full 8-byte slice instead of converting to
Decimal. Update the SQL schema/columns if needed, use parameterized binding (no
string interpolation), and ensure read path decodes the stored bytes back to
[u8;8] the same way.
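A hedged sketch of the parameterized-binding approach the prompt describes, assuming the discriminator really is an 8-byte array; the table and column names are hypothetical:

```rust
use sea_orm::{ConnectionTrait, DatabaseConnection, DbBackend, DbErr, Statement};

// Bind the full 8-byte discriminator as a blob; no string interpolation.
async fn store_discriminator(
    conn: &DatabaseConnection,
    hash: Vec<u8>,
    discriminator: [u8; 8],
) -> Result<(), DbErr> {
    let stmt = Statement::from_sql_and_values(
        DbBackend::Sqlite,
        "UPDATE accounts SET discriminator = ? WHERE hash = ?",
        [discriminator.to_vec().into(), hash.into()],
    );
    conn.execute(stmt).await?;
    Ok(())
}

// Read path: decode the stored blob back into the fixed-size array.
fn discriminator_from_blob(bytes: &[u8]) -> Option<[u8; 8]> {
    bytes.try_into().ok()
}
```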
ananas-block left a comment
We should add integration tests.
Overview
Testing
Summary by CodeRabbit
New Features
Improvements
Bug Fixes