perf(cll): skip Jinja compilation when compiled_code exists #1230
wcchang1115 merged 3 commits into main
When dbt compile has already been run, manifest nodes contain compiled_code with fully-rendered SQL. Use it directly for CLL instead of re-rendering Jinja through dbt's engine, avoiding the most expensive step per node. Build table_id_map from parent node aliases (sqlglot strips db/schema qualifiers). Fall back to Jinja rendering if two parents share the same alias to avoid incorrect column mappings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: even-wei <evenwei@infuseai.io>
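The alias-based mapping and the collision fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ParentNode` and `build_table_id_map` are hypothetical names, and returning `None` stands in for "fall back to Jinja rendering".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ParentNode:
    """Hypothetical stand-in for a dbt manifest node (not the real dbt class)."""
    unique_id: str
    name: str
    alias: Optional[str] = None


def build_table_id_map(parents: List[ParentNode]) -> Optional[Dict[str, str]]:
    """Map lower-cased alias -> unique_id, or return None on an alias collision."""
    table_id_map: Dict[str, str] = {}
    for parent in parents:
        # Use the alias when set, otherwise the node name.
        table_name = (parent.alias or parent.name).lower()
        if table_name in table_id_map:
            # Two parents share an alias, so the mapping would be ambiguous.
            # Returning None signals the caller to use the Jinja path instead.
            return None
        table_id_map[table_name] = parent.unique_id
    return table_id_map


parents = [
    ParentNode("model.shop.orders", "orders"),
    ParentNode("model.shop.customers", "customers", alias="Dim_Customers"),
]
mapping = build_table_id_map(parents)

colliding = [
    ParentNode("model.shop.orders", "orders"),
    ParentNode("model.other.orders", "orders"),
]
collision_result = build_table_id_map(colliding)
```

Keying on the bare alias is only safe because the parsed SQL reports unqualified table names; once two parents share an alias, the key no longer identifies a single node, which is why the sketch gives up rather than guessing.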
even-wei
left a comment
Self-Review
Correctness ✅
- sqlglot qualifier stripping verified: `cll()` always returns just the table name in `d.node` regardless of qualification level (`"db"."schema"."table"` → `table`). Confirmed via direct testing.
- `table_id_map` mapping is correct: Pre-compiled path maps `alias.lower() → unique_id`, which matches what sqlglot returns in `d.node`. Schema dict keys also use the alias, matching sqlglot's `qualify()` expectations.
- Alias collision fallback: When two parents share an alias, `table_id_map` is cleared and `pre_compiled` is set to `None`, correctly falling through to the Jinja path with globally unique table names.
- `lru_cache` compatibility: Cache key is `(node_id, base)`; `compiled_code` comes from the manifest, which is stable during a session. No stale cache risk.
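The `lru_cache` point can be illustrated with a small standalone sketch. `cached_lineage` is a hypothetical function, not the adapter's method; it only shows that a cache keyed on `(node_id, base)` computes each pair once, which is safe as long as the underlying manifest does not change during the session.

```python
from functools import lru_cache
from typing import List, Tuple

# Records each time the expensive path actually runs.
compute_calls: List[Tuple[str, bool]] = []


@lru_cache(maxsize=None)
def cached_lineage(node_id: str, base: bool) -> str:
    """Hypothetical cached lookup keyed on (node_id, base)."""
    compute_calls.append((node_id, base))
    return f"lineage:{node_id}:{'base' if base else 'current'}"


cached_lineage("model.shop.orders", False)  # computed
cached_lineage("model.shop.orders", False)  # served from the cache
cached_lineage("model.shop.orders", True)   # different key, computed again
```

Because the SQL text is not part of the key, a manifest that changed mid-session would return stale results; the review's argument is that this does not happen in practice.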
Edge Cases Covered ✅
| Scenario | Handling |
|---|---|
| No `compiled_code` (default) | Falls back to existing Jinja path; zero behavioral change |
| Custom alias on parent | Uses `alias` attr, tested in `test_cll_with_compiled_code_custom_alias` |
| Source parent | Uses `identifier` (or `name`), tested in `test_cll_with_compiled_code_source` |
| Parent not in nodes/sources | `continue`: silently skipped (matches existing behavior) |
| Alias collision (cross-project) | Detects and falls back to Jinja, tested in `test_cll_with_compiled_code_alias_collision_falls_back` |
| `compiled_code` present but SQL parsing fails | Caught by existing `except RecceException/Exception` → `_apply_all_columns(node, "unknown")` |
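The last row of the table, where a parse failure degrades to "unknown" rather than raising, might look like the sketch below. `lineage_with_fallback` is a hypothetical helper, and the assumption that every column is marked `"unknown"` is inferred from the name `_apply_all_columns`, not confirmed by the PR.

```python
from typing import Callable, Dict, List


def lineage_with_fallback(
    columns: List[str],
    parse: Callable[[], Dict[str, str]],
) -> Dict[str, str]:
    """Return parsed lineage, or mark every column "unknown" if parsing fails."""
    try:
        return parse()
    except Exception:
        # Assumed behaviour: a failed parse degrades gracefully instead of
        # aborting lineage for the whole project.
        return {column: "unknown" for column in columns}


def failing_parse() -> Dict[str, str]:
    raise ValueError("could not parse compiled SQL")


fallback_result = lineage_with_fallback(["id", "amount"], failing_parse)
```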
Potential Concerns
- Test helper `_set_compiled_code` patches `compiled=True` alongside `compiled_code` because dbt's `CompiledResource.__mashumaro_post_serialize__` strips `compiled_code` from `to_dict()` when `compiled=False`. This is only needed in tests; production manifests loaded from JSON already have `compiled=True`.
- Repeated alias resolution (3 occurrences of `getattr(parent_node, 'alias', None) or parent_node.name`): Considered extracting a helper but it would be a premature abstraction for the current scope.
Verdict
No critical or important findings. Implementation is correct and well-tested. Ready for human review.
Codecov Report: ❌ Patch coverage is
... and 5 files with indirect coverage changes
Extract _get_parent_table_name and _build_schema_from_aliases as testable methods. Add direct unit tests covering model alias, source identifier, unknown parent (returns None), catalog=None, and mixed parent types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: even-wei <evenwei@infuseai.io>
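A rough sketch of what the two extracted helpers might do, based only on the cases listed in this commit message (model alias, source identifier, unknown parent returning `None`, `catalog=None`). The signatures, the `Model` and `Source` stand-ins, the lower-casing of names, and the shape of `catalog_columns` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Model:
    """Hypothetical stand-in for a manifest model node."""
    unique_id: str
    name: str
    alias: Optional[str] = None


@dataclass
class Source:
    """Hypothetical stand-in for a manifest source node."""
    unique_id: str
    name: str
    identifier: Optional[str] = None


def get_parent_table_name(
    parent_id: str,
    nodes: Dict[str, Model],
    sources: Dict[str, Source],
) -> Optional[str]:
    """Resolve the table name a parent is referenced by in compiled SQL."""
    if parent_id in nodes:
        node = nodes[parent_id]
        return (node.alias or node.name).lower()
    if parent_id in sources:
        source = sources[parent_id]
        return (source.identifier or source.name).lower()
    return None  # unknown parent: the caller skips it


def build_schema_from_aliases(
    parent_ids: Iterable[str],
    nodes: Dict[str, Model],
    sources: Dict[str, Source],
    catalog_columns: Optional[Dict[str, Dict[str, str]]],
) -> Dict[str, Dict[str, str]]:
    """Build {table_name: {column: type}} keyed by alias; empty without a catalog."""
    schema: Dict[str, Dict[str, str]] = {}
    if catalog_columns is None:
        return schema
    for parent_id in parent_ids:
        table_name = get_parent_table_name(parent_id, nodes, sources)
        columns = catalog_columns.get(parent_id)
        if table_name is None or columns is None:
            continue
        schema[table_name] = dict(columns)
    return schema


nodes = {"model.shop.orders": Model("model.shop.orders", "orders", alias="fct_orders")}
sources = {"source.shop.raw.users": Source("source.shop.raw.users", "users", identifier="raw_users")}
catalog = {
    "model.shop.orders": {"id": "integer"},
    "source.shop.raw.users": {"user_id": "integer"},
}
parent_ids = ["model.shop.orders", "source.shop.raw.users", "model.shop.missing"]
schema = build_schema_from_aliases(parent_ids, nodes, sources, catalog)
```

Keeping name resolution in one function means the mapping and the schema cannot disagree about which name a parent goes by.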
wcchang1115
left a comment
Tested with a Snowflake dbt project, and the pre-compiled path worked correctly.
Thanks!
Summary
- Skip Jinja compilation when `compiled_code` (from `dbt compile`) exists
- Build `table_id_map` from parent node aliases instead of custom `ref`/`source` functions

Changes
- `recce/adapter/dbt_adapter/__init__.py`: In `get_cll_cached()`, check for `compiled_code` on the manifest node before calling `generate_sql()`. If present, use it directly and build `table_id_map` from parent aliases/identifiers. Detect alias collisions and fall back to Jinja path.
- `tests/adapter/dbt_adapter/test_dbt_cll.py`: 5 new tests: basic compiled_code, custom alias, source parent, multiple parents (join), alias collision fallback.

Context
`generate_sql()` calls into dbt's Jinja engine (macro resolution, context setup, rendering) for every node during CLL. When a user has already run `dbt compile`, the manifest contains fully-rendered SQL, so re-compiling is wasted work.
sqlglot's `qualify()` strips database/schema qualifiers, so `d.node` is always just the table name (= model alias). This means `table_id_map` can map `alias → unique_id` directly from the manifest.

Test plan
- `tests/util/test_cll.py` tests pass (pure SQL parsing, unchanged)
- `tests/adapter/dbt_adapter/test_dbt_cll.py` tests pass (Jinja path unchanged)
- `compiled_code` in manifest

Size
+269/-50 across 2 files
🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com