-
Notifications
You must be signed in to change notification settings - Fork 132
fix: set explicit incremental_strategy='append' for test_result_rows.sql #918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
dbad7cd
fix: remove incremental_strategy from test_result_rows.sql
devin-ai-integration[bot] 5820bcb
ci: update Python version to 3.10 for tests
devin-ai-integration[bot] 403067b
fix: explicitly set incremental_strategy='append' for test_result_row…
devin-ai-integration[bot] 0530b56
fix: add get_append_only_incremental_strategy macro for append-only t…
devin-ai-integration[bot] 9cd962a
fix: update get_append_only_incremental_strategy to return append for…
devin-ai-integration[bot] File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
Repository: elementary-data/dbt-data-reliability
Length of output: 1724
🏁 Script executed:
# Also check the context of incremental_strategy.sql to understand the broader pattern cat -n macros/utils/cross_db_utils/incremental_strategy.sqlRepository: elementary-data/dbt-data-reliability
Length of output: 1203
🏁 Script executed:
Repository: elementary-data/dbt-data-reliability
Length of output: 3335
🏁 Script executed:
# Check test_result_rows.sql for unique_key configuration cat -n models/edr/run_results/test_result_rows.sqlRepository: elementary-data/dbt-data-reliability
Length of output: 1177
🏁 Script executed:
Repository: elementary-data/dbt-data-reliability
Length of output: 144
🌐 Web query:
dbt incremental_strategy none default behavior merge fallback💡 Result:
incremental_strategy(i.e., it’s effectivelyNone), dbt calls the adapter’s default incremental SQL macro (get_incremental_default_sql). The actual behavior is adapter-specific (it is not universallymerge). [1], [2]merge(for example: Snowflake and BigQuery). [3], [4]unique_keyrules:merge/delete+insert/microbatchbut omitunique_key, dbt generally falls back toappend. [5]mergeon BigQuery requiresunique_key; leaving it out can be ambiguous or fail rather than cleanly falling back. [5]So:
incremental_strategy: None⇒ “use the adapter default,” which may bemergeon some platforms, but not all; and separately, missingunique_keycan force/trigger anappendfallback (with BigQuery caveats). [3], [4], [5]Sources:
[1] dbt-labs/docs.getdbt.com#1761
[2] https://docs.getdbt.com/docs/build/incremental-strategy
[3] https://docs.getdbt.com/reference/resource-configs/snowflake-configs
[4] https://docs.getdbt.com/reference/resource-configs/bigquery-configs
[5] https://docs.getdbt.com/reference/resource-configs/unique_key
🏁 Script executed:
Repository: elementary-data/dbt-data-reliability
Length of output: 1177
🌐 Web query:
dbt incremental_strategy append vs merge difference unique_key requirement💡 Result:
appendvsmergein dbt incremental modelsappendINSERTof the incremental query results into the target table; it does not update or delete existing rows.unique_key: Not required (and not used for upserts). If you don’t setunique_key, dbt/your adapter effectively uses append behavior by default. [1]mergeunique_key(SCD Type 1–like overwrite semantics). [2]unique_key,mergehelps prevent duplicates by updating matched keys instead of inserting again. [2]unique_key: Required for true merge/upsert behavior (it’s the match key). If you configuremergewithoutunique_key, it behaves likeappend(adapter fallback). [1], [2]incremental_strategy='merge', you must provideunique_key(otherwise behavior is ambiguous or can fail). [1]Rule of thumb
appendwhen rows are immutable (event/log-style) and duplicates are acceptable/handled downstream. [2]mergewhen you need the target table to reflect the latest version per key (upserts). [2]Sources: dbt docs on
unique_keyand incremental strategies. [1], [2]Default macro returns
none, causing non-Athena adapters to rely on adapter-specific defaults without explicit intent.test_result_rows.sql(the only consumer of this macro) lacks aunique_keyand is designed for append-only behavior. Returningnonedefers to adapter defaults—which vary (Snowflake and BigQuery typically default tomerge)—rather than explicitly stating the append-only strategy. This creates implicit coupling to adapter behavior and ambiguity about incremental semantics.Return
"append"explicitly to match the macro's name and intent across all adapters:🔧 Suggested fix
{% macro default__get_append_only_incremental_strategy() %} - {% do return(none) %} + {% do return("append") %} {% endmacro %}📝 Committable suggestion
🤖 Prompt for AI Agents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually fine with this, as in practice we update this table via post-hook / on-run-end hook, and the model query itself is always empty.
I just need it not to fail - which is why I have the setting for Athena.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! The macro approach ensures Athena gets the explicit
appendstrategy it needs while other adapters use their defaults. Ready to merge whenever you'd like - just note that the CI will continue to fail until merged due to the Python 3.9 → 3.10 chicken-and-egg issue with the reusable workflow.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually I have doubts, maybe I'll do as you say
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@haritamarThanks for reconsidering! I think returning"append"explicitly is the safer choice here, even though the model query is currently empty. Here's why:Benefits of explicit
"append":get_append_only_incremental_strategy, so returning"append"matches its documented intentmerge, others may differ)Current risk with
none:mergewhen strategy is unspecifiedappendwhenunique_keyis missing, this fallback behavior isn't guaranteed across all adapter versions and can be ambiguousThe suggested fix in my previous comment explicitly returns
"append"for the default implementation, which aligns the behavior across all adapters with the macro's intended purpose. Let me know if you'd like to proceed with that change!🧠 Learnings used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the macro, but:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Updated the macro:
'append'for all adapters'merge'(since BigQuery doesn't support append)Pushed commit 9cd962a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
💡 Result:
Yes—BigQuery supports dbt’s
appendincremental strategy.appendis the default incremental behavior when you don’t specify anincremental_strategy(and typically when you don’t setunique_key). dbt will simplyINSERT INTOthe target table. This does not update/delete existing rows and can create duplicates. [1], [2]{{ config(materialized='incremental', incremental_strategy='append') }} select ... {% if is_incremental() %} where ... {% endif %}merge(requiresunique_key), or partition-replacement strategies likeinsert_overwrite/microbatch(partition-based,unique_keynot required). [2], [3]Sources:
[1] dbt docs: Incremental strategies (
append) https://docs.getdbt.com/docs/build/incremental-strategy[2] dbt docs:
unique_keydefaulting behavior and relationship toappendhttps://docs.getdbt.com/reference/resource-configs/unique_key[3] dbt-bigquery incremental strategy behavior overview (adapter implementation summary) https://deepwiki.com/dbt-labs/dbt-bigquery/3.2-incremental-materialization-strategies
@haritamar, I checked the dbt-bigquery documentation and found that BigQuery does support theappendincremental strategy—it's actually the default behavior when no strategy is specified.However, I want to clarify your requirements before implementing:
Potential concern: Setting
incremental_strategy='merge'for BigQuery typically requires aunique_key, whichtest_result_rows.sqldoesn't have. Without aunique_key, the behavior may fall back toappendanyway, or could cause unexpected behavior.Questions:
appendon BigQuery for this model?If you still want to proceed with
mergefor BigQuery (perhaps for consistency with other BigQuery models or to match your conventions), I'm happy to implement it. Just want to make sure we're solving the right problem!Would you like me to:
"append"for default,"merge"for BigQuery"append"uniformly across all adapters (including BigQuery)🧠 Learnings used