Describe the bug
We have two problems when applying constraints (primary key + foreign keys) to our models with the dbt-databricks adapter.
1. Multi-Column Primary Key
When we define a multi-column primary key constraint as described in the dbt documentation, the compiled SQL does not contain a statement that creates the primary key. Only single-column primary keys are included in the SQL.
Example SQL (based on the YAML definition below) where we applied a multi-column PK and a foreign key:
```sql
/* {"app": "dbt", "dbt_version": "1.11.6", "dbt_databricks_version": "1.11.6", "databricks_sql_connector_version": "4.1.3", "profile_name": "project", "target_name": "dev-local", "node_id": "model.project.table_one"} */
create or replace table `catalog`.`schema`.`table_one` (
  `hashkey` binary NOT NULL COMMENT 'Hashkey',
  `record_source` string COMMENT 'Record source',
  `load_timestamp` timestamp NOT NULL COMMENT 'Load-Timestamp',
  FOREIGN KEY (hashkey) REFERENCES `catalog`.`schema`.`table_two` (hashkey)
)
using delta
CLUSTER BY AUTO
comment 'Table One'
```
As you can see, the table is created with only the FK, and no follow-up ALTER TABLE statement for the PK is executed according to the query history.
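For comparison, this is roughly the follow-up statement we would expect the adapter to emit for the multi-column PK (the constraint name `table_one_pk` is illustrative; the name the adapter would actually generate is an assumption on our part):

```sql
-- Hypothetical expected follow-up after CREATE TABLE:
ALTER TABLE `catalog`.`schema`.`table_one`
  ADD CONSTRAINT table_one_pk PRIMARY KEY (hashkey, load_timestamp);
```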
2. Primary keys and foreign keys are removed when re-running the model
When we re-run models that already have constraints (PK, FK), all models lose their constraints after a successful run due to the following query:
```sql
ALTER TABLE `catalog`.`schema`.`table_one`
  DROP CONSTRAINT IF EXISTS table_one_fk
```
No follow-up query is executed to re-apply the constraints.
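After the DROP, we would expect statements along these lines to restore the constraints (the PK constraint name is hypothetical; `table_one_fk` matches the name that was dropped above):

```sql
-- Hypothetical expected re-apply after DROP CONSTRAINT:
ALTER TABLE `catalog`.`schema`.`table_one`
  ADD CONSTRAINT table_one_pk PRIMARY KEY (hashkey, load_timestamp);

ALTER TABLE `catalog`.`schema`.`table_one`
  ADD CONSTRAINT table_one_fk FOREIGN KEY (hashkey)
  REFERENCES `catalog`.`schema`.`table_two` (hashkey);
```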
Steps To Reproduce
- Our models are materialized as incremental models
- In our case we want to apply a multi-column PK to the columns hashkey + load_timestamp; at the same time, hashkey is also a foreign key to table_two
dbt_project.yml:
```yaml
...
flags:
  # set default adapter options for running this project
  use_materialization_v2: true
  warn_error_options:
    silence: [ConstraintNotEnforced]
...
models:
  +auto_liquid_cluster: true
  +persist_docs:
    relation: true
    columns: true
...
```
Create a model with the following YAML definition (anonymized):
- Precondition: table_two exists with a primary key.
```yaml
models:
  - name: table_one
    description: "Table one data"
    constraints:
      - type: primary_key
        columns: ["hashkey", "load_timestamp"]
        warn_unsupported: False
    columns:
      - name: hashkey
        description: "Hashkey of Table"
        constraints:
          - type: not_null
          - type: foreign_key
            to: ref('table_two')
            to_columns: [hashkey]
      - name: load_timestamp
        description: "Load-Timestamp"
        constraints:
          - type: not_null
      - name: record_source
        description: "Record Source"
```
System information
The output of `dbt --version`:

```
Core:
- installed: 1.11.6
- latest:    1.11.7 - Update available!

Plugins:
- databricks: 1.11.6 - Up to date!
```
The operating system you're using:
Windows
The output of `python --version`: Python 3.12.3