Course
data-engineering-zoomcamp
Question
Why does the PyFlink session window job fail with the error “please declare primary key for sink table when query contains update/delete record”?
Answer
Session window aggregations produce updates while the session is still open. The JDBC sink needs a primary key so it knows which row should be updated in the table. Without a primary key, Flink cannot apply the updates and the job fails. Define a primary key in the sink table using the window boundaries and the grouping key (for example window_start, window_end, and PULocationID).
Checklist
Course
data-engineering-zoomcamp
Question
Why does the PyFlink session window job fail with the error “please declare primary key for sink table when query contains update/delete record”?
Answer
Session window aggregations produce updates while the session is still open. The JDBC sink needs a primary key so it knows which row should be updated in the table. Without a primary key, Flink cannot apply the updates and the job fails. Define a primary key in the sink table using the window boundaries and the grouping key (for example window_start, window_end, and PULocationID).
Checklist