[FLINK-39567][paimon] Add blob support. #4391

Open
lvyanquan wants to merge 2 commits into apache:master from lvyanquan:FLINK-39567

Summary

This commit adds BLOB field support to the Flink CDC Paimon connector, enabling efficient storage and handling of large binary data during CDC synchronization operations.

Key Changes

  1. New BlobWriteContext Component
  • Introduced BlobWriteContext class to handle BLOB fields during CDC write operations
  • Supports two blob storage modes:
    • Mode 1 (raw data): VARBINARY/BINARY fields → BlobData → written to .blob files
    • Mode 2 (descriptor): VARCHAR/STRING fields → BlobRef → only descriptor (uri, offset, length) stored inline
  • Integrates with Paimon's CoreOptions for blob configuration
  2. Schema Evolution Support
  • Enhanced SchemaChangeProvider to automatically convert VARBINARY/BINARY/VARCHAR/STRING types to BLOB type based on blob-field configuration
  • Updated updateColumnType method to handle BLOB type conversion during schema changes
  • Added validation to prevent altering primary key or partition key columns to BLOB type
  3. Writer Integration
  • Modified PaimonWriterHelper to support blob field handling
  • Updated PaimonRecordEventSerializer for BLOB data serialization
  • Enhanced TableSchemaInfo to track blob field metadata
  4. Comprehensive Testing
  • Added PaimonMetadataApplierTest with 468 lines of test coverage
  • Added PaimonWriterHelperTest for blob write scenarios
  • Added AppendOnlyTableITCase integration tests with test fixtures
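
The mode selection and key-column validation described in the Key Changes can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `BlobModeSketch`, `Mode`, and `resolve` are hypothetical names, column types are passed as plain strings rather than Paimon or Flink CDC type objects, and the error handling is an assumption.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of blob mode selection; not the PR's BlobWriteContext. */
final class BlobModeSketch {

    /** How a column configured as a blob field would be written. */
    enum Mode {
        RAW_DATA,   // Mode 1: bytes written to .blob files
        DESCRIPTOR  // Mode 2: only (uri, offset, length) stored inline
    }

    private BlobModeSketch() {}

    /**
     * Picks a storage mode for a column that is configured as a blob field.
     * Rejects primary key and partition key columns, mirroring the validation
     * described in the PR summary.
     */
    static Mode resolve(
            String column,
            String sourceType,
            Set<String> primaryKeys,
            Set<String> partitionKeys) {
        if (primaryKeys.contains(column) || partitionKeys.contains(column)) {
            throw new IllegalArgumentException(
                    "Column '" + column + "' is a primary key or partition key "
                            + "and cannot be altered to BLOB");
        }

        // Normalize the type name and drop any length suffix such as "(1024)".
        String type = sourceType.toUpperCase(Locale.ROOT);
        int paren = type.indexOf('(');
        if (paren >= 0) {
            type = type.substring(0, paren);
        }
        type = type.trim();

        switch (type) {
            case "BINARY":
            case "VARBINARY":
                return Mode.RAW_DATA;
            case "VARCHAR":
            case "STRING":
                return Mode.DESCRIPTOR;
            default:
                throw new IllegalArgumentException(
                        "Type " + sourceType + " cannot be converted to BLOB");
        }
    }
}
```

Under these assumptions, a `VARBINARY(1024)` column named `content` would resolve to `RAW_DATA`, a `STRING` column would resolve to `DESCRIPTOR`, and a key column would be rejected before any type conversion is attempted.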

Configuration Example

Enable blob fields via table options:

blob-field = content, image_data
blob-descriptor-field = external_file_path
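
For illustration, the options above could be read into field-name lists along these lines. This is a sketch only: `BlobOptionsSketch` and `parseFieldList` are hypothetical helpers, and treating the value as a comma-separated list with surrounding whitespace ignored is an assumption drawn from the example, not confirmed behavior of the connector.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for reading blob-related table options. */
final class BlobOptionsSketch {

    private BlobOptionsSketch() {}

    /** Splits a comma-separated option value into trimmed, non-empty names. */
    static List<String> parseFieldList(String value) {
        List<String> fields = new ArrayList<>();
        if (value == null) {
            return fields;
        }
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                fields.add(name);
            }
        }
        return fields;
    }

    /** Builds the table options shown in the configuration example. */
    static Map<String, String> exampleOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("blob-field", "content, image_data");
        options.put("blob-descriptor-field", "external_file_path");
        return options;
    }
}
```

With the example values, `parseFieldList` would yield `content` and `image_data` for `blob-field`, and a single entry `external_file_path` for `blob-descriptor-field`.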

JIRA Reference

https://issues.apache.org/jira/browse/FLINK-39567
