feat: cherry-pick upstream provider improvements #4
Open
BorisTyshkevich wants to merge 36 commits into altinity
(cherry picked from commit 8c1c78cd60d2d6c50a8c04d523260eaf30a15161)
…e execution semantics
This commit consolidates the CH-only cleanup stream into a coherent test-system migration and stability pass. It normalizes suite topology around the new layered model, restores matrix tooling, and hardens local test execution so failures reflect real regressions instead of runner/logging artifacts.
What changed
- Rebuilt test layout around authoritative layers:
  - core: tests/e2e-core/{pg2ch,mysql2ch,mongo2ch}
  - optional: tests/e2e-optional/{kafka2ch,eventhub2ch,kinesis2ch,airbyte2ch,oracle2ch,ch2ch}
  - supporting layers: tests/evolution, tests/resume, tests/large
- Restored and aligned matrix/test orchestration assets:
  - tests/e2e-core/matrix/{cdc_local_suite.yaml,cdc_optional_suite.yaml,core2ch.yaml,sources.yaml,README.md}
  - Makefile targets for wave-based execution, optional gates, cache controls, and strict rerun behavior.
- Reintroduced/normalized suite content under e2e-core/e2e-optional and supporting layers (including ch2ch + stdout/dev scope constraints already discussed in branch direction).
- Refreshed helper surface used by the new system:
  - tests/helpers/coordinator_backend.go and related helper updates
  - compare storage and metering helper additions
- Canon and runner stability fixes:
  - removed noisy raw JSON stdout from canon validator sink close path
  - adjusted test worker logger usage to avoid forced debug churn in execution paths
  - updated logger behavior to respect explicit LOG_LEVEL in gotest context
  - Makefile now supports GOTESTSUM_FORMAT with sensible local/CI behavior
  - test targets export LOG_LEVEL/YT_LOG_LEVEL for deterministic output
- Updated docs to reflect the new test model and commands:
  - tests/README.md
  - layer-specific README files across core/optional/evolution/resume/large
Why
- The previous state mixed legacy and new orchestration, causing duplicated intent, brittle execution order, and hard-to-diagnose failures.
- Canon and gotestsum output parsing produced synthetic failures in noisy runs despite successful package exits.
- The new structure makes retained provider scope explicit, keeps optional lanes separate, and improves maintainability and CI signal quality.
Validation performed
- make test-cdc-full FORCE=1 RERUN_FAILS=0 -> PASS
- make test-cdc-optional FORCE=1 RERUN_FAILS=0 -> PASS
- targeted canon repro/verification for postgres after runner noise fix -> PASS
- govulncheck run (post-fix environment check):
  - reachable code vulnerabilities: 0
  - module-level-only findings remained in aws-sdk-go and are not called from current code paths
Notes
- This commit intentionally captures the current branch state to preserve momentum in the cleanup stream.
- vendor_patched remains untouched as requested.
- Optional smoke suites that are intentionally blocked (eventhub/airbyte/oracle local smoke wiring) remain skipped by design and documented.
Hardened CI/local parity for the post-cleanup matrix by addressing ClickHouse 25.12 behavior changes and auth propagation gaps that surfaced under full forced runs (core + optional waves).
Key fixes included in this commit:
- clickhouse error classification: recognize cluster-not-found code 701 in distributed DDL fallback checks, with dedicated unit coverage.
- clickhouse recipe env propagation: forward prefixed RECIPE_CLICKHOUSE_PASSWORD to avoid empty-password auth failures in prefixed test recipes.
- e2e credential wiring: set ClickHouse password in manual ChSource constructions used by mongo2ch snapshot_flatten and kinesis2ch replication checks.
- pg2ch replication assertions: adapt replication/replication_ts checks to ClickHouse 25.12 semantics (FINAL CLEANUP table setting and row convergence behavior).
- testcontainer startup resilience: increase ClickHouse startup timeout in recipe waits to reduce transient readiness flakiness under matrix load.
- workflow stability guardrail: force serial package execution for the clickhouse provider package in generic CI test matrices to avoid container/reaper contention.
Validation executed locally:
- make build
- make test
- make test-cdc-full FORCE=1
- make test-cdc-optional FORCE=1
- go generate ./...
- golangci-lint (new-from-rev)
- govulncheck ./...
Introduce reusable CI and stream-specific callers for altinity (prod) and dev (integration), with optional e2e execution gated by CI_RUN_OPTIONAL repo variable and workflow_dispatch override. Add dedicated dev Docker publish workflow to GHCR and optional manual promotion workflow to retag vetted dev digests into DockerHub prod tags.
Use golang:1.24.13-alpine3.22 in Dockerfile so GHCR dev image builds succeed with go.mod go 1.24.13 requirement.
Use linux/amd64-only publishing for dev stream and add job timeout to avoid prolonged multi-arch hangs while keeping prod DockerHub workflow unchanged.
… references
- Remove YDB debezium emitter/receiver and tests
- Remove YTSaurus logging, KV wrapper, and recipe helpers
- Remove Greenplum and OpenSearch connection code
- Remove S3 example (s3sqs2ch) and docs references
- Remove Elasticsearch and Delta docs references
- Clean up error codes for removed providers
- Update .mapping.json to remove deleted file entries
- Add .claude/ and reports/ to .gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Enable linux/amd64 and linux/arm64 builds for dev image stream and disable provenance/sbom emission to avoid unknown/unknown attestation manifests in GHCR UI.
Cherry-picked from transferia/main: 72d663f Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 6aaf95e Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: c7f81d6 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: b67ceeb Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…s_default) Cherry-picked from transferia/main: a86ec0f Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 8cc6675 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: f5dc0fe Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…maMigrationDisabled) Cherry-picked from transferia/main: 3432ce4 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: fbd3058 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 427dbf6 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: dc6a632 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 2084631 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 51b23c5 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 7f53a0a Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: 6035694 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Cherry-picked from transferia/main: a5d7034 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds buffer pooling, consolidates batch serializers, fixes JSON escaping, fixes parquet file structure. Cherry-picked from transferia/main: 5c3f2ed Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Splits the monolithic SampleableStorage interface into focused interfaces:
- SizeableStorage for TableSizeInBytes
- Sampleable for LoadRandomSample
- AccessCheckable for TableAccessible
- ChecksumableStorage for full checksum methods
Cherry-picked from transferia/main: 211782b
Includes fixes for API compatibility.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit permissions blocks to CI workflows to limit GITHUB_TOKEN scope
- Add bounds checking for strconv.Atoi to int32/int8 conversions in pglogrepl
Fixes:
- Workflow permissions: ci-dev.yml, ci-prod.yml, reusable-ci.yml
- Integer conversion bounds: pglogrepl.go lines 449, 496, 504, 692
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
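The integer-conversion fix can be sketched as below. This is not the code in pglogrepl.go: the helper names `atoiInt32` and `atoiInt8` are hypothetical, and the sketch only shows the general pattern of checking the range of a `strconv.Atoi` result before narrowing it.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// atoiInt32 parses s and rejects values that do not fit in int32,
// instead of silently truncating on conversion.
func atoiInt32(s string) (int32, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n < math.MinInt32 || n > math.MaxInt32 {
		return 0, fmt.Errorf("value %d out of int32 range", n)
	}
	return int32(n), nil
}

// atoiInt8 does the same for int8.
func atoiInt8(s string) (int8, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n < math.MinInt8 || n > math.MaxInt8 {
		return 0, fmt.Errorf("value %d out of int8 range", n)
	}
	return int8(n), nil
}

func main() {
	_, err := atoiInt8("300")
	fmt.Println(err != nil)
}
```

An alternative with the same effect is `strconv.ParseInt(s, 10, 32)` (or bit size 8), which returns a range error itself when the value does not fit.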
- Add timeout-minutes: 30 to e2e-tests and generic-tests jobs
- Limit gotestsum --rerun-fails to 2 retries (was unlimited)
- Prevents infinite retry loops when testcontainers fail to start
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- ci-prod.yml and ci-dev.yml need contents:read to call reusable-ci.yml
- Add timeout-minutes: 30 to generic-tests, e2e-core, e2e-optional jobs
- Limit gotestsum retries to 2 in reusable-ci.yml
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Merge tests/e2e-core/ and tests/e2e-optional/ into single tests/e2e/
- Delete non-aligned tests (kafka2kafka, mysql2kafka, pg2pg)
- Remove duplicate kafka2ch from e2e-optional
- Update Makefile: LAYER=e2e-core -> LAYER=e2e
- Update CI workflows to use tests/e2e paths
- Update matrix YAML files (cdc_local_suite, cdc_optional_suite, core2ch, sources)
- Fix MV column alias mismatch in kafka2ch/replication_mv
- Switch from external Zookeeper to ClickHouse built-in Keeper
- Disable optional e2e tests by default in CI
- Update AGENTS.md and tests/README.md documentation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ChSinkMigrationOptions struct and MigrationOptions field to ChDestination to support automatic column addition during schema migration. This fixes the CI build failure in evolution tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Align storage test directory naming with canon tests (both now use "postgres" instead of "pg") so Makefile test-layer command works correctly for storage tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Cherry-picks 18 high-priority commits from transferia/transferia:main related to active providers (ClickHouse, PostgreSQL, MySQL, Kafka) plus infrastructure improvements.
Bug Fixes
Feature Improvements
PostgreSQL Improvements
Kafka Improvements
Infrastructure & Observability
Infrastructure Refactors
Test plan
go build ./...)
🤖 Generated with Claude Code