Skip to content

crypto: migrate reqwest from native-tls to rustls across workspace#35869

Draft
jasonhernandez wants to merge 23 commits intojason/sec-218-combined-tls-rustlsfrom
jason/sec-239-reqwest-rustls
Draft

crypto: migrate reqwest from native-tls to rustls across workspace#35869
jasonhernandez wants to merge 23 commits intojason/sec-218-combined-tls-rustlsfrom
jason/sec-239-reqwest-rustls

Conversation

@jasonhernandez
Copy link
Copy Markdown
Contributor

@jasonhernandez jasonhernandez commented Apr 4, 2026

Summary

  • Migrate ~20 reqwest deps to default-features = false + rustls-tls-webpki-roots-no-provider
  • Remove native-tls-vendored from ccsr, switch Identity to from_pem
  • Switch mysql_async from native-tls-tls to rustls-tls, update ClientIdentity API
  • Switch segment from native-tls-vendored to rustls-tls
  • Switch sentry from transport (native-tls) to reqwest (TLS via feature unification)
  • Remove unused postgres-openssl patch
  • Fix deny.toml ring ban wrappers (pre-existing issue)
  • Install aws-lc-rs CryptoProvider in build scripts (mz-npm, fivetran-destination)
  • Revert rdkafka from ssl-awslc back to ssl-vendored to fix duplicate SSL symbol linker errors (rdkafka's bundled BoringSSL conflicts with openssl-sys still in tree via tiberius/hyper-tls)

Remaining native-tls paths (require separate follow-ups)

  • tiberius fork — uses native-tls; upstream rustls 0.21 uses ring; needs fork update to rustls 0.23 + aws_lc_rs → SEC-248
  • hyper-tls via LaunchDarkly SDK — uses hyper 0.14; needs SDK upgrade to hyper 1.x + hyper-rustls → SEC-249
  • iceberg-rust fork — hardcodes reqwest/native-tls; needs fork update
  • azure_coreenable_reqwest_rustls activates reqwest/rustls-tls which bundles ring; needs upstream enable_reqwest_rustls_no_provider feature
  • duckdblibduckdb-sys build script needs TLS; kept native-tls (build-dep only)

Why rdkafka stays on ssl-vendored for now

rdkafka's ssl-awslc feature bundles BoringSSL (AWS-LC) statically, exporting SSL_new, SSL_CTX_ctrl, etc. These conflict with the same symbols from openssl-sys (brought in by tiberius and hyper-tls via native-tls). The rdkafka AWS-LC migration can only happen after SEC-248 and SEC-249 remove openssl-sys from the dependency tree entirely.

Chains off #35867. Part of SEC-239.

Test plan

  • cargo check --all-targets passes locally
  • cargo deny check bans passes
  • cargo deny check licenses passes
  • Full CI

🤖 Generated with Claude Code

jasonhernandez and others added 4 commits April 3, 2026 22:55
Migrate all reqwest dependencies across the workspace from the default
native-tls backend to rustls-tls, eliminating the native-tls/OpenSSL
dependency chain that caused duplicate SSL symbol errors with aws-lc-rs
when building with --all-features.

Changes per dependency:
- reqwest: Add default-features = false + rustls-tls-webpki-roots-no-provider
  to ~20 crates (build-deps use rustls-tls-webpki-roots for self-contained
  CryptoProvider)
- ccsr: Remove native-tls-vendored, switch Identity to use reqwest::Identity::from_pem
- hyper-tls → hyper-rustls 0.24: Migrate LaunchDarkly SDK connectors in
  adapter and dyncfg-launchdarkly to use HttpsConnectorBuilder
- tiberius: Switch native-tls → rustls feature in sql-server-util, storage,
  storage-types
- mysql_async: Switch native-tls-tls → rustls-tls in storage-types,
  update ClientIdentity API to provide cert_chain + priv_key separately
- segment: Switch native-tls-vendored → rustls-tls
- sentry: Replace transport (native-tls) with reqwest + rustls features
- azure_*: Set default-features = false, enable enable_reqwest_rustls
- duckdb: Remove native-tls feature
- Remove unused postgres-openssl patch

Remaining: iceberg-rust fork hardcodes reqwest/native-tls in its workspace
Cargo.toml — requires a separate fork update.

Part of SEC-239.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hyper-rustls 0.24 depends on rustls 0.21 + ring, which violates the
workspace ring ban and creates duplicate crate versions. Keep hyper-tls
for the LaunchDarkly SDK connectors (behind telemetry feature flag)
until the LD SDK is upgraded to hyper 1.x.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert tiberius native-tls → rustls: the MaterializeInc tiberius fork
  uses rustls 0.21 + ring, violating the workspace ring ban
- Fix mz-npm and fivetran-destination build deps: use
  rustls-tls-webpki-roots-no-provider and explicitly install aws-lc-rs
  CryptoProvider in build scripts
- Sentry: use just "reqwest" feature (TLS comes from workspace feature
  unification), not "rustls" which activates ring through reqwest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ring ban's wrappers list was missing the intermediate crates
(rustls, rustls-webpki) between ring and the already-allowed aws-config.
This was a pre-existing issue on the parent branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

jasonhernandez and others added 10 commits April 4, 2026 11:59
- Revert azure_* crates to defaults: azure_core's enable_reqwest_rustls
  activates reqwest/rustls-tls which bundles ring, causing duplicate
  symbol errors with aws-lc-rs on Linux. azure_core needs an upstream
  enable_reqwest_rustls_no_provider feature.
- Restore duckdb native-tls feature: libduckdb-sys build script uses
  reqwest and needs a TLS backend; with resolver "2" build-dep features
  are separate from normal deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rdkafka's ssl-awslc feature bundles BoringSSL (AWS-LC), which exports
SSL_new, SSL_CTX_ctrl, etc. These conflict with openssl-sys symbols
still in the tree via tiberius (native-tls) and hyper-tls (LaunchDarkly
SDK), causing "duplicate symbol" linker errors in CI.

Revert rdkafka to ssl-vendored (OpenSSL) so both rdkafka and the
remaining native-tls paths use the same SSL library. The rdkafka AWS-LC
migration must wait until tiberius and the LD SDK are migrated off
native-tls, fully removing openssl-sys from the dependency tree.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch tiberius from the native-tls feature (openssl) to the rustls
feature using rustls 0.23 + aws-lc-rs. This removes tiberius as an
openssl-sys dependency path.

Uses personal fork (jasonhernandez/tiberius, branch
rustls-0.23-on-mz-changes) which cherry-picks the rustls 0.23 upgrade
onto the existing mz_changes commits (Row::build, money types,
trust_cert_ca_pem).

Remaining native-tls paths: hyper-tls (LaunchDarkly SDK), reqwest
dev/build-deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When both `aws-lc-rs` and `ring` features are active (as in CI's
--all-features cargo-test-2 job), rustls cannot auto-detect which
CryptoProvider to use. Install aws-lc-rs as the default provider
early in binary entrypoints (environmentd, clusterd) and in tests
that use tokio-postgres-rustls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ness

Extend the CryptoProvider installation to testdrive, sqllogictest,
orchestratord, and the environmentd test harness. These all use TLS
(via reqwest, tokio-postgres-rustls, or server-core) and need the
provider installed before any TLS operation when both aws-lc-rs and
ring features are active.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add fips_crypto_provider() to persistcli main (fixes short-zippy).
Move CryptoProvider install in test harness from try_start to
try_start_with_trigger so all test paths are covered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tokio-postgres-rustls v0.13 hardcodes a dependency on ring. Replace it
with mz-tls-util::MakeRustlsConnect which uses the process-wide
CryptoProvider (aws-lc-rs). This removes one direct ring dependency
path.

Ring still enters via kube's hyper-rustls (feature unification), so
the CryptoProvider install_default() calls remain necessary for
--all-features builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add kube/aws-lc-rs feature to avoid ring via hyper-rustls
- Remove segment/rustls-tls feature (was enabling reqwest/rustls-tls
  which bundles ring; TLS comes from workspace feature unification)

With ring fully removed, rustls auto-detects aws-lc-rs as the sole
CryptoProvider, eliminating the need for explicit install_default()
calls. All CryptoProvider panics from --all-features builds are
resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Include Cargo.lock changes from ring elimination and remove ring
wrappers from deny.toml since ring is no longer in the dependency tree.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/sec-239-reqwest-rustls branch from 0dae9fd to 3998765 Compare April 6, 2026 06:01
jasonhernandez and others added 9 commits April 5, 2026 23:34
When tokio-postgres connects via Unix socket, it still calls
make_tls_connect with the socket path as the domain (e.g.
"/var/run/postgresql"). Since socket paths are not valid DNS names,
ServerName::try_from fails, preventing all Unix socket connections.

Use "localhost" as a placeholder ServerName for Unix socket paths,
since TLS is never actually negotiated over Unix sockets.

Fixes CI failures where environmentd could not connect to the internal
CockroachDB via Unix socket for consensus and timestamp oracle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
tokio-postgres passes an empty string as the domain to
make_tls_connect for Unix socket connections (Host::Unix maps to
hostname=None, which unwrap_or("") produces ""). The previous fix
only handled "/" prefixed paths but missed the empty string case,
so ServerName::try_from("") still failed with InvalidDnsNameError.

Handle both empty strings and "/" prefixed paths by using "localhost"
as a placeholder ServerName, since TLS is never negotiated over Unix
sockets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Update kafka-auth test expected error: with rustls, invalid PEM data
   produces "failed to parse private key PEM" instead of the old OpenSSL
   error "No supported data to decode".

2. Add [patch.crates-io] for mysql_async pointing to MaterializeInc fork
   with two rustls backend fixes:
   - ClientIdentity::load() uses private_key() for all key formats
     (PKCS#1, PKCS#8, SEC1), not just rsa_private_keys() (PKCS#1 only)
   - DangerousVerifier matches "not valid for name" (rustls 0.23 Display
     format) in addition to "NotValidForName" for skip_domain_validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With rustls, certificate validation failures produce "invalid peer
certificate: UnknownIssuer" instead of the native-tls "TLS error".
Update both mysql-cdc and mysql-cdc-old-syntax test variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Identity::from_pem validated PEM data with rustls-pki-types but the
From<Identity> for reqwest::Identity conversion used expect() with
reqwest::Identity::from_pem, which may parse differently. If reqwest
couldn't parse the PEM, this would panic and crash the session.

Add reqwest validation at construction time so the expect() is safe.
This may fix the session crash seen in kafka-auth mTLS tests (SEC-258).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Now that tiberius, reqwest, and all other native-tls consumers have been
migrated to rustls, openssl-sys is no longer in the dependency tree.
This eliminates the duplicate symbol conflict that previously blocked
the ssl-awslc migration.

Switch rdkafka from ssl-vendored (vendored OpenSSL) to ssl-awslc
(pre-built AWS-LC at /opt/aws-lc in the CI builder image). This:
- Removes openssl and openssl-sys from the dependency tree entirely
- Eliminates dual crypto library cleanup conflicts (SIGABRT on exit)
- Aligns Kafka TLS with the rest of the stack on AWS-LC for FIPS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-LC"

native-tls is still in the dependency tree via iceberg-rust (reqwest
native-tls feature), LaunchDarkly SDK (hyper-tls), and duckdb. Without
ssl-vendored, there is no OpenSSL for native-tls to link against.

Revert to ssl-vendored until native-tls is fully eliminated from the
dependency tree. The SIGABRT from dual crypto libraries is intermittent
and less blocking than linker failures.

This reverts commit fdbd336.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant