feat(data_privacy)!: introduce data_privacy_core #427

Open
Vaiz wants to merge 1 commit into main from issue-371-split-data-privacy

Conversation

Contributor

@Vaiz Vaiz commented May 18, 2026

Breaking changes in the data_privacy crate can cascade into breaking changes across all consumer crates. This PR decouples the redaction engines from the basic data_privacy types, so that we have a more stable crate for classification and more freedom to make breaking changes in redaction itself.

github-actions Bot commented May 18, 2026

⚠️ Breaking Changes Detected

error: failed to retrieve local crate data from git revision

Caused by:
    0: failed to retrieve manifest file from git revision source
    1: possibly due to errors: [
         failed when reading /home/runner/work/oxidizer/oxidizer/target/semver-checks/git-origin_main/372167fc8c6a5f23d81f716e0394d6162653999d/scripts/crate-template/Cargo.toml: TOML parse error at line 9, column 26
         |
       9 | keywords = ["oxidizer", {{CRATE_KEYWORDS}}]
         |                          ^
       missing key for inline table element, expected key
       : TOML parse error at line 9, column 26
         |
       9 | keywords = ["oxidizer", {{CRATE_KEYWORDS}}]
         |                          ^
       missing key for inline table element, expected key
       ,
         failed to parse /home/runner/work/oxidizer/oxidizer/target/semver-checks/git-origin_main/372167fc8c6a5f23d81f716e0394d6162653999d/Cargo.toml: no `package` table,
       ]
    2: package `data_privacy_core` not found in /home/runner/work/oxidizer/oxidizer/target/semver-checks/git-origin_main/372167fc8c6a5f23d81f716e0394d6162653999d

Stack backtrace:
   0: anyhow::error::<impl anyhow::Error>::msg
   1: cargo_semver_checks::rustdoc_gen::RustdocFromProjectRoot::get_crate_source
   2: cargo_semver_checks::rustdoc_gen::StatefulRustdocGenerator<cargo_semver_checks::rustdoc_gen::CoupledState>::prepare_generator
   3: cargo_semver_checks::Check::check_release::{{closure}}
   4: cargo_semver_checks::Check::check_release
   5: cargo_semver_checks::exit_on_error
   6: cargo_semver_checks::main
   7: std::sys::backtrace::__rust_begin_short_backtrace
   8: main
   9: <unknown>
  10: __libc_start_main
  11: _start

If the breaking changes are intentional then everything is fine - this message is merely informative.

Remember to apply a version number bump with the correct severity when publishing a version with breaking changes (1.x.x -> 2.x.x or 0.1.x -> 0.2.x).

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 93.87755% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.9%. Comparing base (b5900ac) to head (cba397a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/data_privacy_macros/src/lib.rs | 71.8% | 9 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff            @@
##             main    #427     +/-   ##
========================================
- Coverage   100.0%   99.9%   -0.1%     
========================================
  Files         286     286             
  Lines       22978   22853    -125     
========================================
- Hits        22978   22841    -137     
- Misses          0      12     +12     

@Vaiz Vaiz force-pushed the issue-371-split-data-privacy branch 8 times, most recently from 52a562a to 1cf8061 Compare May 21, 2026 13:24
@Vaiz Vaiz marked this pull request as ready for review May 21, 2026 13:46
Copilot AI review requested due to automatic review settings May 21, 2026 13:46
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new data_privacy_core crate that owns the core classification/redaction traits and types, and refactors the existing data_privacy crate to re-export those core APIs while keeping RedactionEngine and built-in redaction strategies. The change also updates dependent crates/macros to use the new Redactor-based interfaces and adjusts CI/scripts to include the new crate.

Changes:

  • Add crates/data_privacy_core and move core types/traits (e.g., Sensitive, DataClass, Redactor, redacted formatting traits) into it.
  • Refactor data_privacy to re-export data_privacy_core, implement Redactor for RedactionEngine, and update redactor implementations/tests accordingly.
  • Update macros and templated_uri integration to use Redactor (&dyn Redactor) rather than RedactionEngine in redacted formatting, and update snapshots/metadata.
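
To make the interface change concrete, here is a minimal sketch of redacted formatting against `&dyn Redactor`. The `Redactor` method signatures follow the impl quoted later in this review and the `fmt` signature follows the macro diffs; everything else (`DataClass` as a two-field struct, `MarkerRedactor`, `TenantId`, and the `Redacted` adapter) is a hypothetical stand-in, not the crates' actual API.

```rust
use std::fmt::{self, Write};

/// Stand-in for the crate's `DataClass` (assumption: the real type differs).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DataClass {
    pub taxonomy: &'static str,
    pub name: &'static str,
}

/// Method signatures follow the `Redactor` impl quoted in this review.
pub trait Redactor {
    fn would_redact(&self, data_class: &DataClass) -> bool;
    fn redact(&self, data_class: &DataClass, value: &str, output: &mut dyn Write) -> fmt::Result;
}

/// Redacted formatting takes `&dyn Redactor` rather than a concrete engine.
pub trait RedactedDisplay {
    fn fmt(&self, redactor: &dyn Redactor, f: &mut fmt::Formatter<'_>) -> fmt::Result;
}

/// Hypothetical redactor that replaces every value with a fixed marker.
pub struct MarkerRedactor;

impl Redactor for MarkerRedactor {
    fn would_redact(&self, _data_class: &DataClass) -> bool {
        true
    }

    fn redact(&self, _data_class: &DataClass, _value: &str, output: &mut dyn Write) -> fmt::Result {
        output.write_str("[REDACTED]")
    }
}

/// Hypothetical library type that only needs the core traits.
pub struct TenantId(pub String);

impl RedactedDisplay for TenantId {
    fn fmt(&self, redactor: &dyn Redactor, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The class name is made up for this example.
        let class = DataClass { taxonomy: "example", name: "tenant_id" };
        // `Formatter` implements `fmt::Write`, so it coerces to `&mut dyn Write`.
        redactor.redact(&class, &self.0, f)
    }
}

/// Hypothetical adapter so a `RedactedDisplay` value works with `format!`.
pub struct Redacted<'a, T: RedactedDisplay>(pub &'a T, pub &'a dyn Redactor);

impl<T: RedactedDisplay> fmt::Display for Redacted<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        RedactedDisplay::fmt(self.0, self.1, f)
    }
}
```

With these stand-ins, `Redacted(&TenantId("contoso-42".into()), &MarkerRedactor).to_string()` goes through the trait object, so any type implementing `Redactor` (an engine or a single strategy) can be passed without the caller naming a concrete engine type.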

Reviewed changes

Copilot reviewed 63 out of 64 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
|---|---|
| scripts/run-examples.rs | Comment-only adjustment in excluded examples list. |
| scripts/publish-gh-release.rs | Comment-only adjustment in changelog parsing logic. |
| scripts/mutants.rs | Add data_privacy_core to mutation test grouping. |
| justfiles/basic.just | Comment-only adjustment in examples recipe docs. |
| crates/templated_uri/src/uri.rs | Switch redacted formatting to accept &dyn Redactor; adjust tests. |
| crates/templated_uri/src/path_and_query.rs | Switch redacted formatting to accept &dyn Redactor. |
| crates/templated_uri/Cargo.toml | Allow data_privacy_core::* types for external-types checking. |
| crates/templated_uri_macros_impl/src/struct_template.rs | Update generated RedactedDisplay signature to take &dyn Redactor. |
| crates/templated_uri_macros_impl/src/enum_template.rs | Update generated RedactedDisplay signature to take &dyn Redactor. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__templated_uri_impl.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__templated_unredacted_uri_impl.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__template_enum_impl.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__query_param_is_kv_expansion.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__optional_reference_field_codegen.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__optional_field_with_unredacted_codegen.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__optional_field_codegen.snap | Snapshot updates for new Redactor signature. |
| crates/templated_uri_macros_impl/src/snapshots/templated_uri_macros_impl__tests__field_level_unredacted.snap | Snapshot updates for new Redactor signature. |
| crates/data_privacy/tests/xxh3_redactor.rs | Add coverage for new would_redact behavior. |
| crates/data_privacy/tests/rapidhash_redactor.rs | Add coverage for new would_redact behavior. |
| crates/data_privacy/tests/redaction_engine.rs | Update to new Redactor-based redact call signature. |
| crates/data_privacy/tests/sensitive.rs | Add additional regression tests for redacted formatting and boundary behavior (with insta). |
| crates/data_privacy/src/sensitive.rs | Remove old Sensitive implementation (moved to core). |
| crates/data_privacy/src/redactors/xxh3_redactor.rs | Implement new would_redact method; adjust test coverage attrs. |
| crates/data_privacy/src/redactors/rapidhash_redactor.rs | Implement new would_redact method. |
| crates/data_privacy/src/redactors/simple_redactor.rs | Implement new would_redact method; import cleanup. |
| crates/data_privacy/src/redactors/redactor.rs | Remove old Redactor trait (moved to core). |
| crates/data_privacy/src/redactors/mod.rs | Remove local Redactor re-export; adjust tests for new trait contract. |
| crates/data_privacy/src/redaction_engine.rs | Make RedactionEngine implement Redactor; remove inherent redact/would_redact. |
| crates/data_privacy/src/redaction_engine_inner.rs | Update would_redact to delegate to the selected redactor/fallback; add tests. |
| crates/data_privacy/src/macros.rs | Remove local macro re-export module (moved to core re-exports). |
| crates/data_privacy/src/lib.rs | Re-export data_privacy_core APIs; update crate docs and module exports. |
| crates/data_privacy/README.md | Regenerated README reflecting the new data_privacy_core split and links. |
| crates/data_privacy/Cargo.toml | Bump version, add data_privacy_core, wire serde feature through, add insta dev-dep. |
| crates/data_privacy_macros/Cargo.toml | Version bump aligned with macro changes. |
| crates/data_privacy_macros_impl/tests/snapshots/taxonomy__success.snap | Snapshot updates to reference data_privacy_core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_display_unit.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_display_single.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_display_multiple.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_display_multiple_named.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_display_enum.snap | Snapshot metadata update. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_debug_unit.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_debug_single.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_debug_multiple.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_debug_multiple_named.snap | Snapshot updates for &dyn Redactor and core paths. |
| crates/data_privacy_macros_impl/tests/snapshots/derive__redacted_debug_enum.snap | Snapshot metadata update. |
| crates/data_privacy_macros_impl/tests/snapshots/classified__success.snap | Snapshot updates for core paths and &dyn Redactor usage. |
| crates/data_privacy_macros_impl/tests/snapshots/classified__success_named_field.snap | Snapshot updates for core paths and &dyn Redactor usage. |
| crates/data_privacy_macros_impl/src/taxonomy.rs | Use crate-path resolver so generated code targets data_privacy or data_privacy_core. |
| crates/data_privacy_macros_impl/src/lib.rs | Export new crate_path helper module. |
| crates/data_privacy_macros_impl/src/derive.rs | Update derive output to use resolved crate path and &dyn Redactor. |
| crates/data_privacy_macros_impl/src/crate_path.rs | New helper to resolve whether to generate ::data_privacy or ::data_privacy_core paths. |
| crates/data_privacy_macros_impl/src/classified.rs | Update classified macro output to use resolved crate path and &dyn Redactor. |
| crates/data_privacy_macros_impl/Cargo.toml | Add proc-macro-crate dependency; version bump. |
| crates/data_privacy_core/src/lib.rs | New core crate root exposing core traits/types and macro re-exports. |
| crates/data_privacy_core/src/macros.rs | New macro re-export module for core crate. |
| crates/data_privacy_core/src/redactor.rs | New Redactor trait definition (now in core). |
| crates/data_privacy_core/src/redacted.rs | Update redacted formatting traits to accept &dyn Redactor; add tests. |
| crates/data_privacy_core/src/sensitive.rs | New Sensitive implementation moved into core; add tests. |
| crates/data_privacy_core/src/data_class.rs | Restructure tests; separate serde-gated tests into a dedicated module. |
| crates/data_privacy_core/src/classified.rs | Update docs and minor trait doc tweaks for core crate. |
| crates/data_privacy_core/README.md | New crate README describing core responsibilities. |
| crates/data_privacy_core/Cargo.toml | New crate manifest with serde feature and dev-deps. |
| Cargo.toml | Add data_privacy_core and proc-macro-crate to workspace dependencies; bump versions. |
| Cargo.lock | Lockfile updates for new crate and new dependency versions. |
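
The summary says `RedactionEngine` now implements `Redactor` and that `would_redact` delegates to the selected redactor or the fallback. A rough sketch of that delegation could look like the following. `Engine`, `Mask`, `Passthrough`, the `HashMap` lookup, and the one-field `DataClass` are assumptions made for illustration; the real engine's storage and builder API are not shown in this PR page.

```rust
use std::collections::HashMap;
use std::fmt::{self, Write};

/// Stand-in for the crate's `DataClass` (assumption: the real type differs).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DataClass(pub &'static str);

/// Method signatures follow the `Redactor` impl quoted in this review.
pub trait Redactor {
    fn would_redact(&self, data_class: &DataClass) -> bool;
    fn redact(&self, data_class: &DataClass, value: &str, output: &mut dyn Write) -> fmt::Result;
}

/// Hypothetical strategy that hides the value completely.
pub struct Mask;

impl Redactor for Mask {
    fn would_redact(&self, _data_class: &DataClass) -> bool {
        true
    }

    fn redact(&self, _data_class: &DataClass, _value: &str, output: &mut dyn Write) -> fmt::Result {
        output.write_str("***")
    }
}

/// Hypothetical strategy that leaves the value untouched.
pub struct Passthrough;

impl Redactor for Passthrough {
    fn would_redact(&self, _data_class: &DataClass) -> bool {
        false
    }

    fn redact(&self, _data_class: &DataClass, value: &str, output: &mut dyn Write) -> fmt::Result {
        output.write_str(value)
    }
}

/// Hypothetical engine: one redactor per data class plus a fallback.
pub struct Engine {
    redactors: HashMap<DataClass, Box<dyn Redactor>>,
    fallback: Box<dyn Redactor>,
}

impl Engine {
    pub fn new(fallback: Box<dyn Redactor>) -> Self {
        Self { redactors: HashMap::new(), fallback }
    }

    pub fn with(mut self, data_class: DataClass, redactor: Box<dyn Redactor>) -> Self {
        self.redactors.insert(data_class, redactor);
        self
    }

    /// Picks the redactor registered for the class, or the fallback.
    fn select(&self, data_class: &DataClass) -> &dyn Redactor {
        match self.redactors.get(data_class) {
            Some(redactor) => &**redactor,
            None => &*self.fallback,
        }
    }
}

/// The engine is itself a `Redactor`; both methods delegate to the selection.
impl Redactor for Engine {
    fn would_redact(&self, data_class: &DataClass) -> bool {
        self.select(data_class).would_redact(data_class)
    }

    fn redact(&self, data_class: &DataClass, value: &str, output: &mut dyn Write) -> fmt::Result {
        self.select(data_class).redact(data_class, value, output)
    }
}
```

Because both methods go through the same `select` step in this sketch, `would_redact` cannot disagree with what `redact` does for a given class, which appears to be the point of the delegation change.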
Comments suppressed due to low confidence (1)

crates/data_privacy_core/src/redacted.rs:106

  • These tests use inline insta snapshots (e.g. assert_snapshot!(result, @"...")). To align with the repository’s usual insta usage (file-backed .snap artifacts), consider converting these to external snapshots or simple assert_eq! (the expected output is short).

Comment thread crates/data_privacy_core/src/redactor.rs Outdated
Comment thread crates/data_privacy/src/lib.rs Outdated
Comment thread crates/data_privacy/src/redaction_engine.rs
Comment thread crates/data_privacy/tests/sensitive.rs
Comment thread crates/data_privacy_core/src/sensitive.rs
Copilot AI review requested due to automatic review settings May 21, 2026 14:28
@Vaiz Vaiz force-pushed the issue-371-split-data-privacy branch from 0b9f929 to 4582bc7 Compare May 21, 2026 14:28
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 63 out of 64 changed files in this pull request and generated 4 comments.

Comment thread crates/data_privacy/src/redactors/mod.rs
Comment thread crates/data_privacy/tests/redaction_engine.rs
Comment thread crates/templated_uri_macros_impl/src/struct_template.rs
Comment thread crates/templated_uri_macros_impl/src/enum_template.rs
Comment thread crates/data_privacy/src/lib.rs Outdated
    mod redacted;
    // Re-export everything from data_privacy_core.
    pub use data_privacy_core::{
        Classified, DataClass, IntoDataClass, RedactedDebug, RedactedDisplay, RedactedToString, Redactor, Sensitive, classified, taxonomy,
Collaborator

This didn't preserve the macro documentation.

Image

    #[must_use]
    #[cfg_attr(coverage_nightly, coverage(off))]
    pub fn resolve() -> TokenStream {
        use proc_macro_crate::{FoundCrate, crate_name};
Collaborator

Not sure how I feel about this over just saying data_privacy is the way to go. It's a large crate, but it doesn't help lowering our dependency count, adding 8 more or so to our list (7 of them uniquely for this) for a use case we could say we just don't support.

Contributor Author

I actually expect people to use the data_privacy_core crate in libraries that define reusable types. For example, a library that defines TenantId doesn't need a dependency on the full data_privacy crate.

Collaborator

Like in the oxidizer case this would be an antipattern, at least when it comes to trying to make proc macros work. Random example, serde:

> In crates that derive an implementation of Serialize or Deserialize, you must depend on the serde crate, not serde_core.

Contributor Author

Maybe the crate names are not the best. What if we rename data_privacy_core -> data_privacy and data_privacy -> data_privacy_redactors?

In that case, we can avoid re-exporting macros altogether.

Collaborator

That's a possible conversation, but that seems to conflict with the premise of this PR. I'm all for creating a _core crate of stable items to avoid breakage. In this example though you seem to imply that everything in data_privacy then is stable except redactors?

Now that I think about it, I think what would improve this PR is actually moving out the macros of _core. These are mainly crate-specific, and we can happily rename macro attributes or change behavior over crate versions, without breaking overall ecosystem composability.
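
For readers following this thread, the decision the `resolve` helper has to make can be modeled without any proc-macro machinery. The sketch below is a std-only approximation: it takes the consumer's direct dependency names as a slice and picks which crate path generated code should use. The precedence (prefer `data_privacy` when both are present, and default to it when neither is found) is an assumption, and the PR's helper uses `proc_macro_crate::crate_name` and returns a `TokenStream`, which this model does not reproduce. Renamed dependencies are also ignored here.

```rust
/// Which crate the generated code should reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    DataPrivacy,
    DataPrivacyCore,
}

impl Target {
    /// Absolute path prefix the macro output would use.
    pub fn path(self) -> &'static str {
        match self {
            Target::DataPrivacy => "::data_privacy",
            Target::DataPrivacyCore => "::data_privacy_core",
        }
    }
}

/// Hypothetical model of the resolution step.
///
/// `direct_deps` stands in for the dependency names found in the consumer's
/// manifest. Precedence and the default are assumptions of this sketch.
pub fn resolve(direct_deps: &[&str]) -> Target {
    if direct_deps.contains(&"data_privacy") {
        Target::DataPrivacy
    } else if direct_deps.contains(&"data_privacy_core") {
        Target::DataPrivacyCore
    } else {
        // Neither crate is a direct dependency; fall back to the full crate.
        Target::DataPrivacy
    }
}
```

Under these assumptions, a library that depends only on `data_privacy_core` gets `::data_privacy_core` paths in its generated impls, which is the use case being debated above. Dropping the helper, as suggested, would amount to always returning `Target::DataPrivacy`.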

@Vaiz Vaiz force-pushed the issue-371-split-data-privacy branch from 4582bc7 to 0525c8f Compare May 22, 2026 12:54
Copilot AI review requested due to automatic review settings May 22, 2026 13:54
@Vaiz Vaiz force-pushed the issue-371-split-data-privacy branch from 0525c8f to cba397a Compare May 22, 2026 13:54
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 70 out of 71 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

crates/data_privacy_core/src/redacted.rs:106

  • These unit tests use inline insta snapshots (e.g., @"..."). In this repo, snapshot tests are commonly file-backed via .snap files (for example crates/templated_uri_macros_impl/src/lib.rs:199-208). Consider converting to external snapshots to reduce noise in code diffs.

Comment on lines 173 to 176
      impl ::data_privacy::RedactedDisplay for #ident {
    -     fn fmt(&self, engine: &::data_privacy::RedactionEngine, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
    +     fn fmt(&self, engine: &dyn ::data_privacy::Redactor, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
              #redacted_display
          }
Comment on lines 82 to 86
      impl ::data_privacy::RedactedDisplay for #ident {
    -     fn fmt(&self, engine: &::data_privacy::RedactionEngine, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
    +     fn fmt(&self, engine: &dyn ::data_privacy::Redactor, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
              match self {
                  #(#variant_matches => ::data_privacy::RedactedDisplay::fmt(template_variant, engine, f)?),*
              }
Comment on lines +10 to +19
    struct PassthroughRedactor;

    impl Redactor for PassthroughRedactor {
        fn would_redact(&self, _data_class: &DataClass) -> bool {
            true
        }

        fn redact(&self, _data_class: &DataClass, _value: &str, output: &mut dyn Write) -> core::fmt::Result {
            write!(output, "[REDACTED]")
        }
Comment thread crates/data_privacy/tests/sensitive.rs
Comment thread crates/data_privacy_core/src/sensitive.rs

3 participants