188 changes: 162 additions & 26 deletions README.md
Expand Up @@ -3,19 +3,29 @@
[![Node CI](https://github.com/ioncache/data-sanitization/actions/workflows/ci.yml/badge.svg)](https://github.com/ioncache/data-sanitization/actions/workflows/ci.yml)
[![Coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/ioncache/e2afdd1c4942b8c99362ceb3853a331e/raw/coverage.json)](https://gist.github.com/ioncache/e2afdd1c4942b8c99362ceb3853a331e)

Pattern-based sanitization for sensitive data in objects and strings. Masks or removes fields matching configurable patterns, making data safe for logging or external exposure.
Pattern-based sanitization for sensitive data in objects and strings. Use it to
mask or remove fields before logging, debugging, or sending data to systems that
should not receive sensitive values such as secrets, PII, PHI, credentials, or
other private data.

Works with both JavaScript and TypeScript — ships with compiled JS, TypeScript declarations, and source maps.
Works with JavaScript and TypeScript. The package ships compiled JavaScript,
TypeScript declarations, and source maps.

## Table of Contents

- [data-sanitization](#data-sanitization)
- [Table of Contents](#table-of-contents)
- [Installation](#installation)
- [npm](#npm)
- [Yarn](#yarn)
- [pnpm](#pnpm)
- [Bun](#bun)
- [Importing](#importing)
- [Usage](#usage)
- [Sanitize an object](#sanitize-an-object)
- [Quick start](#quick-start)
- [Sanitize a string](#sanitize-a-string)
- [Remove fields instead of masking](#remove-fields-instead-of-masking)
- [Sanitize PII and PHI with custom patterns](#sanitize-pii-and-phi-with-custom-patterns)
- [Options](#options)
- [Default patterns](#default-patterns)
- [Default matchers](#default-matchers)
Expand All @@ -27,17 +37,55 @@ Works with both JavaScript and TypeScript — ships with compiled JS, TypeScript

## Installation

Install with the package manager used by your project.

### npm

```bash
npm install data-sanitization
```

### Yarn

```bash
yarn add data-sanitization
```

### pnpm

```bash
pnpm add data-sanitization
```

### Bun

```bash
bun add data-sanitization
```

## Importing

The named export is recommended:

```typescript
import { sanitizeData, DataSanitizationError } from 'data-sanitization';
```

The sanitizer is also available as the default export:

```typescript
import sanitizeData from 'data-sanitization';
```

CommonJS consumers can require the compiled package:

```javascript
const { sanitizeData } = require('data-sanitization');
```

## Usage

### Sanitize an object
### Quick start

```typescript
import { sanitizeData } from 'data-sanitization';
Expand All @@ -54,7 +102,8 @@ const result = sanitizeData(input);

### Sanitize a string

Works with JSON strings and form-encoded strings:
String sanitization works with JSON-like strings, escaped JSON-like strings, and
form-encoded strings:

```typescript
sanitizeData('{"password":"secret","username":"mark"}');
Expand All @@ -74,20 +123,80 @@ sanitizeData(
// => { username: 'mark' }
```

### Sanitize PII and PHI with custom patterns

Use `customPatterns` to mask fields that are sensitive for your domain, such as
PII or PHI fields.

```typescript
import { sanitizeData } from 'data-sanitization';

const sensitivePatterns = [
  'address',
  'date_of_birth',
  'email',
  'emergency_contact',
  'full_name',
  'health_card',
  'ip_address',
  'medications',
  'phone',
  'postal_code',
  'ssn',
];

const patient = {
  accountId: 'acct_123',
  full_name: 'Avery Example',
  email: 'avery@example.com',
  phone: '+1-555-0100',
  date_of_birth: '1989-04-12',
  health_card: 'HC-1234-5678',
  medications: ['example-medication'],
};

sanitizeData(patient, {
  customPatterns: sensitivePatterns,
  useDefaultPatterns: false,
});
// => {
//   accountId: 'acct_123',
//   full_name: '**********',
//   email: '**********',
//   phone: '**********',
//   date_of_birth: '**********',
//   health_card: '**********',
//   medications: '**********',
// }
```

Use `removeMatches` with the same patterns to remove those fields instead of
masking them.

```typescript
sanitizeData(patient, {
  customPatterns: sensitivePatterns,
  useDefaultPatterns: false,
  removeMatches: true,
});
// => { accountId: 'acct_123' }
```

## Options

| Option | Type | Default | Description |
| -------------------- | --------------------------- | ------------ | ------------------------------------------------- |
| `patternMask` | `string` | `**********` | String used to replace matched field values |
| `removeMatches` | `boolean` | `false` | Remove matched fields entirely instead of masking |
| `customPatterns` | `string[]` | | Additional field name patterns to match |
| `customMatchers` | `DataSanitizationMatcher[]` | | Additional regex matchers for custom data formats |
| `useDefaultPatterns` | `boolean` | `true` | Whether to include the built-in default patterns |
| `useDefaultMatchers` | `boolean` | `true` | Whether to include the built-in default matchers |
| Option | Type | Default | Description |
| -------------------- | --------------------------- | ------------ | --------------------------------------------------- |
| `patternMask` | `string` | `**********` | String used to replace matched field values |
| `removeMatches` | `boolean` | `false` | Remove matched fields entirely instead of masking |
| `customPatterns` | `string[]` | `[]` | Additional field name patterns to match |
| `customMatchers` | `DataSanitizationMatcher[]` | `[]` | Additional regex matchers for custom string formats |
| `useDefaultPatterns` | `boolean` | `true` | Whether to include the built-in default patterns |
| `useDefaultMatchers` | `boolean` | `true` | Whether to include the built-in default matchers |

## Default patterns

The following field name patterns are matched by default (case-insensitive, substring match):
The following field name patterns are matched by default using a
case-insensitive substring match:

- `apikey`
- `api_key`
Expand All @@ -102,27 +211,33 @@ these patterns match as substrings.

Three matchers are included by default:

- **JSON matcher** — matches `"fieldName":"value"` patterns in JSON and JSON-like strings
- **Escaped JSON matcher** — matches `\"fieldName\":\"value\"` patterns in JSON embedded inside JSON string values
- **Form-encoded matcher** — matches `fieldName=value` and `fieldName:value` patterns in URL-encoded and similarly delimited strings
- **JSON matcher** — matches `"fieldName":"value"` patterns in JSON and
JSON-like strings
- **Escaped JSON matcher** — matches `\"fieldName\":\"value\"` patterns in
JSON embedded inside JSON string values
- **Form-encoded matcher** — matches `fieldName=value` and `fieldName:value`
patterns in URL-encoded and similarly delimited strings
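
To make the matcher descriptions above more concrete, here is a rough sketch of what a JSON-like matcher and a form-encoded matcher could look like. The names `jsonLikeMatcher`, `formLikeMatcher`, and `maskWith` are hypothetical and the regular expressions are illustrative assumptions; the package's built-in matchers may be written differently. Patterns are assumed to be plain words with no regex metacharacters.

```typescript
// Hypothetical matcher shapes for illustration; not the package's real regexes.
type Matcher = (pattern: string) => RegExp;

// "fieldName":"value": group 1 keeps the key and opening quote,
// group 2 keeps the closing quote.
const jsonLikeMatcher: Matcher = (pattern) =>
  new RegExp(`("[^"]*${pattern}[^"]*"\\s*:\\s*")[^"]*(")`, 'gi');

// fieldName=value or fieldName:value: group 1 keeps the key and separator,
// group 2 keeps the trailing delimiter (or the end of the input).
const formLikeMatcher: Matcher = (pattern) =>
  new RegExp(`(\\b\\w*${pattern}\\w*[=:])[^&\\s]*(&|\\s|$)`, 'gi');

const mask = '**********';

// Replace only the value, keeping both captured groups in place.
const maskWith = (input: string, regex: RegExp): string =>
  input.replace(
    regex,
    (_match, prefix: string, suffix: string) => prefix + mask + suffix,
  );

const jsonResult = maskWith(
  '{"password":"secret","username":"mark"}',
  jsonLikeMatcher('password'),
);
// jsonResult === '{"password":"**********","username":"mark"}'

const formResult = maskWith(
  'password=secret&username=mark',
  formLikeMatcher('password'),
);
// formResult === 'password=**********&username=mark'
```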

## Custom patterns and matchers

```typescript
import { sanitizeData } from 'data-sanitization';

// Add a custom pattern alongside defaults
const data = {
  username: 'mark',
  ssn: '123-45-6789',
  credit_card: '4111111111111111',
};

sanitizeData(data, {
  customPatterns: ['ssn', 'credit_card'],
});

// Use only custom patterns, no defaults
sanitizeData(data, {
  customPatterns: ['ssn'],
  useDefaultPatterns: false,
});

// Use a custom mask
sanitizeData(data, {
  patternMask: '[REDACTED]',
});
Expand All @@ -133,12 +248,24 @@ takes a pattern string and returns a global, case-insensitive `RegExp`. The
regex must use capture groups `$1` and `$2` to preserve the field name and
trailing delimiter while replacing the value.

```typescript
const headerMatcher = (pattern: string) =>
  new RegExp(`(${pattern}:\\s*).+?(\\n|$)`, 'gi');

sanitizeData('authorization: Bearer abc123\nuser: mark', {
  customPatterns: ['authorization'],
  customMatchers: [headerMatcher],
  useDefaultMatchers: false,
});
// => 'authorization: **********\nuser: mark'
```
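
The two-capture-group contract can be illustrated without the package. The sketch below assumes the sanitizer builds one regex per pattern and matcher pair and substitutes the mask between the first and second groups; `applyMatchers` is a hypothetical helper written for this example, not part of the public API.

```typescript
type Matcher = (pattern: string) => RegExp;

// Hypothetical helper: runs every pattern through every matcher and masks
// whatever sits between capture group 1 and capture group 2.
function applyMatchers(
  input: string,
  patterns: string[],
  matchers: Matcher[],
  mask = '**********',
): string {
  let output = input;
  for (const pattern of patterns) {
    for (const matcher of matchers) {
      // A function replacer keeps "$" characters in the mask literal.
      output = output.replace(
        matcher(pattern),
        (_match, prefix: string, suffix: string) => prefix + mask + suffix,
      );
    }
  }
  return output;
}

// Same matcher as in the README example above.
const headerMatcher: Matcher = (pattern) =>
  new RegExp(`(${pattern}:\\s*).+?(\\n|$)`, 'gi');

const masked = applyMatchers(
  'authorization: Bearer abc123\nuser: mark',
  ['authorization'],
  [headerMatcher],
);
// masked === 'authorization: **********\nuser: mark'
```

A matcher that omits either group would lose the field name or the delimiter under this scheme, which is why both groups are required.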

## Error handling

`sanitizeData` throws a `DataSanitizationError` when:

- The input is not a `string` or `object` (e.g., `number`, `boolean`, `undefined`)
- An unexpected error occurs during sanitization (e.g., malformed JSON that cannot be re-parsed)
- The input is not a `string`, `object`, or `null`.
- An unexpected error occurs during sanitization.

```typescript
import { sanitizeData, DataSanitizationError } from 'data-sanitization';
Expand All @@ -159,13 +286,22 @@ original input payload.
## How it works

1. **String input** is sanitized directly via regex replacement with the configured matchers.
2. **Object input** is sanitized recursively by key name without JSON serialization. Sensitive keys are masked or removed regardless of whether their values are strings, numbers, arrays, objects, or other primitives.
3. **Plain nested objects and arrays** are cloned as they are sanitized. Non-plain object instances are preserved without modification to avoid corrupting their prototypes.
4. Each configured pattern is matched case-insensitively against object keys. For string input, each configured pattern is tested against each matcher to produce regex instances that find and replace sensitive field values.
2. **Object input** is sanitized recursively by key name without JSON
serialization. Sensitive keys are masked or removed regardless of whether
their values are strings, numbers, arrays, objects, or other primitives.
3. **Plain nested objects and arrays** are cloned as they are sanitized.
Non-plain object instances are preserved without modification to avoid
corrupting their prototypes.
4. **Null input** is accepted and returns `null`.
5. Each configured pattern is matched case-insensitively against object keys.
For string input, each configured pattern is tested against each matcher to
produce regex instances that find and replace sensitive field values.
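
The object-handling steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's source: `isPlainObject` and `sketchSanitize` are hypothetical names, and string input, custom matchers, and error handling are left out.

```typescript
// Hypothetical sketch of key-based recursive sanitization.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (value === null || typeof value !== 'object') return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

function sketchSanitize(
  value: unknown,
  patterns: string[],
  mask = '**********',
  remove = false,
): unknown {
  if (Array.isArray(value)) {
    // Arrays are cloned element by element.
    return value.map((item) => sketchSanitize(item, patterns, mask, remove));
  }
  if (!isPlainObject(value)) {
    // Primitives, null, and non-plain instances (Date, Map, ...) pass through.
    return value;
  }
  const result: Record<string, unknown> = {};
  for (const [key, child] of Object.entries(value)) {
    const lowered = key.toLowerCase();
    // Case-insensitive substring match against the key name.
    const sensitive = patterns.some((p) => lowered.includes(p.toLowerCase()));
    if (sensitive) {
      // The whole value is masked or dropped, whatever its type.
      if (!remove) result[key] = mask;
      continue;
    }
    result[key] = sketchSanitize(child, patterns, mask, remove);
  }
  return result;
}

const when = new Date(0);
const sample = {
  user: { Password: 'secret', name: 'mark' },
  tokens: [{ apiKey: 'abc' }],
  when,
};

const maskedSample = sketchSanitize(sample, ['password', 'apikey']);
const removedSample = sketchSanitize(
  sample,
  ['password', 'apikey'],
  '**********',
  true,
);
```

In this sketch `maskedSample` is a new object tree in which `user.Password` and `tokens[0].apiKey` hold the mask, while `when` is the same `Date` instance as in the input. `removedSample` omits the matched keys entirely.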

## Contributing

For development setup, testing, and release process, see [docs/development.md](docs/development.md).
For development setup, testing, and release process, see
[docs/development.md](docs/development.md). For future direction, see
[docs/ROADMAP.md](docs/ROADMAP.md).

## License

4 changes: 2 additions & 2 deletions docs/ROADMAP.md
Expand Up @@ -13,8 +13,8 @@ on configurable patterns; ships TypeScript declarations; and avoids exposing
input payloads in sanitizer error details.

The project should continue to prioritize a small public API, predictable
behavior, safe logging use cases, and low-friction adoption in JavaScript and
TypeScript projects.
behavior, sensitive-data sanitization for logging and debugging workflows, and
low-friction adoption in JavaScript and TypeScript projects.

## Near-Term v1.x Work

20 changes: 17 additions & 3 deletions docs/development.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,18 @@

## Setup

This repository uses Yarn and Husky hooks.
This repository uses Yarn, Husky hooks, and Volta-pinned tool versions. Install
Volta or use compatible local versions of Node and Yarn before installing
dependencies.

```bash
yarn install
```

Common commands:
Common package scripts:

```bash
yarn build
yarn format
yarn format:check
yarn lint
Expand All @@ -26,7 +29,9 @@ Build artifacts are emitted to `dist/`:
yarn build
```

`prepack` runs the build automatically to ensure published packages use compiled output.
The build emits compiled JavaScript, TypeScript declarations, and source maps.
`prepack` runs the build automatically to ensure published packages use compiled
output.

## Testing

Expand Down Expand Up @@ -93,6 +98,15 @@ yarn release --bump patch

Supported bump values: `major`, `minor`, `patch`.

Before publishing or cutting a release, run the local validation scripts:

```bash
yarn format:check
yarn lint
yarn build
yarn test:coverage
```

Live release behavior:

1. Generates release notes from conventional commits.
3 changes: 2 additions & 1 deletion docs/plans/001-coverage-tracking.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,8 @@ accounts are required beyond GitHub.
## Pre-implementation

- Create a public GitHub Gist with a file named `coverage.json`
- Create a classic PAT with `gist` scope at https://github.com/settings/tokens
- Create a classic PAT with `gist` scope at
[github.com/settings/tokens](https://github.com/settings/tokens)
- Add the PAT as repository secret `GIST_SECRET`
- Add the Gist ID as repository variable `COVERAGE_GIST_ID`
- Create GitHub issue #274