37 changes: 31 additions & 6 deletions conserver/links/jq_link/README.md
@@ -101,18 +101,43 @@ The link accepts two configuration options:
- `filter` (string): A jq expression that evaluates to a boolean
- `forward_matches` (boolean): If true, forwards vCons that match the filter. If false, forwards vCons that don't match.

## Mixed-Type `body` Arrays

Some legacy vCons contain attachment or analysis `body` arrays with mixed value
types, such as strings, numbers, and objects in the same array. jq string
functions like `startswith()` raise an error when they are evaluated against a
non-string item.

`jq_link` now retries once when jq raises a string-input type error. The retry
drops non-string items from every `body` array before re-running the filter.
This keeps common tag-scanning filters working without changing normal jq
behavior for valid inputs.
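
The sanitization step behind that retry can be sketched in plain Python. This is a minimal illustration modeled on the helper added in this change; the name `keep_string_bodies` and the sample data are hypothetical and exist only for this example.

```python
def keep_string_bodies(value):
    """Return a copy of value with non-string items dropped from every `body` list."""
    if isinstance(value, dict):
        return {
            key: [item for item in child if isinstance(item, str)]
            if key == "body" and isinstance(child, list)
            else keep_string_bodies(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [keep_string_bodies(item) for item in value]
    return value


# Hypothetical legacy vCon fragment with a mixed-type body array.
legacy = {"attachments": [{"type": "tags", "body": ["call_type:1", 2, {"k": "v"}]}]}
cleaned = keep_string_bodies(legacy)
print(cleaned["attachments"][0]["body"])  # ['call_type:1']
```

Only `body` values that are lists are filtered. A `body` that is a plain string or an object passes through unchanged, and the input dictionary itself is not mutated.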

For new filters, prefer defensive jq patterns that explicitly keep strings:

```jq
.attachments[0]
| select(.body[] | strings | startswith("call_type:") and . != "call_type:2")
```

or:

```jq
.attachments[0]
| select(any(.body[]; type == "string" and startswith("call_type:") and . != "call_type:2"))
```
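
For readers more familiar with Python than jq, the guard in the second pattern corresponds roughly to the check below. This is only an analogue for explanation; the function name and its parameters are hypothetical and not part of `jq_link`.

```python
def has_matching_call_type(body, prefix="call_type:", excluded="call_type:2"):
    """Python analogue of `any(.body[]; type == "string" and startswith(...) and . != ...)`."""
    return any(
        # The isinstance check plays the role of `type == "string"` in jq:
        # non-string items are skipped instead of raising.
        isinstance(item, str) and item.startswith(prefix) and item != excluded
        for item in body
    )


print(has_matching_call_type(["call_type:1", 2, "call_type:2", {"k": "v"}]))  # True
print(has_matching_call_type([2, "call_type:2"]))  # False
```

Because the type check comes first, `startswith` is never reached for a number or object, which is the same short-circuit the jq `and` relies on.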

### Example Chain Configuration

```yaml
links:
  filter_cats:
    module: "links.jq_filter"
    module: "links.jq_link"
    options:
      filter: '.attributes.arc_display_type == "Cat"'
      forward_matches: true

  filter_no_analysis:
    module: "links.jq_filter"
    module: "links.jq_link"
    options:
      filter: '.analysis | length == 0'
      forward_matches: false
@@ -220,7 +245,7 @@ To debug filter behavior:
```yaml
links:
  filter_unredacted:
    module: "links.jq_filter"
    module: "links.jq_link"
    options:
      filter: ".redacted == {}"
      forward_matches: true
@@ -231,7 +256,7 @@ links:
```yaml
links:
  filter_with_analysis:
    module: "links.jq_filter"
    module: "links.jq_link"
    options:
      filter: ".analysis | length == 0"
      forward_matches: false
@@ -242,7 +267,7 @@ links:
```yaml
links:
  filter_pattern:
    module: "links.jq_filter"
    module: "links.jq_link"
    options:
      filter: '.meta.serial_number | test("^ABC\\d{5}$")'
      forward_matches: true
@@ -253,7 +278,7 @@ links:
### Running Tests

```bash
python -m pytest tests/links/test_jq_filter.py -v
uv run --group conserver --group dev pytest conserver/links/jq_link/test_jq_link.py -v
```

### Contributing
41 changes: 40 additions & 1 deletion conserver/links/jq_link/__init__.py
@@ -13,6 +13,34 @@
"forward_matches": True,
}


def _filter_body_arrays_to_strings(value):
    """Drop non-string items from ``body`` arrays for string-based jq filters.

    Legacy vCons can still carry mixed-type attachment/analysis bodies. jq
    string functions like ``startswith()`` raise when they hit an int/dict in
    ``.body[]``. For those cases, retrying with string-only body arrays preserves
    the common "scan tags in body" use case without changing the first-pass
    semantics for valid filters.
    """
    if isinstance(value, dict):
        sanitized = {}
        for key, child in value.items():
            if key == "body" and isinstance(child, list):
                sanitized[key] = [item for item in child if isinstance(item, str)]
            else:
                sanitized[key] = _filter_body_arrays_to_strings(child)
        return sanitized

    if isinstance(value, list):
        return [_filter_body_arrays_to_strings(item) for item in value]

    return value


def _is_string_input_type_error(error):
    return "requires string inputs" in str(error)

def run(vcon_uuid, link_name, opts=default_options):
    """JQ Filter link that uses jq expressions to filter vCons.

@@ -49,7 +77,18 @@ def run(vcon_uuid, link_name, opts=default_options):
        # Compile and run the jq program
        logger.debug(f"Applying jq filter '{opts['filter']}' to vCon {vcon_uuid}")
        program = jq.compile(opts["filter"])
        results = list(program.input(vcon_dict))
        try:
            results = list(program.input(vcon_dict))
        except Exception as runtime_error:
            if not _is_string_input_type_error(runtime_error):
                raise

            increment_counter("conserver.link.jq.string_body_array_retries", attributes=attrs)
            logger.warning(
                f"Retrying jq filter '{opts['filter']}' for vCon {vcon_uuid} "
                f"with string-only body arrays after type error: {runtime_error}"
            )
            results = list(program.input(_filter_body_arrays_to_strings(vcon_dict)))

        # Handle empty results
        if not results:
12 changes: 12 additions & 0 deletions conserver/links/jq_link/test_jq_link.py
@@ -119,6 +119,18 @@ def test_filter_by_attachments(mock_redis_with_vcon, sample_vcon):
    result = run("test-uuid", "test-link", opts)
    assert result == "test-uuid"


def test_filter_by_attachments_with_mixed_body_types(mock_redis_with_vcon, sample_vcon):
    """String-based filters should tolerate mixed-type body arrays."""
    sample_vcon.vcon_dict["attachments"][0]["body"] = ["call_type:1", 2, "call_type:2", {"k": "v"}]

    opts = {
        "filter": '.attachments[0] | select(.body[] | startswith("call_type:") and . != "call_type:2")',
        "forward_matches": True,
    }
    result = run("test-uuid", "test-link", opts)
    assert result == "test-uuid"

    # Test specific attachment type
    opts = {
        "filter": '.attachments[] | select(.type == "report") | any',
69 changes: 69 additions & 0 deletions docs/reference/links/jq.md
@@ -0,0 +1,69 @@
# jq_link

Filters vCons with jq expressions.

## Configuration

```yaml
links:
  jq_filter:
    module: links.jq_link
    options:
      filter: '.attachments | length > 0'
      forward_matches: true
```

## Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `filter` | string | `"."` | jq expression evaluated against the vCon |
| `forward_matches` | boolean | `true` | Forward matches when `true`, or forward non-matches when `false` |

## Examples

```yaml
links:
  has_analysis:
    module: links.jq_link
    options:
      filter: '.analysis | length > 0'
      forward_matches: true

  skip_empty_analysis:
    module: links.jq_link
    options:
      filter: '.analysis | length == 0'
      forward_matches: false
```

## Mixed-Type `body` Arrays

Some legacy vCons contain mixed values inside attachment or analysis `body`
arrays. For example, a tags-like array may contain strings plus integers or
objects. Raw jq string functions such as `startswith()` fail on non-string
inputs.

`jq_link` now retries once when jq raises a string-input type error, dropping
non-string items from every `body` array before the second attempt. This keeps
common filters that scan `body[]` values for tag prefixes working.
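
The retry-once control flow can be sketched as follows. This is a simplified illustration, not the link's actual code: `evaluate_with_string_retry`, `fake_startswith_filter`, and `strings_only` are hypothetical stand-ins, and the fake evaluator only imitates the "requires string inputs" message that the link checks for.

```python
def evaluate_with_string_retry(evaluate, vcon, sanitize):
    """Run evaluate(vcon); on a string-input type error, retry once on sanitize(vcon)."""
    try:
        return evaluate(vcon)
    except Exception as error:
        if "requires string inputs" not in str(error):
            raise  # unrelated failures are not retried
        return evaluate(sanitize(vcon))


def fake_startswith_filter(vcon):
    """Stand-in for a jq program: fails on non-string body items, like startswith()."""
    for item in vcon["body"]:
        if not isinstance(item, str):
            raise ValueError("startswith() requires string inputs")
    return [any(item.startswith("call_type:") for item in vcon["body"])]


def strings_only(vcon):
    """Stand-in sanitizer: keep only the string items of a top-level body list."""
    return {**vcon, "body": [item for item in vcon["body"] if isinstance(item, str)]}


results = evaluate_with_string_retry(
    fake_startswith_filter, {"body": ["call_type:1", 2]}, strings_only
)
print(results)  # [True]
```

The first attempt always runs against the unmodified vCon, so filters that already handle mixed types never see sanitized data.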

For new filters, prefer explicit string guards:

```jq
.attachments[0]
| select(.body[] | strings | startswith("call_type:") and . != "call_type:2")
```

or:

```jq
.attachments[0]
| select(any(.body[]; type == "string" and startswith("call_type:") and . != "call_type:2"))
```

## Behavior

If the first result of the jq expression is truthy, the link treats the vCon as
a match. Invalid filters and runtime jq failures (including a retry that fails
again) are logged and the vCon is filtered out.
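
As a rough sketch of how the match result combines with `forward_matches`, consider the helper below. The name `should_forward` is hypothetical, and treating an empty result list as a non-match is an assumption made for this example rather than something confirmed here.

```python
def should_forward(results, forward_matches=True):
    """Decide whether a vCon is forwarded, given the list of jq results."""
    # Assumption: an empty result list counts as "no match".
    matches = bool(results) and bool(results[0])
    # Forward matches when forward_matches is true, non-matches when it is false.
    return matches == forward_matches


print(should_forward([True]))                          # True
print(should_forward([False]))                         # False
print(should_forward([False], forward_matches=False))  # True
```

Only the first result is inspected, mirroring the "truthy first result" rule above.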
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -78,6 +78,7 @@ nav:
- Deepgram: reference/links/deepgram.md
- Analyze: reference/links/analyze.md
- Tag: reference/links/tag.md
- JQ: reference/links/jq.md
- Webhook: reference/links/webhook.md
- Storage Adapters:
- PostgreSQL: reference/storage-adapters/postgres.md